From: Discriminating nucleosomes containing histone H2A.Z or H2A based on genetic and epigenetic information

Genetic information can also be used to predict whether a nucleosome is likely to contain H2A.Z. A. Top panel: histogram of the log-odds of all eight-base-pair words in the H2A.Z dataset compared with the H2A dataset. Bottom panel: the same analysis carried out on the randomized dataset. B. Log-odds of the most enriched and most depleted words in the H2A.Z versus the H2A dataset. C. Flexibility profile of H2A.Z- and H2A-containing nucleosome sequences. These curves represent the positional flexible-dinucleotide log-odds of the flexibility models described, trained on all H2A.Z and H2A sequences, including their reverse complements, using a background probability calculated on the input sequences without regard to position. H2A.Z-associated sequences are slightly more rigid than their H2A counterparts. D. Classification results for some of the sequence-based classifiers investigated. True positives are shown in light green, true negatives in dark green, false positives in light red and false negatives in dark red. GC%: a model based solely on the GC content of the sequences. MarkovX: a model based on a positional Markov model of order X (see text). NPMarkovX: a model based on a non-positional Markov model of order X. Flexibility: a model based on dinucleotide flexibility (see text). SVM: a model based on a support vector machine.
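
The word log-odds in panels A and B can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `kmer_counts` and `kmer_log_odds`, the base-2 logarithm, and the pseudocount added to every possible word are all assumptions made for illustration, since the caption does not state how unseen words or the logarithm base were handled.

```python
from collections import Counter
from math import log2


def kmer_counts(seqs, k=8):
    """Count every overlapping word of length k, skipping words with non-ACGT symbols."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            word = seq[i:i + k]
            if set(word) <= set("ACGT"):
                counts[word] += 1
    return counts


def kmer_log_odds(pos_seqs, neg_seqs, k=8, pseudocount=1.0):
    """Log2 ratio of word frequencies in pos_seqs versus neg_seqs.

    Assumption: a pseudocount is added to each of the 4**k possible words
    so that words absent from one dataset still get a finite score.
    Only words observed in at least one dataset are returned.
    """
    pos = kmer_counts(pos_seqs, k)
    neg = kmer_counts(neg_seqs, k)
    n_words = 4 ** k
    pos_total = sum(pos.values()) + pseudocount * n_words
    neg_total = sum(neg.values()) + pseudocount * n_words
    return {
        word: log2((pos[word] + pseudocount) / pos_total)
        - log2((neg[word] + pseudocount) / neg_total)
        for word in set(pos) | set(neg)
    }
```

Under these assumptions, calling `kmer_log_odds(h2az_seqs, h2a_seqs)` on two lists of sequence strings (hypothetical variable names) gives one score per observed word. A histogram of the values corresponds to the kind of plot in panel A, and sorting them gives the most enriched and most depleted words as in panel B.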

