histoneHMM: Differential analysis of histone modifications with broad genomic footprints

2015 Feb 22; 16: 60. [Epub 2015 Feb 22]

Language: English. Country: England, United Kingdom. Medium: electronic

Document type: journal article, grant-supported research

Persistent link: https://www.medvik.cz/link/pmid25884684
Links

PubMed 25884684
PubMed Central PMC4347972
DOI 10.1186/s12859-015-0491-6
PII: 10.1186/s12859-015-0491-6

BACKGROUND: ChIP-seq has become a routine method for interrogating the genome-wide distribution of various histone modifications. An important experimental goal is to compare the ChIP-seq profiles of an experimental sample and a reference sample, and to identify regions that show differential enrichment. However, comparative analysis of samples remains challenging for histone modifications with broad domains, such as heterochromatin-associated H3K27me3, because most ChIP-seq algorithms are designed to detect well-defined peak-like features.

RESULTS: To address this limitation we introduce histoneHMM, a powerful bivariate Hidden Markov Model for the differential analysis of histone modifications with broad genomic footprints. histoneHMM aggregates short reads over larger regions and takes the resulting bivariate read counts as inputs for an unsupervised classification procedure, requiring no further tuning parameters. histoneHMM outputs probabilistic classifications of genomic regions as being either modified in both samples, unmodified in both samples or differentially modified between samples. We extensively tested histoneHMM in the context of two broad repressive marks, H3K27me3 and H3K9me3, and evaluated region calls with follow-up qPCR as well as RNA-seq data. Our results show that histoneHMM outperforms competing methods in detecting functionally relevant differentially modified regions.

CONCLUSION: histoneHMM is a fast algorithm written in C++ and compiled as an R package. It runs in the popular R computing environment and thus seamlessly integrates with the extensive bioinformatic tool sets available through Bioconductor. This makes histoneHMM an attractive choice for the differential analysis of ChIP-seq data. Software is available from http://histonehmm.molgen.mpg.de .
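The aggregation step described in the RESULTS paragraph can be illustrated with a minimal Python sketch. It is not the histoneHMM implementation (which is written in C++ and distributed as an R package); the function names, the 1000 bp bin width and the toy read positions are all assumptions chosen for illustration. The sketch only shows how read start positions from two samples could be collapsed into per-bin count pairs, the kind of bivariate input a two-sample HMM would then classify.

```python
from collections import Counter


def bin_counts(read_starts, bin_size):
    """Count reads per fixed-width bin; returns {bin_index: count}."""
    return Counter(pos // bin_size for pos in read_starts)


def bivariate_counts(reads_a, reads_b, bin_size=1000):
    """Pair the per-bin counts of two samples.

    bin_size=1000 is an illustrative assumption, not a value taken
    from the abstract. Bins run from 0 up to the last bin that
    contains a read in either sample.
    """
    a = bin_counts(reads_a, bin_size)
    b = bin_counts(reads_b, bin_size)
    n_bins = max(list(a) + list(b), default=-1) + 1
    return [(a.get(i, 0), b.get(i, 0)) for i in range(n_bins)]


# Hypothetical read start coordinates on one chromosome.
sample = [10, 250, 999, 1500, 2100, 2200]
reference = [5, 1200, 1300, 1999]

pairs = bivariate_counts(sample, reference)
print(pairs)  # [(3, 1), (1, 3), (2, 0)]
```

Each tuple holds the counts for one bin in the experimental and the reference sample. A classifier of the kind the abstract describes would assign every such pair a probability of being modified in both samples, unmodified in both, or differentially modified.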

