histoneHMM: Differential analysis of histone modifications with broad genomic footprints
Language: English. Country: England, United Kingdom. Medium: electronic
Document type: journal article, grant-supported research
PubMed: 25884684
PubMed Central: PMC4347972
DOI: 10.1186/s12859-015-0491-6
PII: 10.1186/s12859-015-0491-6
- MeSH
- algorithms * MeSH
- chromatin immunoprecipitation MeSH
- genomics, methods MeSH
- histones, chemistry, genetics, metabolism MeSH
- rats MeSH
- quantitative polymerase chain reaction MeSH
- humans MeSH
- Markov chains MeSH
- mice MeSH
- post-translational protein modifications * MeSH
- software * MeSH
- computational biology, methods MeSH
- high-throughput nucleotide sequencing, methods MeSH
- animals MeSH
- Check Tag
- rats MeSH
- humans MeSH
- male MeSH
- mice MeSH
- female MeSH
- animals MeSH
- Publication type
- journal articles MeSH
- grant-supported research MeSH
- Substance names
- histones MeSH
BACKGROUND: ChIP-seq has become a routine method for interrogating the genome-wide distribution of various histone modifications. An important experimental goal is to compare the ChIP-seq profiles of an experimental sample and a reference sample, and to identify regions that show differential enrichment. However, comparative analysis of samples remains challenging for histone modifications with broad domains, such as heterochromatin-associated H3K27me3, because most ChIP-seq algorithms are designed to detect well-defined peak-like features.

RESULTS: To address this limitation we introduce histoneHMM, a powerful bivariate Hidden Markov Model for the differential analysis of histone modifications with broad genomic footprints. histoneHMM aggregates short reads over larger regions and takes the resulting bivariate read counts as inputs for an unsupervised classification procedure that requires no further tuning parameters. histoneHMM outputs probabilistic classifications of genomic regions as being either modified in both samples, unmodified in both samples, or differentially modified between samples. We extensively tested histoneHMM in the context of two broad repressive marks, H3K27me3 and H3K9me3, and evaluated region calls with follow-up qPCR as well as RNA-seq data. Our results show that histoneHMM outperforms competing methods in detecting functionally relevant differentially modified regions.

CONCLUSION: histoneHMM is a fast algorithm written in C++ and compiled as an R package. It runs in the popular R computing environment and thus integrates seamlessly with the extensive bioinformatic tool sets available through Bioconductor. This makes histoneHMM an attractive choice for the differential analysis of ChIP-seq data. Software is available from http://histonehmm.molgen.mpg.de.
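The read-aggregation step described in the abstract can be illustrated with a minimal Python sketch. It is not the histoneHMM implementation (which is written in C++ and distributed as an R package); the function names `bin_counts` and `bivariate_counts`, the 1000 bp bin size, and the toy read positions are all assumptions made for illustration. The sketch only shows how short reads from two samples might be summarized as one pair of counts per genomic bin, the kind of bivariate input a two-sample HMM would then classify.

```python
def bin_counts(read_starts, bin_size, n_bins):
    """Count reads per fixed-width bin; read_starts are 0-based positions."""
    counts = [0] * n_bins
    for pos in read_starts:
        idx = pos // bin_size
        # Ignore reads that fall outside the binned region.
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


def bivariate_counts(reads_a, reads_b, chrom_length, bin_size=1000):
    """Return one (count_a, count_b) pair per bin along a chromosome.

    bin_size=1000 is an illustrative default, not a value taken from the record.
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    a = bin_counts(reads_a, bin_size, n_bins)
    b = bin_counts(reads_b, bin_size, n_bins)
    return list(zip(a, b))


if __name__ == "__main__":
    # Hypothetical read start positions for a reference and an experimental sample.
    reference = [10, 950, 1200, 2500, 2600]
    experimental = [20, 2100, 2200, 2999, 3400]
    print(bivariate_counts(reference, experimental, chrom_length=3500))
    # prints [(2, 1), (1, 0), (2, 3), (0, 1)]
```

Each pair corresponds to one bin; a downstream model would assign every bin a probability of being modified in both samples, unmodified in both, or differentially modified.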