Long-read sequencing technology indicates genome-wide effects of non-B DNA on polymerization speed and error rate

2018 Dec; 28(12): 1767-1778. [epub] 20181106

Language: English. Country: United States. Medium: print-electronic

Document type: journal article, grant-supported research

Persistent link: https://www.medvik.cz/link/pmid30401733

DNA conformation may deviate from the classical B-form in ∼13% of the human genome. Non-B DNA regulates many cellular processes; however, its effects on DNA polymerization speed and accuracy have not been investigated genome-wide. Such an inquiry is critical for understanding neurological diseases and cancer genome instability. Here, we present the first simultaneous examination of DNA polymerization kinetics and errors in the human genome sequenced with Single-Molecule Real-Time (SMRT) technology. We show that polymerization speed differs between non-B and B-DNA: It decelerates at G-quadruplexes and fluctuates periodically at disease-causing tandem repeats. Analyzing polymerization kinetics profiles, we predict and validate experimentally non-B DNA formation for a novel motif. We demonstrate that several non-B motifs affect sequencing errors (e.g., G-quadruplexes increase error rates), and that sequencing errors are positively associated with polymerase slowdown. Finally, we show that highly divergent G4 motifs have pronounced polymerization slowdown and high sequencing error rates, suggesting similar mechanisms for sequencing errors and germline mutations.


