Long-read sequencing technology indicates genome-wide effects of non-B DNA on polymerization speed and error rate
Language: English. Country: United States. Medium: print-electronic
Document type: journal article, grant-supported work
PubMed: 30401733
PubMed Central: PMC6280752
DOI: 10.1101/gr.241257.118
PII: gr.241257.118
- MeSH
- DNA chemistry MeSH
- G-quadruplexes MeSH
- genomics * methods, standards MeSH
- kinetics MeSH
- nucleic acid conformation * MeSH
- humans MeSH
- mutation MeSH
- nucleotide motifs MeSH
- DNA replication MeSH
- reproducibility of results MeSH
- sequence analysis, DNA * methods MeSH
- high-throughput nucleotide sequencing * methods, standards MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- grant-supported work MeSH
- Substance names
- DNA MeSH
DNA conformation may deviate from the classical B-form in ∼13% of the human genome. Non-B DNA regulates many cellular processes; however, its effects on DNA polymerization speed and accuracy have not been investigated genome-wide. Such an inquiry is critical for understanding neurological diseases and cancer genome instability. Here, we present the first simultaneous examination of DNA polymerization kinetics and errors in the human genome sequenced with Single-Molecule Real-Time (SMRT) technology. We show that polymerization speed differs between non-B and B-DNA: It decelerates at G-quadruplexes and fluctuates periodically at disease-causing tandem repeats. Analyzing polymerization kinetics profiles, we predict and validate experimentally non-B DNA formation for a novel motif. We demonstrate that several non-B motifs affect sequencing errors (e.g., G-quadruplexes increase error rates), and that sequencing errors are positively associated with polymerase slowdown. Finally, we show that highly divergent G4 motifs have pronounced polymerization slowdown and high sequencing error rates, suggesting similar mechanisms for sequencing errors and germline mutations.
Department of Biology, Penn State University, University Park, Pennsylvania 16802, USA
Department of Pathology, Penn State University, College of Medicine, Hershey, Pennsylvania 17033, USA
Department of Statistics, Penn State University, University Park, Pennsylvania 16802, USA