Most cited article - PubMed ID 17274688
The pandemic caused by the spread of SARS-CoV-2 has led to considerable interest in its evolutionary origin and genome structure. Here, we analyzed mutation patterns in 34 human SARS-CoV-2 isolates and the closely related RaTG13 isolated from Rhinolophus affinis (a horseshoe bat). We also evaluated the CpG dinucleotide contents in SARS-CoV-2 and other human and animal coronavirus genomes. Out of 1136 single nucleotide variations (~4% divergence) between human SARS-CoV-2 and bat RaTG13, 682 (60%) can be attributed to C>U and U>C substitutions, far exceeding other types of substitutions. An accumulation of C>U mutations was also observed in SARS-CoV-2 variants that arose within the human population. Globally, the C>U substitutions increased the frequency of codons for hydrophobic amino acids in SARS-CoV-2 peptides, while U>C substitutions decreased it. In contrast to most other coronaviruses, both SARS-CoV-2 and RaTG13 exhibited CpG depletion in their genomes. The data suggest that C-to-U conversion mediated by C deamination played a significant role in the evolution of the SARS-CoV-2 coronavirus. We hypothesize that the high-frequency C>U transitions reflect virus adaptation processes in their hosts, and that SARS-CoV-2 could have been evolving for a relatively long period in humans following the transfer from animals before spreading worldwide.
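The two quantities this abstract relies on, the substitution spectrum between two aligned genomes and the degree of CpG depletion, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the sequences are assumed to be already aligned and of equal length, and CpG depletion is measured with the commonly used observed/expected ratio (values well below 1 indicate depletion), which the abstract does not explicitly name.

```python
from collections import Counter


def substitution_spectrum(ref: str, alt: str) -> Counter:
    """Count substitution types (e.g. 'C>U') between two aligned sequences.

    Assumption: `ref` and `alt` are pre-aligned RNA sequences of equal
    length; positions with gaps or ambiguous bases are skipped. The
    direction of each substitution is simply ref -> alt as supplied.
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for r, a in zip(ref.upper(), alt.upper()):
        if r != a and r in "ACGU" and a in "ACGU":
            counts[f"{r}>{a}"] += 1
    return counts


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G)."""
    s = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    c, g = s.count("C"), s.count("G")
    if c == 0 or g == 0:
        return 0.0
    return s.count("CG") * len(s) / (c * g)


if __name__ == "__main__":
    # Toy sequences only; real input would be whole-genome alignments.
    print(substitution_spectrum("ACGU", "AUGC"))
    print(cpg_obs_exp("ACGCGT"))
    # Share of C>U plus U>C among all variations reported in the abstract.
    print(round(682 / 1136 * 100))
```

With real data, the spectrum would be computed on a genome-wide alignment of SARS-CoV-2 and RaTG13, and the ratio on each coronavirus genome separately.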
- Keywords
- CpG depletion, SARS-CoV-2, coronavirus, cytosine deamination, evolution, mutation bias
- MeSH
- Betacoronavirus classification genetics isolation & purification MeSH
- Chiroptera virology MeSH
- CpG Islands MeSH
- Cytosine metabolism MeSH
- Phylogeny MeSH
- Spike Glycoprotein, Coronavirus genetics MeSH
- Polymorphism, Single Nucleotide MeSH
- Humans MeSH
- Evolution, Molecular * MeSH
- SARS-CoV-2 MeSH
- Base Sequence MeSH
- Uracil metabolism MeSH
- SARS Virus classification genetics isolation & purification MeSH
- Animals MeSH
- Check Tag
- Humans MeSH
- Animals MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance Names
- Cytosine MeSH
- Spike Glycoprotein, Coronavirus MeSH
- Uracil MeSH
Mutations can be induced by environmental factors but also arise spontaneously during DNA replication or through deamination of methylated cytosines at CpG dinucleotides. Sites where mutations occur with higher frequency than would be expected by chance are termed hotspots, while sites that rarely contain mutations are termed coldspots. Mutations are continuously detected and repaired by repair systems. Among them, mismatch repair targets base pair mismatches, which are discriminated from canonical base pairs by probing the altered elasticity of DNA. Using biased molecular dynamics simulations, we investigated the elasticity of coldspot and hotspot motifs detected in human genes associated with inherited disorders, and also of motifs with Czech population hotspots and de novo mutations. We focused mainly on mutations leading to G/T and A+/C pairs. We observed that hotspots without CpG/CpHpG sequences are less flexible than coldspots, which indicates that flexible sequences are repaired more effectively. In contrast, hotspots with CpG/CpHpG sequences exhibited increased flexibility, as did coldspots. Their mutability is more likely related to spontaneous deamination of methylated cytosines leading to C > T mutations, which are primarily targeted by base excision repair. We corroborated the conclusions based on computer simulations by measuring melting curves of hotspots and coldspots containing a G/T mismatch.
- Keywords
- DNA bending, MutS protein, free energy calculations, hotspots–coldspots, mutations
- MeSH
- CpG Islands MeSH
- DNA chemistry genetics MeSH
- Humans MeSH
- Mutation * MeSH
- Nucleotide Motifs * MeSH
- Molecular Dynamics Simulation * MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Substance Names
- DNA MeSH
The accuracy with which a DNA polymerase can replicate a template DNA sequence is an extremely important property that can vary by an order of magnitude from one enzyme to another. The rate of nucleotide misincorporation is shaped by multiple factors, including PCR conditions and proofreading capabilities, and proper assessment of polymerase error rate is essential for a wide range of sensitive PCR-based assays. In this paper, we describe a method for studying polymerase errors with exceptional resolution, which combines unique molecular identifier tagging and high-throughput sequencing. Our protocol is less laborious than commonly used methods, and is also scalable, robust and accurate. In a series of nine PCR assays, we have measured a range of polymerase accuracies that is in line with previous observations. However, we were also able to comprehensively describe individual errors introduced by each polymerase after either 20 PCR cycles or a linear amplification, revealing specific substitution preferences and the diversity of PCR error frequency profiles. We also demonstrate that the detected high-frequency PCR errors are highly recurrent and that the position in the template sequence and polymerase-specific substitution preferences are among the major factors influencing the observed PCR error rate.
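The general idea behind combining unique molecular identifier (UMI) tagging with sequencing can be sketched as follows. This is a simplified illustration under stated assumptions, not the protocol described in the paper: reads are assumed to arrive as hypothetical `(umi, sequence)` pairs of equal length, a per-family majority vote is used to suppress sequencing errors, and the `min_family_size` threshold and both function names are invented for this example. The resulting value is a raw error fraction per consensus base, with no normalization per PCR cycle.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=3):
    """Build one consensus sequence per UMI family by majority vote.

    `reads` is an iterable of (umi, sequence) pairs (hypothetical input
    format). Families smaller than `min_family_size`, or with reads of
    unequal length, are discarded.
    """
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    consensus = {}
    for umi, seqs in families.items():
        if len(seqs) < min_family_size:
            continue
        if any(len(s) != len(seqs[0]) for s in seqs):
            continue
        # Most frequent base in each column of the read stack.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


def error_fraction(consensus, template):
    """Fraction of consensus bases that differ from the known template."""
    errors = bases = 0
    for seq in consensus.values():
        if len(seq) != len(template):
            continue
        bases += len(seq)
        errors += sum(a != b for a, b in zip(seq, template))
    return errors / bases if bases else 0.0


if __name__ == "__main__":
    toy_reads = [
        ("AAA", "ACGT"), ("AAA", "ACGT"), ("AAA", "ACTT"),  # one sequencing error
        ("CCC", "ACGA"), ("CCC", "ACGA"), ("CCC", "ACGA"),  # error shared by the family
        ("GGG", "ACGT"),                                    # family too small
    ]
    cons = umi_consensus(toy_reads)
    print(cons)
    print(error_fraction(cons, "ACGT"))
```

An error seen in a single read of a family is outvoted, whereas a mismatch shared by the whole family survives into the consensus and is counted as a candidate polymerase error.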
In all eukaryotes, the highly repeated 35S ribosomal DNA (rDNA) sequences encoding 18S-5.8S-26S ribosomal RNA (rRNA) typically show high levels of intragenomic uniformity due to homogenisation processes, leading to concerted evolution of 35S rDNA repeats. Here, we compared 35S rDNA divergence in several seed plants using next generation sequencing and a range of molecular and cytogenetic approaches. Most species showed similar 35S rDNA homogeneity, indicating concerted evolution. However, Cycas revoluta exhibits an extraordinary diversity of rDNA repeats (nucleotide sequence divergence of different copies averaging 12%), influencing both the coding and non-coding rDNA regions nearly equally. In contrast, its rRNA transcriptome was highly homogeneous, suggesting that only a minority of genes (<20%) encode functional rRNA. The most common SNPs were C > T substitutions located in symmetrical CG and CHG contexts, which were also highly methylated. Both functional genes and pseudogenes appear to cluster on chromosomes. The extraordinarily high levels of 35S rDNA diversity in C. revoluta, and probably other species of cycads, indicate that the frequency of repeat homogenisation has been much lower in this lineage, compared with all other land plant lineages studied. This has led to the accumulation of methylation-driven mutations and pseudogenisation. Potentially, the reduced homology between paralogs prevented their elimination by homologous recombination, resulting in long-term retention of rDNA pseudogenes in the genome.
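The CG and CHG contexts mentioned in this abstract follow the usual plant methylation convention, where H stands for A, C or T. A minimal sketch of classifying a cytosine by its context is shown below; the function name is hypothetical, only the forward strand is inspected, and a G on the forward strand would have to be classified on the reverse complement to cover cytosines on the opposite strand.

```python
def cytosine_context(seq: str, i: int) -> str:
    """Classify the cytosine at index `i` as 'CG', 'CHG', 'CHH' or 'unknown'.

    H means any base except G (A, C or T). Only the forward strand is
    considered; positions too close to the 3' end, or followed by
    ambiguous bases, are reported as 'unknown'.
    """
    s = seq.upper()
    if s[i] != "C":
        raise ValueError("position does not hold a cytosine")
    following = s[i + 1:i + 3]  # up to two downstream bases
    if following[:1] == "G":
        return "CG"
    if len(following) == 2 and following[0] in "ACT":
        if following[1] == "G":
            return "CHG"
        if following[1] in "ACT":
            return "CHH"
    return "unknown"


if __name__ == "__main__":
    for seq in ("ACGT", "ACAGT", "ACATT"):
        print(seq, cytosine_context(seq, 1))
```

Tallying C > T substitutions by the context of the reference cytosine gives the kind of breakdown the abstract reports, in which CG and CHG sites dominate.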
- Keywords
- Concerted evolution, Cycadales, Cytosine methylation, Living fossil, rDNA
- MeSH
- Cycas genetics MeSH
- DNA, Plant genetics MeSH
- Transcription, Genetic genetics MeSH
- In Situ Hybridization, Fluorescence MeSH
- Polymorphism, Single Nucleotide genetics MeSH
- DNA, Ribosomal Spacer genetics MeSH
- DNA, Ribosomal genetics MeSH
- RNA, Ribosomal, 18S genetics MeSH
- RNA, Ribosomal, 5.8S genetics MeSH
- RNA, Ribosomal genetics MeSH
- Base Sequence MeSH
- Sequence Analysis, DNA MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Substance Names
- DNA, Plant MeSH
- DNA, Ribosomal Spacer MeSH
- DNA, Ribosomal MeSH
- RNA, Ribosomal, 18S MeSH
- RNA, Ribosomal, 5.8S MeSH
- RNA, Ribosomal MeSH
- RNA, ribosomal, 26S MeSH Browser