MAGERI: Computational pipeline for molecular-barcoded targeted resequencing
Language: English. Country: United States. Medium: electronic-ecollection
Document type: journal article, grant-supported research
PubMed: 28475621
PubMed Central: PMC5419444
DOI: 10.1371/journal.pcbi.1005480
PII: PCOMPBIOL-D-16-01290
- MeSH
- Databases, Genetic MeSH
- Humans MeSH
- Biomarkers, Tumor / blood / genetics MeSH
- Neoplasms / genetics MeSH
- RNA, Viral / genetics MeSH
- Sequence Analysis, DNA / methods MeSH
- Sequence Analysis, RNA / methods MeSH
- Software * MeSH
- Computational Biology / methods MeSH
- High-Throughput Nucleotide Sequencing / methods MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Grant-Supported Research MeSH
- Substance Names
- Biomarkers, Tumor MeSH
- RNA, Viral MeSH
Unique molecular identifiers (UMIs) show outstanding performance in targeted high-throughput resequencing, being the most promising approach for the accurate identification of rare variants in complex DNA samples. This approach has applications in multiple areas, including cancer diagnostics, and thus demands dedicated software and algorithms. Here we introduce MAGERI, a computational pipeline that efficiently handles all caveats of UMI-based analysis to obtain high-fidelity mutation profiles and call ultra-rare variants. Using an extensive set of benchmark datasets, including gold-standard biological samples with known variant frequencies, cell-free DNA from tumor patient blood samples, and publicly available UMI-encoded datasets, we demonstrate that our method is both robust and efficient in calling rare variants. The versatility of our software is supported by accurate results obtained for both tumor DNA and viral RNA samples in datasets prepared using three different UMI-based protocols.
Central European Institute of Technology, Masaryk University, Brno, Czech Republic
Evrogen JSC, Miklukho-Maklaya 16/10, Moscow, Russia
Pirogov Russian National Research Medical University, Ostrovityanova 1, Moscow, Russia
Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, Miklukho-Maklaya 16/10, Moscow, Russia
Skolkovo Institute of Science and Technology, Nobel 3, Moscow, Russia