Analytical parameters and validation of homopolymer detection in a pyrosequencing-based next generation sequencing system
Language English Country England, Great Britain Media electronic
Document type Journal Article, Research Support, Non-U.S. Gov't, Validation Study
Grant support
K109076
Országos Tudományos Kutatási Alapprogramok - International
00064203
Ministerstvo Zdravotnictví České Republiky - International
CZ.2.16/3.1.00/24022OPPK
European Regional Development Fund Prague - International
NF-CZ11-PDP-3-003-2014, LM2015091 and COST-LD14073
Norway Grants - International
UNKP-16-3 IV/4
New National Excellence Program of the Ministry of Human Capacities - International
GINOP-2.3.2-15-2016-00039
Ministry of National Economy, Hungary - International
PubMed
29466940
PubMed Central
PMC5822529
DOI
10.1186/s12864-018-4544-x
PII: 10.1186/s12864-018-4544-x
- Keywords
- Cystic fibrosis, Homopolymer detection, Pyrosequencing
- MeSH
- Cystic Fibrosis genetics MeSH
- Humans MeSH
- Plasmids MeSH
- Cystic Fibrosis Transmembrane Conductance Regulator genetics MeSH
- Sequence Analysis, DNA methods MeSH
- Tandem Repeat Sequences * MeSH
- High-Throughput Nucleotide Sequencing methods MeSH
- Check Tag
- Humans MeSH
- Publication type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Validation Study MeSH
- Names of Substances
- CFTR protein, human MeSH
- Cystic Fibrosis Transmembrane Conductance Regulator MeSH
BACKGROUND: Current next-generation sequencing technologies offer high-throughput reads at low cost but still suffer from various sequencing errors. Although pyrosequencing and ion semiconductor sequencing both deliver long, high-quality reads, problems can occur in homopolymer-containing regions: the repeated identical bases are incorporated during the same synthesis cycle, which leads to uncertainty in base calling. The aim of this study was to evaluate the analytical performance of a pyrosequencing-based next-generation sequencing system in detecting homopolymer sequences, using plasmid constructs with pre-integrated homopolymers and human DNA samples from patients with cystic fibrosis.
RESULTS: In the plasmid system, average correct genotyping was 95.8% for 4-mers, 87.4% for 5-mers and 72.1% for 6-mers. Despite the low genotyping accuracy observed for 5- and 6-mers, it was possible to generate amplicons with an adequate detection rate of more than 90% for every homopolymer tract. When homopolymers in the CFTR gene were sequenced, average accuracy was 89.3% but varied widely (52.2-99.1%). In all but one case, an optimal amplicon-sequencing primer combination could be identified. In that single case, the 7A tract in exon 14 (c.2046_2052), none of the tested primer sets produced the required analytical performance.
CONCLUSIONS: Our results show that pyrosequencing is most reliable for 4-mers and that accuracy deteriorates as homopolymer length increases. With careful primer selection, the NGS system correctly genotyped all but one of the homopolymers in the CFTR gene. We configured a plasmid test system that can be used to assess the genotyping accuracy of NGS devices and developed an accurate NGS assay for the molecular diagnosis of CF using self-designed primers for amplification and sequencing.
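The homopolymer problem described in the background can be illustrated with a minimal Python sketch. It is not the study's pipeline or any vendor's base caller. It assumes an idealized model in which each nucleotide flow produces a signal equal to the number of identical bases incorporated in that flow, and in which a base caller simply rounds the signal to the nearest integer. The function names `flowgram` and `call_bases`, the cyclic flow order "TACG", and the noisy signal values are all assumptions chosen for illustration.

```python
def flowgram(read, flow_order="TACG", max_flows=400):
    """Return ideal (noise-free) per-flow signals for the synthesized strand.

    Each entry is (flowed base, number of bases incorporated). A homopolymer
    of length n is consumed in a single flow and yields one signal of n.
    The flow order and the max_flows cap are illustrative assumptions.
    """
    signals = []
    pos = 0
    flow = 0
    while pos < len(read) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        run = 0
        # All consecutive copies of the flowed base are incorporated at once.
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
        flow += 1
    return signals


def call_bases(signals):
    """Naive base caller: round each flow signal to the nearest run length."""
    return "".join(base * max(0, round(value)) for base, value in signals)


# A 6A tract is read in one flow, so its length rests on a single signal.
ideal = flowgram("GAAAAAAT")
print(ideal)

# Hypothetical noisy signals: the same absolute error that is harmless for a
# single base (1.02 -> 1) shortens the 6-mer by one base (5.4 -> 5).
noisy = [("G", 1.02), ("A", 5.4), ("T", 0.97)]
print(call_bases(noisy))
```

In this simplified model the gap between the signals for n and n+1 bases stays constant while the signal itself grows, so the same relative noise is more likely to shift the call for a longer tract. That is consistent with the decline in accuracy from 4-mers to 6-mers reported in the results, although the real error process is more complex than rounding.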
Department of Laboratory Medicine University of Debrecen Nagyerdei krt 98 Debrecen H 4032 Hungary
Division of Clinical Genetics University of Debrecen Nagyerdei krt 98 Debrecen H 4032 Hungary
Genomic Medicine and Bioinformatic Core Facility University of Debrecen Debrecen Hungary