• This record comes from PubMed

Analytical parameters and validation of homopolymer detection in a pyrosequencing-based next generation sequencing system

2018 Feb 21;19(1):158. [epub] 20180221

Language English Country England, Great Britain Media electronic

Document type Journal Article, Research Support, Non-U.S. Gov't, Validation Study

Grant support
K109076 Országos Tudományos Kutatási Alapprogramok - International
00064203 Ministerstvo Zdravotnictví České Republiky - International
CZ.2.16/3.1.00/24022OPPK European Regional Development Fund Prague - International
NF-CZ11-PDP-3-003-2014, LM2015091 and COST-LD14073 Norway Grants - International
UNKP-16-3 IV/4 New National Excellence Program of the Ministry of Human Capacities - International
GINOP-2.3.2-15-2016-00039 Ministry of National Economy, Hungary - International

Links

PubMed 29466940
PubMed Central PMC5822529
DOI 10.1186/s12864-018-4544-x
PII: 10.1186/s12864-018-4544-x

BACKGROUND: Current next-generation sequencing technologies offer high-throughput reads at low cost but still suffer from various sequencing errors. Although pyrosequencing and ion semiconductor sequencing both deliver long, high-quality reads, problems can occur in homopolymer-containing regions: the repeated identical bases are incorporated during the same synthesis cycle, which leads to uncertainty in base calling. The aim of this study was to evaluate the analytical performance of a pyrosequencing-based next-generation sequencing system in detecting homopolymer sequences, using homopolymer-preintegrated plasmid constructs and human DNA samples from patients with cystic fibrosis.

RESULTS: In the plasmid system, average correct genotyping was 95.8% for 4-mers, 87.4% for 5-mers and 72.1% for 6-mers. Despite the low genotyping accuracy observed for 5- and 6-mers, it was possible to generate amplicons with an adequate detection rate of more than 90% for every homopolymer tract. When homopolymers in the CFTR gene were sequenced, average accuracy was 89.3% but varied widely (52.2% to 99.1%). In all but one case, an optimal amplicon-sequencing primer combination could be identified. In that single case, the 7A tract in exon 14 (c.2046_2052), none of the tested primer sets produced the required analytical performance.

CONCLUSIONS: Our results show that pyrosequencing is most reliable for 4-mers and that accuracy deteriorates as homopolymer length increases. With careful primer selection, the NGS system correctly genotyped all but one of the homopolymers in the CFTR gene. In conclusion, we configured a plasmid test system that can be used to assess the genotyping accuracy of NGS devices, and we developed an accurate NGS assay for the molecular diagnosis of CF using self-designed primers for amplification and sequencing.
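The accuracy figures in the abstract can be made concrete with a minimal sketch. It locates homopolymer tracts in a sequence and computes the percentage of calls that report the correct tract length. The function names, the example sequence and the list of called lengths are all hypothetical, and the per-call tally is an assumption for illustration; the record does not describe how the authors computed accuracy.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=4):
    """Return (start, base, length) for each run of identical bases >= min_len."""
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


def length_call_accuracy(called_lengths, true_length):
    """Percentage of calls whose reported run length equals the true length."""
    if not called_lengths:
        raise ValueError("no calls supplied")
    correct = sum(1 for n in called_lengths if n == true_length)
    return 100.0 * correct / len(called_lengths)


# Hypothetical data, not taken from the study: a 7A tract and a 4T tract.
reference = "ACGTAAAAAAAGCTTTTGC"
print(homopolymer_runs(reference))

# Hypothetical called lengths for the 7A tract (under- and over-calls mixed in).
calls_7a = [7, 7, 6, 7, 8, 7, 7, 6, 7, 7]
print(length_call_accuracy(calls_7a, 7))
```

Under this assumed tally, a tract would fall short of a 90% threshold as soon as more than one call in ten misreports its length, which is the kind of shortfall the abstract describes for longer tracts.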

