Identification of highly variable sequence fragments in unmapped reads for rapid bacterial genotyping
Language: English
Country: Great Britain, England
Medium: electronic
Document type: journal article
Grant support: NV19-09-00430, Ministerstvo Zdravotnictví České Republiky (Ministry of Health of the Czech Republic)
PubMed: 36581824
PubMed Central: PMC9798552
DOI: 10.1186/s12864-022-08550-4
PII: 10.1186/s12864-022-08550-4
- Keywords
- Bacterial genotyping, De novo assembly, Genome assembly, Mini-MLST, Multilocus sequence typing, Unmapped reads
- MeSH
- Bacteria* / genetics
- Escherichia coli / genetics
- Genome*
- Genotype
- Multilocus Sequence Typing / methods
- Bacterial Typing Techniques / methods
- Publication type
- Journal Article
BACKGROUND: Bacterial genotyping is a crucial process in outbreak investigation and epidemiological studies. Several typing methods, such as pulsed-field gel electrophoresis, multilocus sequence typing (MLST) and whole genome sequencing, are currently used in routine clinical practice. However, these methods are costly, time-consuming and computationally demanding. An alternative is mini-MLST, a quick, cost-effective and robust method based on high-resolution melting analysis. Nevertheless, no standardized approach exists for identifying markers suitable for mini-MLST. Here, we present a pipeline for detecting variable fragments in unmapped reads, based on a modified hybrid assembly approach that uses data from a single sequencing platform.
RESULTS: In routine reference-based assembly, highly variable reads fail to align to the reference sequence and remain unmapped. When these unmapped reads are assembled de novo, variable genomic regions can be located in the resulting scaffolds. By calculating variability rates, it is possible to find a highly variable region with the same discriminatory power as the seven housekeeping gene fragments used in MLST. We demonstrate this by identifying one variable fragment in de novo assembled scaffolds of 21 Escherichia coli genomes and three variable regions in scaffolds of 31 Klebsiella pneumoniae genomes. For each identified fragment, melting temperatures are calculated with the nearest neighbor method to verify the discriminatory power of mini-MLST.
CONCLUSIONS: We present a pipeline for a modified hybrid assembly approach that combines reference-based mapping with de novo assembly of unmapped reads. This approach can be used to identify highly variable genomic fragments in unmapped reads. The identified variable regions can then be used in efficient laboratory typing methods such as mini-MLST, which offer high discriminatory power and can fully replace expensive methods such as MLST. Results can be delivered in a shorter time, enabling prompt infection monitoring in clinical practice.
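The nearest neighbor melting temperature calculation mentioned in the results can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the unified nearest-neighbor parameters of SantaLucia (1998) at 1 M NaCl, a two-state melting model, and a hypothetical total strand concentration of 0.5 µM; the abstract does not state which parameter table or conditions were used. The function names and the two allele sequences are invented for illustration. For amplicon-length fragments the two-state model is only a rough approximation, so the values are best read as a relative ranking of alleles, not as predicted melting points.

```python
import math

R = 1.987  # gas constant in cal/(mol*K)

# Assumed parameter set: SantaLucia (1998) unified nearest-neighbor values, 1 M NaCl.
# Each entry is (dH in kcal/mol, dS in cal/(mol*K)), keyed by the 5'->3' dinucleotide.
# A stack and its reverse complement (e.g. CA and TG) share the same values.
NN = {
    "AA": (-7.9, -22.2), "TT": (-7.9, -22.2),
    "AT": (-7.2, -20.4),
    "TA": (-7.2, -21.3),
    "CA": (-8.5, -22.7), "TG": (-8.5, -22.7),
    "GT": (-8.4, -22.4), "AC": (-8.4, -22.4),
    "CT": (-7.8, -21.0), "AG": (-7.8, -21.0),
    "GA": (-8.2, -22.2), "TC": (-8.2, -22.2),
    "CG": (-10.6, -27.2),
    "GC": (-9.8, -24.4),
    "GG": (-8.0, -19.9), "CC": (-8.0, -19.9),
}
INIT_GC = (0.1, -2.8)   # initiation term for a terminal G-C pair
INIT_AT = (2.3, 4.1)    # initiation term for a terminal A-T pair
SYMMETRY_DS = -1.4      # entropy correction for self-complementary duplexes

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]


def nn_tm(seq, strand_conc=0.5e-6):
    """Two-state nearest-neighbor Tm in degrees Celsius at 1 M NaCl.

    strand_conc is the total strand concentration in mol/L (hypothetical default).
    """
    seq = seq.upper()
    if len(seq) < 2 or set(seq) - set("ACGT"):
        raise ValueError("sequence must contain at least two unambiguous bases")

    dh = ds = 0.0
    # Initiation terms, one for each end of the duplex.
    for end in (seq[0], seq[-1]):
        h, s = INIT_GC if end in "GC" else INIT_AT
        dh += h
        ds += s
    # Sum the stacking contributions of all overlapping dinucleotides.
    for i in range(len(seq) - 1):
        h, s = NN[seq[i:i + 2]]
        dh += h
        ds += s

    self_comp = seq == reverse_complement(seq)
    if self_comp:
        ds += SYMMETRY_DS
    x = 1 if self_comp else 4  # concentration factor of the two-state model
    # dH is converted from kcal/mol to cal/mol before dividing by the entropy term.
    return 1000.0 * dh / (ds + R * math.log(strand_conc / x)) - 273.15


if __name__ == "__main__":
    # Hypothetical allele fragments, invented for illustration only.
    alleles = {
        "allele_1": "GATTCGCGGATACCGTTAGC",
        "allele_2": "GATTCGCAGATACCATTAGC",
    }
    for name, seq in alleles.items():
        print(name, round(nn_tm(seq), 1))
```

Alleles whose computed temperatures differ clearly are candidates for separation by high-resolution melting; alleles with near-identical values would not be distinguishable by this criterion alone.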