Rapid Bacterial Species Delineation Based on Parameters Derived From Genome Numerical Representations
Status PubMed-not-MEDLINE Language English Country Netherlands Media electronic-ecollection
Document type journal article
PubMed
30728919
PubMed Central
PMC6352304
DOI
10.1016/j.csbj.2018.12.006
PII: S2001-0370(18)30122-3
- Keywords
- Bacterial genome, Comparative genomics, Genomic signal processing, Numerical representation, Species delineation
- Publication type
- Journal article
Species delineation based on bacterial genomes is an essential part of research on prokaryotes. In silico genome-to-genome comparison methods are computationally demanding, but much less tedious and error-prone than wet-lab methods. In this paper, we present a novel method for the delineation of bacterial genomes based on genomic signal processing. The proposed method uses two numerical representations of whole bacterial genomes, the phase signal and the cumulated phase signal, from which four parameters are derived for each genome. The parameters characterize a genome, and their calculation is independent of the other genomes in a delineation dataset. The delineation itself is performed by calculating the average similarity of the parameters. The method was statistically verified on 1826 bacterial genomes. A similarity threshold of 96% was set based on the receiver operating characteristic curve, which featured a sensitivity of 99.78% and a specificity of 97.25%. Additionally, a comparative analysis with standard delineation tools was conducted on another 33 bacterial genomes, because these tools were not able to process the dataset of 1826 genomes on a desktop computer. The proposed method achieved comparable or better delineation results than the standard tools. Besides the excellent delineation results, another great advantage of the method is its small computational demands, which enable the delineation of thousands of genomes on a desktop computer. The calculation of the parameters takes tens of minutes for thousands of genomes. Moreover, the parameters can be calculated in advance and stored in a database, so that the delineation itself is then completed in a matter of seconds.
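The workflow described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not state: the nucleotide-to-phase mapping follows the complex representation commonly attributed to Cristea (A = 1+j, C = -1-j, G = -1+j, T = 1-j), the four genome parameters are treated as opaque numbers because their definitions are not given here, and the per-parameter similarity formula and the function names (`cumulated_phase`, `average_similarity`, `same_species`) are hypothetical. Only the 96% threshold is taken from the abstract.

```python
import math
from itertools import accumulate

# Assumed mapping (not stated in the abstract): phases of the complex
# representation A = 1+j, C = -1-j, G = -1+j, T = 1-j.
PHASE = {
    "A": math.pi / 4,
    "C": -3 * math.pi / 4,
    "G": 3 * math.pi / 4,
    "T": -math.pi / 4,
}


def cumulated_phase(seq):
    """Running sum of per-nucleotide phases; unknown symbols add 0."""
    return list(accumulate(PHASE.get(base, 0.0) for base in seq.upper()))


def average_similarity(params_a, params_b):
    """Hypothetical similarity: mean of (1 - relative difference) per parameter."""
    sims = []
    for a, b in zip(params_a, params_b):
        scale = max(abs(a), abs(b))
        sims.append(1.0 if scale == 0 else 1.0 - abs(a - b) / scale)
    return sum(sims) / len(sims)


def same_species(params_a, params_b, threshold=0.96):
    """Apply the 96% average-similarity threshold reported in the abstract."""
    return average_similarity(params_a, params_b) >= threshold


if __name__ == "__main__":
    print(cumulated_phase("ACGT"))
    # Placeholder parameter vectors, not real genome data.
    print(same_species((100.0, 50.0, 10.0, 1.0), (98.0, 50.0, 10.0, 1.0)))
```

Because each parameter vector depends only on its own genome, vectors could be computed once and stored, leaving only the inexpensive pairwise comparison for delineation time, in line with the database idea mentioned in the abstract.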