Great differences in performance and outcome of high-throughput sequencing data analysis platforms for fungal metabarcoding

2018; (39): 29-40. [epub] 20180911

Status: PubMed-not-MEDLINE; Language: English; Country: Bulgaria; Medium: electronic-ecollection

Document type: journal article

Persistent link   https://www.medvik.cz/link/pmid30271256

Along with recent developments in high-throughput sequencing (HTS) technologies and the resulting rapid accumulation of HTS data, there has been a growing need for, and interest in, tools for HTS data processing and communication. In particular, a number of bioinformatics tools have been designed for analysing metabarcoding data, each with specific features, assumptions and outputs. To evaluate how the choice of bioinformatics workflow affects the results, we compared the performance of different analysis platforms on two contrasting HTS data sets. Our analysis revealed that computation time, quality of error filtering and hence the output of a given bioinformatics process depend largely on the platform used. Our results show that none of the bioinformatics workflows appears to filter out accumulated errors and generate Operational Taxonomic Units (OTUs) perfectly, although PipeCraft, LotuS and PIPITS perform better than QIIME2 and Galaxy for the tested fungal amplicon data set. We conclude that the output of each platform requires manual validation of the OTUs by examining the taxonomy assignment values.
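The manual validation step mentioned in the conclusion could be supported by a simple pre-screen that flags OTUs with weak taxonomy assignments for closer inspection. The sketch below is illustrative only: the record fields (`identity`, `coverage`), the function name `flag_for_review` and the threshold values are assumptions made for this example and are not taken from the article or from any of the platforms it compares.

```python
from dataclasses import dataclass


@dataclass
class OtuAssignment:
    """One OTU with its best reference hit (hypothetical record layout)."""
    otu_id: str
    best_hit: str
    identity: float  # percent identity to the best reference hit
    coverage: float  # percent of the OTU sequence covered by the alignment


def flag_for_review(assignments, min_identity=75.0, min_coverage=70.0):
    """Return ids of OTUs whose best hit falls below either threshold.

    The default thresholds are placeholders chosen for illustration;
    suitable values depend on the marker, the reference database and the study.
    """
    return [
        a.otu_id
        for a in assignments
        if a.identity < min_identity or a.coverage < min_coverage
    ]


# Made-up example records, not data from the study.
example = [
    OtuAssignment("OTU_1", "Fungi sp. A", 98.5, 100.0),
    OtuAssignment("OTU_2", "Fungi sp. B", 62.0, 95.0),
    OtuAssignment("OTU_3", "Fungi sp. C", 91.0, 40.0),
]

print(flag_for_review(example))  # ['OTU_2', 'OTU_3']
```

A screen of this kind only narrows down which OTUs to inspect by hand; it does not replace the manual examination of assignment values that the authors recommend.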

