• This record comes from PubMed

A pile of pipelines: An overview of the bioinformatics software for metabarcoding data analyses

2024 Jul; 24(5): e13847. [epub] 20230807

Language: English; Country: Great Britain, England; Media: print-electronic

Document type: Journal Article, Review

Grant support
P20 GM103449 NIGMS NIH HHS - United States
EXC2124 Deutsche Forschungsgemeinschaft (DFG)
MOBTP198 European Regional Development Fund and the programme Mobilitas Pluss
Genome Canada and Ontario Genomics
U24 CA248454 NCI NIH HHS - United States
21-17749S Grantová Agentura České Republiky

Environmental DNA (eDNA) metabarcoding has gained growing attention as a strategy for monitoring biodiversity in ecology. However, taxa identifications produced through metabarcoding require sophisticated processing of high-throughput sequencing data from taxonomically informative DNA barcodes. Various sets of universal and taxon-specific primers have been developed, extending the usability of metabarcoding across archaea, bacteria and eukaryotes. Accordingly, a multitude of metabarcoding data analysis tools and pipelines have also been developed. Often, several developed workflows are designed to process the same amplicon sequencing data, making it somewhat puzzling to choose one among the plethora of existing pipelines. However, each pipeline has its own specific philosophy, strengths and limitations, which should be considered depending on the aims of any specific study, as well as the bioinformatics expertise of the user. In this review, we outline the input data requirements, supported operating systems and particular attributes of thirty-two amplicon processing pipelines with the goal of helping users to select a pipeline for their metabarcoding projects.

Agroécologie INRAE Institut Agro Univ Bourgogne Franche Comté Dijon France

Aquatic Ecosystem Research University of Duisburg Essen Essen Germany

Center for Applied Microbiome Science Pathogen and Microbiome Institute Northern Arizona University Flagstaff Arizona USA

Center for Forest Mycology Research Northern Research Station US Forest Service Madison Wisconsin USA

Department of Biological and Environmental Science University of Jyväskylä Jyväskylä Finland

Department of Forest Mycology and Plant Pathology Swedish University of Agricultural Sciences Uppsala Sweden

Department of Integrative Biology and Centre for Biodiversity Genomics University of Guelph Guelph Ontario Canada

Department of Marine Microbiology and Biogeochemistry NIOZ Royal Netherlands Institute for Sea Research Texel Netherlands

Department of Population Health and Pathobiology College of Veterinary Medicine and Bioinformatics Research Center North Carolina State University Raleigh North Carolina USA

Earlham Institute Norwich Research Park Norfolk UK

GenPhySE Université de Toulouse INRAE ENVT Castanet Tolosan France

Gut Microbes and Health Quadram Institute Bioscience Norfolk UK

INRAE AgroParisTech GABI Université Paris Saclay Jouy en Josas France

INRAE SIGENAE Jouy en Josas France

Institut de Biologie Intégrative et des Systèmes Université Laval Québec Québec Canada

Institute of Ecology and Earth Sciences University of Tartu Tartu Estonia

KU Leuven Department of Microbiology Immunology and Transplantation Rega Institute for Medical Research Laboratory of Molecular Bacteriology Leuven Belgium

Laboratory of Environmental Microbiology Institute of Microbiology of the Czech Academy of Sciences Praha Czech Republic

Quantitative Biology Center University of Tübingen Tübingen Germany

School of Biological Sciences University of Reading Reading UK

UK Centre for Ecology and Hydrology Oxfordshire UK

Unit of Computational Biology Research and Innovation Centre Fondazione Edmund Mach Italy

Vermont Biomedical Research Network University of Vermont Burlington Vermont USA

Zachary Gold NOAA Pacific Marine Environmental Laboratory Seattle Washington USA

See more in PubMed

Albanese D, Fontana P, De Filippo C, Cavalieri D, & Donati C (2015). MICCA: a complete and accurate software for taxonomic profiling of metagenomic data. Scientific Reports, 5(1), 1–7. 10.1038/srep09743 PubMed DOI PMC

Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, & Lipman DJ (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic acids research, 25(17), 3389–3402. 10.1093/nar/25.17.3389 PubMed DOI PMC

Andújar C, Creedy TJ, Arribas P, López H, Salces-Castellano A, Pérez-Delgado AJ, Vogler AP, & Emerson BC (2021). Validated removal of nuclear pseudogenes and sequencing artefacts from mitochondrial metabarcode data. Molecular Ecology Resources, 21(6), 1772–1787. 10.1111/1755-0998.13337 PubMed DOI

Anslan S, & Tedersoo L (2015). Performance of cytochrome c oxidase subunit I (COI), ribosomal DNA Large Subunit (LSU) and Internal Transcribed Spacer 2 (ITS2) in DNA barcoding of Collembola. European Journal of Soil Biology, 69, 1–7. 10.1016/j.ejsobi.2015.04.001 DOI

Anslan S, Bahram M, Hiiesalu I, & Tedersoo L (2017). PipeCraft: Flexible open-source toolkit for bioinformatics analysis of custom high-throughput amplicon sequencing data. Molecular Ecology Resources, 17(6), e234–e240. 10.1111/1755-0998.12692 PubMed DOI

Anslan S, Mikryukov V, Armolaitis K, Ankuda J, Lazdina D, Makovskis K, … & Tedersoo L (2021). Highly comparable metabarcoding results from MGI-Tech and Illumina sequencing platforms. PeerJ, 9, e12254. 10.7717/peerj.12254 PubMed DOI PMC

Anslan S, Nilsson RH, Wurzbacher C, Baldrian P, Tedersoo Leho, & Bahram M (2018). Great differences in performance and outcome of high-throughput sequencing data analysis platforms for fungal metabarcoding. MycoKeys, (39), 29–40. 10.3897/mycokeys.39.28109 PubMed DOI PMC

Ansorge R, Birolo G, James SA, & Telatin A (2021). Dadaist2: a toolkit to automate and simplify statistical analysis and plotting of metabarcoding experiments. International journal of molecular sciences, 22(10), 5309. 10.3390/ijms22105309 PubMed DOI PMC

Antich A, Palacin C, Wangensteen OS, & Turon X (2021). To denoise or to cluster, that is not the question: Optimizing pipelines for COI metabarcoding and metaphylogeography. BMC Bioinformatics, 22(1), 177. 10.1186/s12859-021-04115-6. PubMed DOI PMC

Antich A, Palacín C, Turon X, & Wangensteen OS (2022). DnoisE: distance denoising by entropy. An open-source parallelizable alternative for denoising sequence datasets. PeerJ, 10, e12758. 10.7717/peerj.12758 PubMed DOI PMC

Asbun AA, Besseling MA, Balzano S, van Bleijswijk JDL, Witte HJ, Villanueva L, & Engelmann JC (2020). Cascabel: A Scalable and Versatile Amplicon Sequence Data Analysis Pipeline Delivering Reproducible and Documented Results. In Frontiers in Genetics (Vol. 11). 10.3389/fgene.2020.489357 PubMed DOI PMC

Bai J, Jhaney I, & Wells J (2019). Developing a reproducible microbiome data analysis pipeline using the Amazon web services cloud for a cancer research group: proof-of-concept study. JMIR medical informatics, 7(4), e14667. 10.2196/14667 PubMed DOI PMC

Bailet B, Apothéloz-Perret-Gentil L, Baričević A, Chonova T, Franc A, Frigerio JM, … & Kahlert M (2020). Diatom DNA metabarcoding for ecological assessment: Comparison among bioinformatics pipelines used in six European countries reveals the need for standardization. Science of the Total Environment, 745, 140948. 10.1016/j.scitotenv.2020.140948 PubMed DOI

Baloğlu B, Chen Z, Elbrecht V, Braukmann T, MacDonald S, & Steinke D (2021). A workflow for accurate metabarcoding using nanopore MinION sequencing. Methods in Ecology and Evolution, 12(5), 794–804. 10.1111/2041-210X.13561 DOI

Baltrušis P, Halvarsson P, & Höglund J (2022). Estimation of the impact of three different bioinformatic pipelines on sheep nemabiome analysis. Parasites & Vectors, 15(1), 1–12. 10.1186/s13071-022-05399-0 PubMed DOI PMC

Banchi E, Ametrano CG, Greco S, Stanković D, Muggia L, & Pallavicini A (2020). PLANiTS: A curated sequence reference dataset for plant ITS DNA metabarcoding. Database, 2020(baz155). 10.1093/database/baz155 PubMed DOI PMC

Ben-David T, Melamed S, Gerson U, & Morin S (2007). ITS2 sequences as barcodes for identifying and analyzing spider mites (Acari: Tetranychidae). Experimental and Applied Acarology, 41(3), 169–181. 10.1186/s13071-022-05399-0 PubMed DOI

Bengtsson-Palme J, Ryberg M, Hartmann M, Branco S, Wang Z, Godhe A, … & Nilsson RH (2013). Improved software detection and extraction of ITS1 and ITS2 from ribosomal ITS sequences of fungi and other eukaryotes for analysis of environmental sequencing data. Methods Ecol Evol. 2013; 4 (10): 914–9. 10.1111/2041-210X.12073 DOI

Bernard M, Rué O, Mariadassou M, & Pascal G (2021). FROGS: a powerful tool to analyse the diversity of fungi with special management of internal transcribed spacers. Briefings in Bioinformatics, 22(6). 10.1093/bib/bbab318 PubMed DOI

Bokulich NA, Kaehler BD, Rideout JR, Dillon M, Bolyen E, Knight R, … & Gregory Caporaso J (2018). Optimizing taxonomic classification of marker-gene amplicon sequences with QIIME 2’s q2-feature-classifier plugin. Microbiome, 6(1), 1–17. 10.1186/s40168-018-0470-z PubMed DOI PMC

Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet CC, Al-Ghalith GA, Alexander H, Alm EJ, Arumugam M, Asnicar F, Bai Y, Bisanz JE, Bittinger K, Brejnrod A, Brislawn CJ, Brown CT, Callahan BJ, Caraballo-Rodríguez AM, Chase J, … Caporaso JG (2019). Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature Biotechnology, 37(8), 852–857. 10.1038/s41587-019-0209-9 PubMed DOI PMC

Boyer F, Mercier C, Bonin A, Le Bras Y, Taberlet P, & Coissac E (2016). obitools: a unix-inspired software package for DNA metabarcoding. Molecular Ecology Resources, 16(1), 176–182. 10.1111/1755-0998.12428 PubMed DOI

Brandt MI, Trouche B, Quintric L, Günther B, Wincker P, Poulain J, & Arnaud‐Haond S (2021). Bioinformatic pipelines combining denoising and clustering tools allow for more comprehensive prokaryotic and eukaryotic metabarcoding. Molecular Ecology Resources, 21(6), 1904–1921. 10.1111/1755-0998.13398 PubMed DOI

Brown SP, Veach AM, Rigdon-Huss AR, Grond K, Lickteig SK, Lothamer K, … & Jumpponen A (2015). Scraping the bottom of the barrel: are rare high throughput sequences artifacts?. fungal ecology, 13, 221–225.

Bruce K, Blackman RC, Bourlat SJ, Hellström M, Bakker J, Bista I, … & Deiner K (2021). A practical guide to DNA-based methods for biodiversity assessment. 10.3897/ab.e68634 DOI

Buchner D, Macher T-H, & Leese F (2022). APSCALE: advanced pipeline for simple yet comprehensive analyses of DNA metabarcoding data. Bioinformatics , 38(20), 4817–4819. 10.1093/bioinformatics/btac588 PubMed DOI PMC

Callahan BJ, Grinevich D, Thakur S, Balamotis MA, & Yehezkel TB (2021). Ultra-accurate microbial amplicon sequencing with synthetic long reads. Microbiome, 9(1), 130. 10.1186/s40168-021-01072-3 PubMed DOI PMC

Callahan BJ, McMurdie PJ, & Holmes SP (2017). Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. The ISME journal, 11(12), 2639–2643. 10.1038/ismej.2017.119 PubMed DOI PMC

Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, & Holmes SP (2016). DADA2: High-resolution sample inference from Illumina amplicon data. Nature Methods, 13(7), 581–583. 10.1038/nmeth.3869 PubMed DOI PMC

Callahan BJ, Wong J, Heiner C, Oh S, Theriot CM, Gulati AS, … & Dougherty MK (2019). High-throughput amplicon sequencing of the full-length 16S rRNA gene with single-nucleotide resolution. Nucleic acids research, 47(18), e103. 10.1093/nar/gkz569 PubMed DOI PMC

Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, … & Knight R (2010). QIIME allows analysis of high-throughput community sequencing data. Nature methods, 7(5), 335–336. 10.1038/nmeth.f.303 PubMed DOI PMC

Carlsen T, Aas AB, Lindner D, Vrålstad T, Schumacher T, & Kauserud H (2012). Don’t make a mista (g) ke: is tag switching an overlooked source of error in amplicon pyrosequencing studies?. Fungal Ecology, 5(6), 747–749.

Carøe C, & Bohmann K (2020). Tagsteady: a metabarcoding library preparation protocol to avoid false assignment of sequences to samples. Molecular Ecology Resources, 20(6), 1620–1631. 10.1111/1755-0998.13227 PubMed DOI

Castaño C, Berlin A, Brandström Durling M, Ihrmark K, Lindahl BD, Stenlid J, … & Olson Å (2020). Optimized metabarcoding with Pacific Biosciences enables semi‐quantitative analysis of fungal communities. New Phytologist, 228(3). 10.1111/nph.16731 PubMed DOI

CBOL Plant Working Group 1, Hollingsworth PM, Forrest LL, Spouge JL, Hajibabaei M, Ratnasingham S, … & Little DP (2009). A DNA barcode for land plants. Proceedings of the National Academy of Sciences, 106(31), 12794–12797. 10.1073/pnas.0905845106 PubMed DOI PMC

Chen S, Zhou Y, Chen Y, & Gu J (2018). fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics, 34(17), i884–i890. 10.1093/bioinformatics/bty560 PubMed DOI PMC

Community, G. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2022 update. Nucleic acids research, 50(W1), W345–W351. 10.1093/nar/gkac247 PubMed DOI PMC

Compson ZG, McClenaghan B, Singer GA, Fahner NA, & Hajibabaei M (2020). Metabarcoding from microbes to mammals: comprehensive bioassessment on a global scale. Frontiers in Ecology and Evolution, 8, 581835. 10.3389/fevo.2020.581835 DOI

Copeland M, Soh J, Puca A, Manning M, & Gollob D (2015). Microsoft azure. New York, NY, USA:: Apress, 3–26.

Couton M, Baud A, Daguin‐Thiébaut C, Corre E, Comtet T, & Viard F (2021). High‐throughput sequencing on preservative ethanol is effective at jointly examining infraspecific and taxonomic diversity, although bioinformatics pipelines do not perform equally. Ecology and evolution, 11(10), 5533–5546. 10.1002/ece3.7453 PubMed DOI PMC

Creedy TJ, Andujar C, Meramveliotakis E, Noguerales V, Overcast I, Papadopoulou A, … & Arribas P (2022). Coming of age for COI metabarcoding of whole organism community DNA: towards bioinformatic harmonisation. Molecular Ecology Resources, 22(3), 847–861. 10.1111/1755-0998.13502 PubMed DOI PMC

Curd EE, Gold Z, Kandlikar GS, Gomer J, Ogden M, O’Connell T, … & Meyer RS (2019). Anacapa Toolkit: An environmental DNA toolkit for processing multilocus metabarcode datasets. Methods in Ecology and Evolution, 10(9), 1469–1475. 10.1111/2041-210X.13214 DOI

De Santiago A, Pereira TJ, Mincks SL, & Bik HM (2022). Dataset complexity impacts both MOTU delimitation and biodiversity estimates in eukaryotic 18S rRNA metabarcoding studies. Environmental DNA, 4(2), 363–384. 10.1002/edn3.255 DOI

Di Tommaso P, Chatzou M, Floden EW, Barja PP, Palumbo E, & Notredame C (2017). Nextflow enables reproducible computational workflows. Nature Biotechnology, 35(4), 316–319. 10.1038/nbt.3820 PubMed DOI

Djemiel C, Dequiedt S, Karimi B, Cottin A, Girier T, El Djoudi Y, Wincker P, Lelièvre M, Mondy S, Chemidlin Prévost-Bouré N, Maron P-A, Ranjard L, & Terrat S (2020). BIOCOM-PIPE: a new user-friendly metabarcoding pipeline for the characterization of microbial diversity from 16S, 18S and 23S rRNA gene amplicons. BMC Bioinformatics, 21(1), 492. 10.1186/s12859-020-03829-3 PubMed DOI PMC

Djemiel C, Plassard D, Terrat S, Crouzet O, Sauze J, Mondy S, … & Maron PA (2020). μgreen-db: a reference database for the 23S rRNA gene of eukaryotic plastids and cyanobacteria. Scientific reports, 10(1), 1–11. 10.1038/s41598-020-62555-1 PubMed DOI PMC

Durling MB, Clemmensen KE, Stenlid J, & Lindahl B (2011). SCATA-An efficient bioinformatic pipeline for species identification and quantification after high-throughput sequencing of tagged amplicons.

Edgar RC (2010). Search and clustering orders of magnitude faster than BLAST. Bioinformatics, 26(19), 2460–2461. 10.1093/bioinformatics/btq461 PubMed DOI

Edgar RC (2016). SINTAX: a simple non-Bayesian taxonomy classifier for 16S and ITS sequences. biorxiv, 074161. 10.1101/074161 DOI

Edgar RC (2016). UNOISE2: improved error-correction for Illumina 16S and ITS amplicon sequencing. BioRxiv, 081257. 10.1101/081257 DOI

Edgar RC (2017). Accuracy of microbial community diversity estimated by closed-and open-reference OTUs. PeerJ, 5, e3889. 10.7717/peerj.3889 PubMed DOI PMC

Edgar RC (2018). Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences. PeerJ, 6, e4652. 10.7717/peerj.4652 PubMed DOI PMC

Edgar RC (2018). UNCROSS2: identification of cross-talk in 16S rRNA OTU tables. BioRxiv, 400762. 10.1101/400762 DOI

Edgar RC, & Flyvbjerg H (2015). Error filtering, pair assembly and error correction for next-generation sequencing reads. Bioinformatics, 31(21), 3476–3482. 10.1093/bioinformatics/btv401 PubMed DOI

Edgar RC, Haas BJ, Clemente JC, Quince C, & Knight R (2011). UCHIME improves sensitivity and speed of chimera detection. Bioinformatics, 27(16), 2194–2200. 10.1093/bioinformatics/btr381 PubMed DOI PMC

Elbrecht V, Taberlet P, Dejean T, Valentini A, Usseglio-Polatera P, Beisel JN, … & Leese F (2016). Testing the potential of a ribosomal 16S marker for DNA metabarcoding of insects. PeerJ, 4, e1966. 10.7717/peerj.1966 PubMed DOI PMC

Escudié F, Auer L, Bernard M, Mariadassou M, Cauquil L, Vidal K, Maman S, Hernandez-Raquet G, Combes S, Pascal G. FROGS: Find, Rapidly, OTUs with Galaxy Solution. Bioinformatics. 2018. Apr 15;34(8):1287–1294. 10.1093/bioinformatics/btx791 PubMed DOI

Frøslev TG, Kjøller R, Bruun HH, Ejrnæs R, Brunbjerg AK, Pietroni C, & Hansen AJ (2017). Algorithm for post-clustering curation of DNA amplicon data yields reliable biodiversity estimates. Nature communications, 8(1), 1–11. 10.1038/s41467-017-01312-x PubMed DOI PMC

Furneaux B, Bahram M, Rosling A, Yorou NS, & Ryberg M (2021). Long‐and short‐read metabarcoding technologies reveal similar spatiotemporal structures in fungal communities. Molecular Ecology Resources, 21(6), 1833–1849. 10.1111/1755-0998.13387 PubMed DOI

Glassman SI, & Martiny JB (2018). Broadscale ecological patterns are robust to use of exact sequence variants versus operational taxonomic units. MSphere, 3(4), e00148–18. 10.1128/mSphere.00148-18 PubMed DOI PMC

Gold Z, Curd EE, Goodwin KD, Choi ES, Frable BW, Thompson AR, … & Barber PH (2021). Improving metabarcoding taxonomic assignment: A case study of fishes in a large marine ecosystem. Molecular ecology resources, 21(7), 2546–2564. 10.1111/1755-0998.13450 PubMed DOI

González A, Dubut V, Corse E, Mekdad R, Dechatre T, Castet U, … & Meglécz E (2023). VTAM: A robust pipeline for validating metabarcoding data using controls. Computational and Structural Biotechnology Journal. 10.1016/j.csbj.2023.01.034 PubMed DOI PMC

Gweon HS, Oliver A, Taylor J, Booth T, Gibbs M, Read DS, Griffiths RI, & Schonrogge K (2015). PIPITS: an automated pipeline for analyses of fungal internal transcribed spacer sequences from the Illumina sequencing platform. Methods in Ecology and Evolution / British Ecological Society, 6(8), 973–980. 10.1111/2041-210X.12399 PubMed DOI PMC

Hajibabaei M, Shokralla S, Zhou X, Singer GA, & Baird DJ (2011). Environmental barcoding: a next-generation sequencing approach for biomonitoring applications using river benthos. PLoS one, 6(4), e17497. 10.1371/journal.pone.0017497 PubMed DOI PMC

Harrison JP, Chronopoulou PM, Salonen IS, Jilbert T, & Koho KA (2021). 16S and 18S rRNA gene metabarcoding provide congruent information on the responses of sediment communities to eutrophication. Frontiers in Marine Science, 8, 708716. 10.3389/fmars.2021.708716 DOI

Hebert Paul DN, et al. “Biological identifications through DNA barcodes.” Proceedings of the Royal Society of London. Series B: Biological Sciences 270.1512 (2003): 313–321. 10.1098/rspb.2002.2218 PubMed DOI PMC

Heeger F, Bourne EC, Baschien C, Yurkov A, Bunk B, Spröer C, … & Monaghan MT (2018). Long‐read DNA metabarcoding of ribosomal RNA in the analysis of fungi from aquatic environments. Molecular Ecology Resources, 18(6), 1500–1514. 10.1111/1755-0998.12937 PubMed DOI

Hildebrand F, Tadeo R, Voigt AY, Bork P, & Raes J (2014). LotuS: an efficient and user-friendly OTU processing pipeline. Microbiome, 2(1), 1–7. 10.1186/2049-2618-2-30 PubMed DOI PMC

Hleap JS, Littlefair JE, Steinke D, Hebert PD, & Cristescu ME (2021). Assessment of current taxonomic assignment strategies for metabarcoding eukaryotes. Molecular Ecology Resources, 21(7), 2190–2203. 10.1111/1755-0998.13407 PubMed DOI

Hupfauf S, Etemadi M, Juárez MF-D, Gómez-Brandón M, Insam H, & Podmirseg SM (2020). CoMA – an intuitive and user-friendly pipeline for amplicon-sequencing data analysis. In PLOS ONE (Vol. 15, Issue 12, p. e0243241). 10.1371/journal.pone.0243241 PubMed DOI PMC

Huse SM, Welch DM, Morrison HG, & Sogin ML (2010). Ironing out the wrinkles in the rare biosphere through improved OTU clustering. Environmental microbiology, 12(7), 1889–1898. 10.1111/j.1462-2920.2010.02193.x PubMed DOI PMC

Huson DH, Auch AF, Qi J, & Schuster SC (2007). MEGAN analysis of metagenomic data. Genome Research, 17(3), 377–386. 10.1101/gr.5969107 PubMed DOI PMC

Hussain A, & Aleem M (2018). GoCJ: Google cloud jobs dataset for distributed and cloud computing infrastructures. Data, 3(4), 38. 10.3390/data3040038 DOI

Kaehler BD, Bokulich NA, McDonald D, Knight R, Caporaso JG, & Huttley GA (2019). Species abundance information improves sequence taxonomy classification accuracy. Nature communications, 10(1), 4643. 10.1038/s41467-019-12669-6 PubMed DOI PMC

Kang W, Anslan S, Börner N, Schwarz A, Schmidt R, Künzel S, … & Schwalb A (2021). Diatom metabarcoding and microscopic analyses from sediment samples at Lake Nam Co, Tibet: The effect of sample-size and bioinformatics on the identified communities. Ecological Indicators, 121, 107070. 10.1016/j.ecolind.2020.107070 DOI

Knight R, Vrbanac A, Taylor BC, Aksenov A, Callewaert C, Debelius J, Gonzalez A, Kosciolek T, McCall L-I, McDonald D, Melnik AV, Morton JT, Navas J, Quinn RA, Sanders JG, Swafford AD, Thompson LR, Tripathi A, Xu ZZ, … Dorrestein PC (2018). Best practices for analysing microbiomes. Nature Reviews. Microbiology, 16(7), 410–422. 10.1038/s41579-018-0029-9 PubMed DOI

Koster J, & Rahmann S (2012). Snakemake--a scalable bioinformatics workflow engine. In Bioinformatics (Vol. 28, Issue 19, pp. 2520–2522). 10.1093/bioinformatics/bts480 PubMed DOI

Kurtzer GM, Sochat V, & Bauer MW (2017). Singularity: Scientific containers for mobility of compute. PloS one, 12(5), e0177459. 10.1371/journal.pone.0177459 PubMed DOI PMC

Laehnemann D, Borkhardt A, & McHardy AC (2016). Denoising DNA deep sequencing data—high-throughput sequencing errors and their correction. Briefings in bioinformatics, 17(1), 154–179. 10.1093/bib/bbv029 PubMed DOI PMC

Lear G, Dickie I, Banks J, Boyer S, Buckley HL, Buckley TR, … & Holdaway R (2018). Methods for the extraction, storage, amplification and sequencing of DNA from environmental samples. New Zealand Journal of Ecology, 42(1), 10–50A. 10.20417/nzjecol.42.9 DOI

Lindgreen S (2012). AdapterRemoval: easy cleaning of next-generation sequencing reads. BMC research notes, 5(1), 1–7. 10.1186/1756-0500-5-337 PubMed DOI PMC

Liu J, & Zhang H (2021). Combining multiple markers in environmental DNA metabarcoding to assess deep-sea benthic biodiversity. Frontiers in Marine Science, 8, 684955. 10.3389/fmars.2021.684955 DOI

Loos D, Zhang L, Beemelmanns C, Kurzai O, & Panagiotou G (2021). DAnIEL: A User-Friendly Web Server for Fungal ITS Amplicon Sequencing Data. Frontiers in Microbiology, 12, 720513. 10.3389/fmicb.2021.720513 PubMed DOI PMC

Mahé F, Czech L, Stamatakis A, Quince C, de Vargas C, Dunthorn M, & Rognes T (2022). Swarm v3: towards tera-scale amplicon clustering. Bioinformatics, 38(1), 267–269. 10.1093/bioinformatics/btab493 PubMed DOI PMC

Marquina D, Esparza‐Salas R, Roslin T, & Ronquist F (2019). Establishing arthropod community composition using metabarcoding: Surprising inconsistencies between soil samples and preservative ethanol and homogenate from Malaise trap catches. Molecular ecology resources, 19(6), 1516–1530. 10.1111/1755-0998.13071 PubMed DOI PMC

Martin M (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet. journal, 17(1), 10–12. 10.14806/ej.17.1.200 DOI

Mathon L, Valentini A, Guérin PE, Normandeau E, Noel C, Lionnet C, … & Manel S (2021). Benchmarking bioinformatic tools for fast and accurate eDNA metabarcoding species identification. Molecular Ecology Resources, 21(7), 2565–2579. 10.1111/1755-0998.13430 PubMed DOI

McGee KM, Robinson CV, & Hajibabaei M (2019). Gaps in DNA-based biomonitoring across the globe. Frontiers in Ecology and Evolution, 7, 337. 10.3389/fevo.2019.00337 DOI

Mikryukov V, Anslan S, Tedersoo L NextITS: a pipeline for metabarcoding fungi and other eukaryotes with full-length ITS sequenced with PacBio. https://github.com/vmikk/NextITS

Minerovic AD, Potapova MG, Sales CM, Price JR, & Enache MD (2020). 18S-V9 DNA metabarcoding detects the effect of water-quality impairment on stream biofilm eukaryotic assemblages. Ecological Indicators, 113, 106225. 10.1016/j.ecolind.2020.106225 DOI

Miya M, Gotoh RO, & Sado T (2020). MiFish metabarcoding: a high-throughput approach for simultaneous detection of multiple fish species from environmental DNA and other samples. Fisheries Science, 86(6), 939–970. 10.1007/s12562-020-01461-x DOI

Mölder F, Jablonski KP, Letcher B, Hall MB, Tomkins-Tinch CH, Sochat V, Forster J, Lee S, Twardziok SO, Kanitz A, Wilm A, Holtgrewe M, Rahmann S, Nahnsen S, & Köster J (2021). Sustainable data analysis with Snakemake. F1000Research, 10, 33. 10.12688/f1000research.29032.2 PubMed DOI PMC

Mousavi‐Derazmahalleh M, Stott A, Lines R, Peverley G, Nester G, Simpson T, … & Christophersen CT (2021). eDNAFlow, an automated, reproducible and scalable workflow for analysis of environmental DNA sequences exploiting Nextflow and Singularity. Molecular Ecology Resources, 21(5), 1697–1704. 10.1111/1755-0998.13356 PubMed DOI

Nearing JT, Douglas GM, Comeau AM, & Langille MG (2018). Denoising the Denoisers: an independent evaluation of microbiome sequence error-correction approaches. PeerJ, 6, e5364. 10.7717/peerj.5364 PubMed DOI PMC

Nilsson RH, Anslan S, Bahram M, Wurzbacher C, Baldrian P, & Tedersoo L (2019). Mycobiome diversity: high-throughput sequencing and identification of fungi. Nature Reviews. Microbiology, 17(2), 95–109. 10.1038/s41579-018-0116-y PubMed DOI

Nilsson RH, Wurzbacher C, Bahram M, Coimbra VR, Larsson E, Tedersoo L, … & Abarenkov K (2016). Top 50 most wanted fungi. MycoKeys, (12), 29–40. 10.3897/mycokeys.12.7553 DOI

Özkurt E, Fritscher J, Soranzo N, Ng DYK, Davey RP, Bahram M, & Hildebrand F (2022). LotuS2: an ultrafast and highly accurate tool for amplicon sequencing analysis. Microbiome, 10(1), 176. 10.1186/s40168-022-01365-1 PubMed DOI PMC

Palmer JM, Jusino MA, Banik MT, & Lindner DL (2018). Non-biological synthetic spike-in controls and the AMPtk software pipeline improve mycobiome data. PeerJ, 6, e4925. 10.7717/peerj.4925 PubMed DOI PMC

Pauvert C, Buee M, Laval V, Edel-Hermann V, Fauchery L, Gautier A, … & Vacher C (2019). Bioinformatics matters: The accuracy of plant and soil fungal community data is highly dependent on the metabarcoding pipeline. Fungal Ecology, 41, 23–33. 10.1016/j.funeco.2019.03.005 DOI

Plummer E, Twin J, Bulach DM, Garland SM, & Tabrizi SN (2015). A comparison of three bioinformatics pipelines for the analysis of preterm gut microbiota using 16S rRNA gene sequencing data. Journal of Proteomics & Bioinformatics, 8(12), 283–291. 10.3389/fmicb.2020.01262 DOI

Pollock J, Glendinning L, Wisedchanwet T, & Watson M (2018). The Madness of Microbiome: Attempting To Find Consensus “Best Practice” for 16S Microbiome Studies. Applied and Environmental Microbiology, 84(7). 10.1128/AEM.02627-17 PubMed DOI PMC

Porter TM, & Hajibabaei M (2018). Automated high throughput animal CO1 metabarcode classification. Scientific Reports, 8(1), 4226. 10.1038/s41598-018-22505-4 PubMed DOI PMC

Porter TM, & Hajibabaei M (2020). Putting COI metabarcoding in context: The utility of exact sequence variants (ESVs) in biodiversity analysis. Frontiers in Ecology and Evolution, 8, 248. 10.3389/fevo.2020.00248 DOI

Porter TM, & Hajibabaei M (2021). Profile hidden Markov model sequence analysis can help remove putative pseudogenes from DNA barcoding and metabarcoding datasets. BMC bioinformatics, 22(1), 1–20. 10.1186/s12859-021-04180-x PubMed DOI PMC

Porter TM, & Hajibabaei M (2022). MetaWorks: A flexible, scalable bioinformatic pipeline for high-throughput multi-marker biodiversity assessments. PloS One, 17(9), e0274260. 10.1371/journal.pone.0274260 PubMed DOI PMC

Prodan A, Tremaroli V, Brolin H, Zwinderman AH, Nieuwdorp M, & Levin E (2020). Comparing bioinformatic pipelines for microbial 16S rRNA amplicon sequencing. PLoS One, 15(1), e0227434. 10.1371/journal.pone.0227434 PubMed DOI PMC

Ratnasingham S, & Hebert PD (2007). BOLD: The Barcode of Life Data System (http://www.barcodinglife.org). Molecular ecology notes, 7(3), 355–364. 10.1111/j.1471-8286.2007.01678.x PubMed DOI PMC

Reeder J, & Knight R (2009). The’rare biosphere’: a reality check. Nature methods, 6(9), 636–637. 10.1038/nmeth0909-636 PubMed DOI

Reitmeier S, Hitch TC, Treichel N, Fikas N, Hausmann B, Ramer-Tait AE, … & Clavel T (2021). Handling of spurious sequences affects the outcome of high-throughput 16S rRNA gene amplicon profiling. ISME Communications, 1(1), 1–12. 10.1038/s43705-021-00033-z PubMed DOI PMC

Richardson RT, Bengtsson‐Palme J, & Johnson RM (2017). Evaluating and optimizing the performance of software commonly used for the taxonomic classification of DNA metabarcoding sequence data. Molecular Ecology Resources, 17(4), 760–769. 10.1111/1755-0998.12628 PubMed DOI

Rimet F, Gusev E, Kahlert M, Kelly MG, Kulikovskiy M, Maltsev Y, … & Bouchez A (2019). Diat. barcode, an open-access curated barcode library for diatoms. Scientific Reports, 9(1), 15116. 10.1038/s41598-019-51500-6 PubMed DOI PMC

Rivers AR, Weber KC, Gardner TG, Liu S, & Armstrong SD (2018). ITSxpress: Software to rapidly trim internally transcribed spacer sequences with quality scores for marker gene analysis. F1000Research, 7. 10.12688/f1000research.15704.1 PubMed DOI PMC

Rodriguez‐Martinez S, Klaminder J, Morlock MA, Dalén L, & Huang DT (2022). The topological nature of tag jumping in environmental DNA metabarcoding studies. Molecular Ecology Resources. 10.1111/1755-0998.13745 PubMed DOI

Rognes T, Flouri T, Nichols B, Quince C, & Mahé F (2016). VSEARCH: a versatile open source tool for metagenomics. PeerJ, 4, e2584. 10.7717/peerj.2584 PubMed DOI PMC

Rosen GL, Reichenberger ER, & Rosenfeld AM (2011). NBC: the Naive Bayes Classification tool webserver for taxonomic classification of metagenomic reads. Bioinformatics, 27(1), 127–129. 10.1093/bioinformatics/btq619 PubMed DOI PMC

Sato M, Sugaya N, Murakami H, Imaizumi A, Aburatani S, Akutsu T, & Horimoto K (2004). Remote homolog detection by match-node profile in hidden Markov model. In Callaos N, Horimoto K, Chen J, & Chan AKS (Eds.), 8th World Multi-Conference on Systemics, Cybernetics and Informatics, Vol Vii, Proceedings: Applications of Informatics and Cybernetics in Science and Engineering (pp. 27–34). Int Inst Informatics & Systemics. http://www.webofscience.com/wos/alldb/full-record/WOS:000227682900005

Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, … & Weber CF (2009). Introducing mothur: open-source, platform-independent, community-supported software for describing and comparing microbial communities. Applied and environmental microbiology, 75(23), 7537–7541. 10.1128/AEM.01541-09 PubMed DOI PMC

Schnell IB, Bohmann K, & Gilbert MTP (2015). Tag jumps illuminated–reducing sequence‐to‐sample misidentifications in metabarcoding studies. Molecular ecology resources, 15(6), 1289–1303. 10.1111/1755-0998.12402 PubMed DOI

Singer GAC, Fahner NA, Barnes JG, McCarthy A, & Hajibabaei M (2019). Comprehensive biodiversity analysis via ultra-deep patterned flow cell technology: a case study of eDNA metabarcoding seawater. Scientific reports, 9(1), 5991. 10.1038/s41598-019-42455-9 PubMed DOI PMC

Song H, Buhay JE, Whiting MF, & Crandall KA (2008). Many species in one: DNA barcoding overestimates the number of species when nuclear mitochondrial pseudogenes are coamplified. Proceedings of the National Academy of Sciences of the United States of America, 105(36), 13486–13491. 10.1073/pnas.0803076105 PubMed DOI PMC

Staats M, Arulandhu AJ, Gravendeel B, Holst-Jensen A, Scholtens I, Peelen T, Prins TW, & Kok E (2016). Advances in DNA metabarcoding for food and wildlife forensic species identification. Analytical and Bioanalytical Chemistry, 408(17), 4615–4630. 10.1007/s00216-016-9595-8 PubMed DOI PMC

Straub D, Blackwell N, Langarica-Fuentes A, Peltzer A, Nahnsen S, & Kleindienst S (2020). Interpretations of Environmental Microbial Community Studies Are Biased by the Selected 16S rRNA (Gene) Amplicon Sequencing Pipeline. Frontiers in Microbiology, 11, 550420. 10.3389/fmicb.2020.550420 PubMed DOI PMC

Taberlet P, Bonin A, Zinger L, & Coissac E (2018). Environmental DNA: For biodiversity research and monitoring. Oxford University Press. 10.1093/oso/9780198767220.001.0001 DOI

Taberlet P, Coissac E, Hajibabaei M, & Rieseberg LH (2012). Environmental DNA. Molecular Ecology, 21(8), 1789–1793. 10.1111/j.1365-294X.2012.05542.x PubMed DOI

Taberlet P, Coissac E, Pompanon F, Brochmann C, & Willerslev E (2012). Towards next‐generation biodiversity assessment using DNA metabarcoding. Molecular Ecology, 21(8), 2045–2050. 10.1111/j.1365-294X.2012.05470.x PubMed DOI

Taberlet P, Coissac E, Pompanon F, Gielly L, Miquel C, Valentini A, … & Willerslev E (2007). Power and limitations of the chloroplast trnL (UAA) intron for plant DNA barcoding. Nucleic Acids Research, 35(3), e14. 10.1093/nar/gkl938 PubMed DOI PMC

Tedersoo L, & Anslan S (2019). Towards PacBio‐based pan‐eukaryote metabarcoding using full‐length ITS sequences. Environmental Microbiology Reports, 11(5), 659–668. 10.1111/1758-2229.12776 PubMed DOI

Tedersoo L, Albertsen M, Anslan S, & Callahan B (2021). Perspectives and benefits of high-throughput long-read sequencing in microbial ecology. Applied and Environmental Microbiology, 87(17), e00626-21. 10.1128/AEM.00626-21 PubMed DOI PMC

Tedersoo L, Bahram M, Zinger L, Nilsson RH, Kennedy PG, Yang T, … & Mikryukov V (2022). Best practices in metabarcoding of fungi: From experimental design to results. Molecular Ecology, 31(10), 2769–2795. 10.1111/mec.16460 PubMed DOI

Terrat S, Djemiel C, Journay C, Karimi B, Dequiedt S, Horrigue W, … & Ranjard L (2020). ReClustOR: a re‐clustering tool using an open‐reference method that improves operational taxonomic unit definition. Methods in Ecology and Evolution, 11(1), 168–180. 10.1111/2041-210X.13316 DOI

Thompson LR, Anderson SR, Den Uyl PA, Patin NV, Lim SJ, Sanderson G, & Goodwin KD (2022). Tourmaline: A containerized workflow for rapid and iterable amplicon sequence analysis using QIIME 2 and Snakemake. GigaScience, 11, giac066. 10.1093/gigascience/giac066 PubMed DOI PMC

Thomsen PF, & Sigsgaard EE (2019). Environmental DNA metabarcoding of wild flowers reveals diverse communities of terrestrial arthropods. Ecology and Evolution, 9(4), 1665–1679. 10.1002/ece3.4809 PubMed DOI PMC

Vasar M, Davison J, Neuenkamp L, Sepp S-K, Young JPW, Moora M, & Öpik M (2021). User-friendly bioinformatics pipeline gDAT (graphical downstream analysis tool) for analysing rDNA sequences. Molecular Ecology Resources, 21(4), 1380–1392. 10.1111/1755-0998.13340 PubMed DOI

Vetrovský T, Baldrian P, & Morais D (2018). SEED 2: a user-friendly platform for amplicon high-throughput sequencing data analyses. Bioinformatics, 34(13), 2292–2294. 10.1093/bioinformatics/bty071 PubMed DOI PMC

Vu D, Nilsson RH, & Verkley GJM (2022). Dnabarcoder: An open-source software package for analysing and predicting DNA sequence similarity cutoffs for fungal sequence identification. Molecular Ecology Resources. 10.1111/1755-0998.13651 PubMed DOI PMC

Wang Q, Garrity GM, Tiedje JM, & Cole JR (2007). Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied and Environmental Microbiology, 73(16), 5261–5267. 10.1128/AEM.00062-07 PubMed DOI PMC

Weigand H, Beermann AJ, Čiampor F, Costa FO, Csabai Z, Duarte S, … & Ekrem T (2019). DNA barcode reference libraries for the monitoring of aquatic biota in Europe: Gap-analysis and recommendations for future work. Science of the Total Environment, 678, 499–524. 10.1016/j.scitotenv.2019.04.247 PubMed DOI

Westfall KM, Therriault TW, & Abbott CL (2020). A new approach to molecular biosurveillance of invasive species using DNA metabarcoding. Global Change Biology, 26(2), 1012–1022. 10.1111/gcb.14886 PubMed DOI

Wratten L, Wilm A, & Göke J (2021). Reproducible, scalable, and shareable analysis pipelines with bioinformatics workflow managers. Nature Methods, 18(10), 1161–1168. 10.1038/s41592-021-01254-9 PubMed DOI

Zafeiropoulos H, Gargan L, Hintikka S, Pavloudi C, & Carlsson J (2021). The Dark mAtteR iNvestigator (DARN) tool: getting to know the known unknowns in COI amplicon data. Metabarcoding and Metagenomics, 5, e69657. 10.3897/mbmg.5.69657 DOI

Zafeiropoulos H, Viet HQ, Vasileiadou K, Potirakis A, Arvanitidis C, Topalis P, … & Pafilis E (2020). PEMA: a flexible Pipeline for Environmental DNA Metabarcoding Analysis of the 16S/18S ribosomal RNA, ITS, and COI marker genes. GigaScience, 9(3), giaa022. 10.1093/gigascience/giaa022 PubMed DOI PMC

Zinger L, Lionnet C, Benoiston AS, Donald J, Mercier C, & Boyer F (2021). metabaR: an R package for the evaluation and improvement of DNA metabarcoding data quality. Methods in Ecology and Evolution, 12(4), 586–592. 10.1111/2041-210X.13552 DOI
