Vipie: web pipeline for parallel characterization of viral populations from multiple NGS samples
Language: English
Country: England, Great Britain
Media: electronic
Document type: Journal Article; Research Support, Non-U.S. Gov't
PubMed: 28506246
PubMed Central: PMC5430618
DOI: 10.1186/s12864-017-3721-7
PII: 10.1186/s12864-017-3721-7
- Keywords
- Assembly, Metagenomics, NGS analysis, Parallel processing, Viral dark matter, Viromes, Virus, Visualization
- MeSH
- Genetic Variation
- Genomics / methods
- Internet*
- Humans
- Microbiota / genetics
- Software*
- Viruses / genetics
- High-Throughput Nucleotide Sequencing*
- Check Tag
- Humans
- Publication type
- Journal Article
- Research Support, Non-U.S. Gov't
BACKGROUND: Next-generation sequencing (NGS) technology allows laboratories to investigate virome composition in clinical and environmental samples in a culture-independent way. Bioinformatic tools are needed that can process virome sequencing data from many samples in parallel using identical methods; this is especially important in studies of multifactorial diseases and in side-by-side comparisons of laboratory protocols.
RESULTS: We have developed a web-based application that allows direct upload of sequences from multiple virome samples with custom parameters. The samples are then processed in parallel using an identical protocol and can easily be reanalyzed. The pipeline performs de novo assembly, taxonomic classification of viruses, and sample analyses based on user-defined grouping categories. Virus abundance tables are produced by cross-validation, in which the sequencing reads are remapped to the union of all observed reference viruses. In addition, read sets and reports are created after unmapped reads are processed against known human and bacterial ribosomal references. Secured interactive results are plotted dynamically as population and diversity charts, clustered heatmaps, and a sortable, searchable abundance table.
CONCLUSIONS: The Vipie web application is a unique tool for multi-sample metagenomic analysis of viral data. It produces searchable hit tables, interactive population maps, alpha diversity measures, and clustered heatmaps grouped by the applicable custom sample categories. Known references, such as the human genome and bacterial ribosomal genes, are optionally removed from unmapped ('dark matter') reads. Secured results are accessible and shareable in modern browsers. Vipie is a freely available web-based tool whose code is open source.
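The alpha diversity measures mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the Shannon and Simpson indices as example measures (the abstract does not name the ones it uses), and the sample names, virus names, and read counts in the abundance table are hypothetical. It is not Vipie's own implementation.

```python
import math

def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson_index(counts):
    """Simpson diversity in the 1 - sum(p_i^2) form."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)

def alpha_diversity(table):
    """Compute both indices for every sample in an abundance table."""
    result = {}
    for sample, virus_counts in table.items():
        counts = list(virus_counts.values())
        result[sample] = {
            "shannon": shannon_index(counts),
            "simpson": simpson_index(counts),
        }
    return result

# Hypothetical abundance table: mapped read counts per virus, per sample.
abundance = {
    "sample_A": {"virus_1": 50, "virus_2": 50},
    "sample_B": {"virus_1": 100, "virus_2": 0},
}

diversity = alpha_diversity(abundance)
for sample, values in diversity.items():
    print(sample, values)
```

Both functions work on raw read counts, so samples sequenced to different depths are compared on relative abundances rather than absolute numbers. A sample dominated by a single virus scores zero on both indices.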
Fimlab Laboratories, Pirkanmaa Hospital District, Tampere, Finland
School of Social Sciences, University of Tampere, Kalevantie 4, 33100 Tampere, Finland