• This record comes from PubMed

Detection of single nucleotide polymorphisms in virus genomes assembled from high-throughput sequencing data: large-scale performance testing of sequence analysis strategies

2023; 11: e15816. [Epub 2023-08-16]

Language English Country United States Media electronic-ecollection

Document type Journal Article, Research Support, Non-U.S. Gov't

Recent developments in high-throughput sequencing (HTS) technologies and bioinformatics have drastically changed research in virology, especially for virus discovery. Indeed, proper monitoring of the viral population requires information on the different isolates circulating in the studied area. For this purpose, HTS has greatly facilitated the sequencing of new genomes of detected viruses and their comparison. However, the bioinformatics analyses used to reconstruct genome sequences and detect single nucleotide polymorphisms (SNPs) can potentially introduce bias, an issue that has not been widely addressed so far. Therefore, more knowledge is required on the limitations of predicting SNPs based on HTS-generated sequence samples. To address this issue, we compared the ability of 14 plant virology laboratories, each employing a different bioinformatics pipeline, to detect 21 variants of pepino mosaic virus (PepMV) in three samples through large-scale performance testing (PT) using three artificially designed datasets. To evaluate the impact of the bioinformatics analyses, they were divided into three key steps: read pre-processing, virus-isolate identification, and variant calling. Each step was evaluated independently through an original PT design that included discussion and validation among participants at each step. Overall, this work underlines key parameters influencing SNP detection and proposes recommendations for reliable variant calling for plant viruses. Identification of the closest reference, mapping parameters, and manual validation of the detections were recognized as the factors with the greatest impact on successful SNP detection. Strategies to improve the prediction of SNPs are also discussed.
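As an illustration of how one pipeline's SNP calls could be scored against a known set of designed variants in a performance test of this kind, the following is a minimal sketch. It is not the scoring procedure used in the study: the `Snp` record, the `score_calls` function, the choice of sensitivity and precision as metrics, and the example positions are all assumptions made for illustration.

```python
from typing import NamedTuple


class Snp(NamedTuple):
    """Hypothetical SNP record (not a format taken from the study)."""
    position: int  # 1-based position on the reference genome
    alt: str       # alternative nucleotide


def score_calls(expected: set[Snp], called: set[Snp]) -> dict[str, float]:
    """Compare called SNPs with the designed (expected) SNPs."""
    tp = len(expected & called)   # designed variants that were detected
    fp = len(called - expected)   # calls that match no designed variant
    fn = len(expected - called)   # designed variants that were missed
    sensitivity = tp / (tp + fn) if expected else 0.0
    precision = tp / (tp + fp) if called else 0.0
    return {
        "tp": tp,
        "fp": fp,
        "fn": fn,
        "sensitivity": sensitivity,
        "precision": precision,
    }


# Made-up example: four designed variants, three calls from one pipeline.
expected = {Snp(10, "A"), Snp(20, "C"), Snp(30, "G"), Snp(40, "T")}
called = {Snp(10, "A"), Snp(20, "C"), Snp(55, "T")}
result = score_calls(expected, called)
print(result)
```

Matching on both position and alternative base means a call at the right position with the wrong nucleotide counts as a false positive and a missed variant, which is one possible convention among several.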

Biology Centre CAS Ceske Budejovice Czech Republic

Citrus Research International Matieland South Africa

Crop Research Institute Praha Czech Republic

Department of Biosystems Science and Engineering ETH Zurich Basel 4058 Switzerland

Department of Cell Biology and Genetics Faculty of Science Palacký University Olomouc Olomouc Czech Republic

Department of Chemistry and Biotechnology Tallinn University of Technology Tallinn Estonia

Department of Genetics Stellenbosch University Matieland South Africa

Department of Plant Protection Faculty of Agriculture Eskişehir Osmangazi University Eskişehir Turkey

Fisheries and Food Plant Sciences Unit Flanders Research Institute for Agriculture Merelbeke Belgium

GAFL Institut National de la Recherche pour l'Agriculture l'Alimentation et l'Environnement Montfavet France

Laboratory of Plant Pathology TERRA Gembloux Agro Bio Tech University of Liège Gembloux Belgium

Laboratory of Statistics Computer Science and Modelling Applied to Bioengineering TERRA Gembloux Agro Bio Tech Teaching and Research Centre University of Liège Gembloux Belgium

Mendeleum Institute of Genetics Faculty of Horticulture Mendel University in Brno Lednice Czech Republic

Natural Resources Institute Finland Helsinki Finland

Pathologie Végétale Institut National de la Recherche pour l'Agriculture l'Alimentation et l'Environnement Montfavet France

Plant Production and Technologies Department Ayhan Şahenk Faculty of Agricultural Science and Technologies Niğde Ömer Halisdemir University Niğde Turkey

Plant Protection Department Agricultural Faculty Hatay Mustafa Kemal University Hatay Turkey

Plant Protection Department Agroscope Nyon Switzerland

Plant Protection Department Faculty of Agriculture University of Maragheh Maragheh Iran

Swiss Institute of Bioinformatics Basel Switzerland

Wageningen University and Research Wageningen The Netherlands

