Advancing microbiome research with machine learning: key findings from the ML4Microbiome COST action
Status: PubMed-not-MEDLINE; Language: English; Country: Switzerland; Medium: electronic-ecollection
Document type: journal articles
PubMed
37808321
PubMed Central
PMC10558209
DOI
10.3389/fmicb.2023.1257002
- Keywords
- artificial intelligence, best practices, machine learning, microbiome, standards
- Publication type
- journal articles
The rapid development of machine learning (ML) techniques has opened up the data-dense field of microbiome research to novel therapeutic, diagnostic, and prognostic applications targeting a wide range of disorders, which could substantially improve healthcare practices in the era of precision medicine. However, several challenges must be addressed to fully exploit the benefits of ML in this field. In particular, there is a need to establish "gold standard" protocols for conducting ML analysis experiments and to improve interactions between microbiome researchers and ML experts. The Machine Learning Techniques in Human Microbiome Studies (ML4Microbiome) COST Action CA18131 is a European network established in 2019 to promote collaboration between discovery-oriented microbiome researchers and data-driven ML experts in order to optimize and standardize ML approaches for microbiome analysis. This perspective paper presents the key achievements of ML4Microbiome, which include identifying predictive and discriminatory "omics" features, improving repeatability and comparability, developing automation procedures, and defining priority areas for the development of novel ML methods targeting the microbiome. The insights gained from ML4Microbiome will help to maximize the potential of ML in microbiome research and pave the way for new and improved healthcare practices.
Biome Diagnostics GmbH Vienna Austria
BioSense Institute University of Novi Sad Novi Sad Serbia
Center for Mathematics and Applications NOVA School of Science and Technology Caparica Portugal
Chemistry and Pharmacy Department University of Sofia Sofia Bulgaria
Computational Biology International Centre for Genetic Engineering and Biotechnology Trieste Italy
Department of Biology University of Tirana Tirana Albania
Department of Clinical Science University of Bergen Bergen Norway
Department of Computer Networks and Systems Silesian University of Technology Gliwice Poland
Department of Computer Science University of Bari Aldo Moro Bari Italy
Department of Computer Science University of Crete Heraklion Greece
Department of Computing University of Turku Turku Finland
Department of Ecology Universität Innsbruck Innsbruck Austria
Department of Electrical and Electronic Engineering University College Cork Cork Ireland
Department of Microbiology Universität Innsbruck Innsbruck Austria
Faculty of Technical Sciences University of Novi Sad Novi Sad Serbia
Finnish Institute for Health and Welfare Helsinki Finland
Institute for Molecular Medicine Finland FIMM HiLIFE Helsinki Finland
Institute of Molecular and Cell Biology University of Tartu Tartu Estonia
Institute of Molecular Biology Slovak Academy of Sciences Bratislava Slovakia
Institute of Science and Technology Austria Klosterneuburg Austria
JADBio Gnosis DA S A Science and Technology Park of Crete Heraklion Greece
Nicolaus Copernicus University Torun Torun Poland
School of Microbiology and APC Microbiome Ireland University College Cork Cork Ireland
Ss Cyril and Methodius University Skopje North Macedonia
Systems Engineering Department Kharkiv National University of Radio Electronics Kharkiv Ukraine
Université Paris Saclay INRAE MetaGenoPolis Jouy en Josas France
University of Bergen Bergen Norway
University of Fribourg and Swiss Institute of Bioinformatics Fribourg Switzerland