Galaxy QCxMS for straightforward semi-empirical quantum mechanical EI-MS prediction
Status PubMed-not-MEDLINE
Language English
Country China
Media electronic-ecollection
Document type Journal Article
PubMed
40661852
PubMed Central
PMC12257954
DOI
10.46471/gigabyte.160
PII: 160
High-performance computing (HPC) environments are crucial for computational research, including quantum chemistry (QC), but they pose challenges for non-expert users. Researchers with limited computational knowledge struggle to use domain-specific software and to access mass spectra prediction for in silico annotation. Here, we provide a robust workflow that uses interoperable file formats for molecular structures to ensure integration across QC tools. QCxMS, the quantum chemistry package for mass spectral predictions after electron ionization (EI) or collision-induced dissociation, has been integrated into the Galaxy platform, enabling automated analysis of fragmentation mechanisms. The extended tight binding (xTB) quantum chemistry package, chosen for its balance between accuracy and computational efficiency, provides molecular geometry optimisation. A Docker image encapsulates the necessary software stack. We demonstrated the workflow on four molecules and assessed its scalability and efficiency through runtime performance analysis. This work shows how users without HPC experience can make these predictions using advanced computational tools, without needing in-depth expertise.