• This record comes from PubMed

A universal language for finding mass spectrometry data patterns

2025 Jun; 22(6): 1247-1254. [epub] 20250512

Language: English; Country: United States; Media: print-electronic

Document type: Journal Article

Links

PubMed 40355727
DOI 10.1038/s41592-025-02660-z
PII: 10.1038/s41592-025-02660-z

Despite being information rich, the vast majority of untargeted mass spectrometry data are underutilized; most analytes are not used for downstream interpretation or reanalysis after publication. The inability to dive into these rich raw mass spectrometry datasets is due to the limited flexibility and scalability of existing software tools. Here we introduce a new language, the Mass Spectrometry Query Language (MassQL), and an accompanying software ecosystem that addresses these issues by enabling the community to directly query mass spectrometry data with an expressive set of user-defined mass spectrometry patterns. Illustrated by real-world examples, MassQL provides a data-driven definition of chemical diversity by enabling the reanalysis of all public untargeted metabolomics data, empowering scientists across many disciplines to make new discoveries. MassQL has been widely implemented in multiple open-source and commercial mass spectrometry analysis tools, which enhances the ability, interoperability and reproducibility of mining of mass spectrometry data for the research community.
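To make the idea of a "user-defined mass spectrometry pattern" concrete, the following is a minimal Python sketch of one such pattern: find every MS/MS scan that contains a product ion near a given m/z, within a tolerance and above a relative intensity floor. It is a toy illustration only and is not the MassQL engine or its API. The `Spectrum` class, the `find_ms2_with_product_ion` function, the toy data and the default thresholds are all hypothetical names and values chosen for this example. In MassQL itself, a comparable pattern would be written as a one-line query of roughly the form `QUERY scaninfo(MS2DATA) WHERE MS2PROD=226.18:TOLERANCEMZ=0.01`; that syntax is recalled from the public MassQL documentation and should be verified there before use.

```python
from dataclasses import dataclass


@dataclass
class Spectrum:
    """Hypothetical in-memory spectrum; real data would come from a file such as mzML."""
    scan: int
    ms_level: int
    precursor_mz: float
    peaks: list  # list of (mz, intensity) tuples


def find_ms2_with_product_ion(spectra, product_mz, tol_mz=0.01, min_rel_intensity=0.05):
    """Return scan numbers of MS2 spectra with a peak within tol_mz of product_mz
    whose intensity is at least min_rel_intensity of that spectrum's base peak."""
    hits = []
    for s in spectra:
        if s.ms_level != 2 or not s.peaks:
            continue
        base_peak = max(intensity for _, intensity in s.peaks)
        for mz, intensity in s.peaks:
            close_enough = abs(mz - product_mz) <= tol_mz
            intense_enough = intensity >= min_rel_intensity * base_peak
            if close_enough and intense_enough:
                hits.append(s.scan)
                break  # one matching peak is enough for this scan
    return hits


# Toy data (made up for illustration, not taken from the article).
toy_spectra = [
    Spectrum(1, 1, 0.0, [(300.2, 1000.0)]),                      # MS1 scan, ignored
    Spectrum(2, 2, 450.30, [(226.18, 800.0), (120.08, 1000.0)]),  # matches
    Spectrum(3, 2, 512.25, [(226.30, 900.0), (98.98, 500.0)]),    # m/z outside tolerance
    Spectrum(4, 2, 389.10, [(226.185, 10.0), (150.0, 1000.0)]),   # peak too weak
]

matching_scans = find_ms2_with_product_ion(toy_spectra, 226.18)
print(matching_scans)
```

A query language expresses such a filter declaratively, so the same pattern can be shared between tools and applied to many files without rewriting loops like the one above.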

Applied Bioinformatics Department of Computer Science University of Tuebingen Tübingen Germany

Applied Bioinformatics Department of Computer Science University of Tuebingen; Institute for Bioinformatics and Medical Informatics University of Tuebingen; Institute for Translational Bioinformatics University Hospital Tuebingen Tübingen Germany

Bioinformatics Group Wageningen University and Research Wageningen the Netherlands

Biologicals and Natural Products Crop Protection R and D Corteva Agrisciences Indianapolis IN USA

Biologicals and Natural Products Discovery Crop Protection R and D Corteva Agrisciences Indianapolis IN USA

BioMolecular Sciences School of Pharmacy University of Mississippi Oxford MS USA

Center for Urban Waters University of Washington Tacoma WA USA

Chemistry and Chemical Biology Northeastern University Boston MA USA

Clinical Biomarkers Laboratory School of Medicine Emory University Atlanta GA USA

Collaborative Mass Spectrometry Innovation Center Skaggs School of Pharmacy and Pharmaceutical Sciences University of California San Diego La Jolla CA USA

College of Pharmacy and Integrated Research Institute for Drug Development Dongguk University Seoul Goyang Republic of Korea

College of Pharmacy Sookmyung Women's University Seoul Republic of Korea

College of Pharmacy University of Rhode Island Kingston RI USA

Crop Protection R and D Corteva Agrisciences Indianapolis IN USA

Data Science and Bioinformatics Corteva Agrisciences Dublin OH USA

Department of Biochemistry University of California Riverside Riverside CA USA

Department of Biochemistry University of Johannesburg Johannesburg South Africa

Department of Bioengineering University of California San Diego La Jolla CA USA

Department of BioMolecular Sciences School of Pharmacy University of Mississippi Oxford MS USA

Department of Biotechnology and Biomedicine Technical University of Denmark Kongens Lyngby Denmark

Department of Biotechnology and Life Science Tokyo University of Agriculture and Technology Koganei Japan

Department of Chemistry and Biochemistry San Diego State University San Diego CA USA

Department of Chemistry and Biochemistry UC Santa Cruz Santa Cruz CA USA

Department of Chemistry and Biochemistry University of Arizona Tucson AZ USA

Department of Chemistry and Biochemistry University of Denver Denver CO USA

Department of Chemistry BMC Science for Life Laboratory Uppsala University Uppsala Sweden

Department of Chemistry Case Western Reserve University Cleveland OH USA

Department of Computer Science University of California Riverside Riverside CA USA

Department of Fundamental Chemistry Institute of Chemistry University of São Paulo São Paulo Brazil

Department of Marine Biology The Leon H Charney School of Marine Sciences University of Haifa Haifa Israel

Department of Medicinal Chemistry College of Pharmacy University of Michigan Ann Arbor MI USA

Department of Pharmaceutical Sciences Skaggs School of Pharmacy and Pharmaceutical Sciences University of Colorado Anschutz Medical Campus Aurora CO USA

Department of Pharmacy University of Marburg Marburg Germany

Environmental Genomics and Systems Biology Division Lawrence Berkeley National Lab Berkeley CA USA

Faculty of Chemistry Institute of Exact and Natural Science Federal University of Para Belem Brazil

Functional Metabolomics Lab CMFI Cluster of Excellence University of Tuebingen Tuebingen Germany

Institute for Biomedicine Eurac Research Bolzano Italy

Institute of Inorganic and Analytical Chemistry University of Münster Münster Germany

Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences Prague Czech Republic

Institute of Pharmaceutical Biology Goethe University Frankfurt Frankfurt Germany

Institute of Pharmaceutical Biology University of Bonn Bonn Germany

Institute of Pharmacy Freie Universität Berlin Berlin Germany

Laboratory of Physical and Chemical Methods of Research Center for Advanced Technologies Tashkent Uzbekistan

Mass Spectrometry Center of Expertise Regulatory and Stewardship Corteva Agrisciences Indianapolis IN USA

Metabolomics Core Facility Immunity Inflammation and Disease Laboratory Division of Intramural Research National Institute of Environmental Health Sciences National Institutes of Health Research Triangle Park NC USA

Natural Products Discovery Core Life Sciences Institute University of Michigan Ann Arbor MI USA

Pharmacognosy Department Faculty of Pharmacy Cairo University Cairo Egypt

Pharmacognosy Faculty of Pharmacy Al Azhar University Nasr City Egypt

RIKEN Center for Integrative Medical Sciences Tsurumi ku Japan

RIKEN Center for Sustainable Resource Science Tsurumi ku Japan

School of Chemistry and Biochemistry Center for Microbial Dynamics and Infection Georgia Institute of Technology Atlanta GA USA

School of Chemistry and Biochemistry Georgia Institute of Technology Atlanta GA USA

Scripps Institution of Oceanography and Skaggs School of Pharmacy and Pharmaceutical Sciences University of California San Diego La Jolla CA USA

SW R and D Bioinformatics Life Science Mass Spectrometry Bruker Daltonics GmbH and Co KG Bremen Germany

The Joint Genome Institute Lawrence Berkeley National Lab Berkeley CA USA

The Novo Nordisk Foundation Center for Biosustainability Technical University of Denmark Kongens Lyngby Denmark

Walter Mors Institute of Research on Natural Products Federal University of Rio de Janeiro Rio de Janeiro Brazil

West Coast Metabolomics Center University of California Davis Davis CA USA

