Reproducible mass spectrometry data processing and compound annotation in MZmine 3

2024 Sep; 19(9): 2597-2641. [Epub 2024 May 20]

Language: English. Country: England, United Kingdom. Medium: print-electronic

Document type: journal articles, reviews

Persistent link: https://www.medvik.cz/link/pmid38769143

Grant support
R01DK136117 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
U01CA235507 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
891397 EC | Horizon 2020 Framework Programme (EU Framework Programme for Research and Innovation H2020)
R03 CA222450 NCI NIH HHS - United States
R03OD034493 U.S. Department of Health & Human Services | National Institutes of Health (NIH)
2152526 National Science Foundation (NSF)

Links

PubMed 38769143
DOI 10.1038/s41596-024-00996-y
PII: 10.1038/s41596-024-00996-y
Knihovny.cz E-resources

Untargeted mass spectrometry (MS) experiments produce complex, multidimensional data that are practically impossible to investigate manually. Computational pipelines are therefore needed to extract relevant information from raw spectral data and convert it into a more comprehensible format. Depending on the sample type and/or the goal of the study, a variety of MS platforms can be used for such analysis. MZmine is open-source software for processing raw spectral data generated by different MS platforms, including liquid chromatography-MS, gas chromatography-MS and MS imaging. These data are typically associated with applications such as metabolomics and lipidomics. Moreover, the third version of the software, described herein, supports the processing of ion mobility spectrometry (IMS) data. The present protocol provides three distinct procedures for feature detection and annotation of untargeted MS data produced by different instrumental setups: liquid chromatography-(IMS-)MS, gas chromatography-MS and (IMS-)MS imaging. For training purposes, example datasets are provided together with configuration batch files (i.e., lists of processing steps and parameters) so that new users can easily replicate the described workflows. Depending on the number of data files and the available computing resources, we anticipate that replicating a workflow will take between 2 and 24 h for new MZmine users and nonexperts. Within each procedure, we provide a detailed description of all processing parameters together with instructions and recommendations for their optimization. The main outputs are aligned feature tables and fragmentation spectra lists that third-party tools can use for further downstream analysis.
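
To make the notion of an aligned feature table more concrete, the following minimal Python sketch groups features from several samples into shared rows when their m/z and retention time agree within given tolerances. It is an illustration only and not MZmine's alignment algorithm: the names Feature, AlignedRow and align, the tolerance defaults, and the example values are all hypothetical assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    mz: float      # mass-to-charge ratio
    rt: float      # retention time in minutes
    height: float  # peak height (arbitrary units)


@dataclass
class AlignedRow:
    mz: float
    rt: float
    heights: dict = field(default_factory=dict)  # sample name -> peak height


def align(samples, mz_tol_ppm=10.0, rt_tol_min=0.1):
    """Greedy alignment (illustrative assumption, not MZmine's method).

    Each feature joins the first existing row that lies within both
    tolerances and has no entry for that sample yet; otherwise it
    starts a new row.
    """
    rows = []
    for sample_name, features in samples.items():
        # Process the most intense features of each sample first.
        for f in sorted(features, key=lambda x: -x.height):
            for row in rows:
                mz_ok = abs(f.mz - row.mz) <= row.mz * mz_tol_ppm * 1e-6
                rt_ok = abs(f.rt - row.rt) <= rt_tol_min
                if mz_ok and rt_ok and sample_name not in row.heights:
                    row.heights[sample_name] = f.height
                    break
            else:
                rows.append(AlignedRow(f.mz, f.rt, {sample_name: f.height}))
    return rows


# Hypothetical example data: two samples with two features each.
samples = {
    "sample_A": [Feature(181.0707, 2.31, 5.0e5), Feature(303.2319, 7.80, 1.2e5)],
    "sample_B": [Feature(181.0709, 2.35, 4.1e5), Feature(255.2330, 9.10, 8.0e4)],
}
table = align(samples)
for row in table:
    print(f"{row.mz:.4f}\t{row.rt:.2f}\t{row.heights}")
```

In this sketch, the two features near m/z 181.07 fall within both tolerances and share one row, while the remaining features each form a row with a value for only one sample. Such gaps are what downstream tools typically have to handle when they consume a feature table.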
