Abstract
We engineered a machine learning approach, MSHub, to enable auto-deconvolution of gas chromatography–mass spectrometry (GC–MS) data. We then designed workflows to enable the community to store, process, share, annotate, compare and perform molecular networking of GC–MS data within the Global Natural Product Social (GNPS) Molecular Networking analysis platform. MSHub/GNPS performs auto-deconvolution of compound fragmentation patterns via unsupervised non-negative matrix factorization and quantifies the reproducibility of fragmentation patterns across samples.
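As a rough illustration of the factorization idea named in the abstract, the sketch below applies non-negative matrix factorization to a toy scans-by-m/z intensity matrix built from two co-eluting compounds. It is not the MSHub implementation: the update rule is the standard Lee-Seung multiplicative update for the Frobenius norm, and every name here (`nmf`, `frobenius_error`, the toy `elution` and `spectra` values) is an assumption made for the example.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, rank, iters=500, seed=0, eps=1e-9):
    """Factor non-negative v (scans x m/z) into w (scans x rank) and h (rank x m/z).

    Uses Lee-Seung multiplicative updates, which keep w and h non-negative.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # Update h: h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # Update w: w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def frobenius_error(v, w, h):
    """Frobenius norm of the residual v - w h."""
    wh = matmul(w, h)
    return sum((a - b) ** 2 for rv, rw in zip(v, wh) for a, b in zip(rv, rw)) ** 0.5

# Toy data (hypothetical): two overlapping elution profiles over 6 scans,
# each compound with its own fragmentation pattern over 5 m/z channels.
elution = [[0, 1, 3, 1, 0, 0],
           [0, 0, 1, 3, 1, 0]]
spectra = [[10, 0, 5, 0, 1],
           [0, 8, 0, 4, 1]]
v = matmul(transpose(elution), spectra)  # observed mixed signal, 6 x 5

w, h = nmf(v, rank=2)
print("residual:", frobenius_error(v, w, h))
for component in h:
    print([round(x, 2) for x in component])
```

In this sketch each row of `h` plays the role of a recovered fragmentation pattern and each column of `w` the matching elution profile. Factors are only determined up to scaling and ordering, so the rows of `h` should be compared with `spectra` by shape rather than by absolute value.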
Original language | English |
---|---|
Pages (from-to) | 169-173 |
Number of pages | 5 |
Journal | Nature Biotechnology |
Volume | 39 |
Issue number | 2 |
DOIs | 10.1038/s41587-020-0700-3 |
State | Published - 1 Feb 2021 |
ASJC Scopus subject areas
- Biotechnology
- Bioengineering
- Biomedical Engineering
- Applied Microbiology and Biotechnology
- Molecular Medicine
Auto-deconvolution and molecular networking of gas chromatography–mass spectrometry data. / Aksenov, Alexander A.; Laponogov, Ivan; Zhang, Zheng et al.
In: Nature Biotechnology, Vol. 39, No. 2, 01.02.2021, p. 169-173.
Research output: Contribution to journal › Article › peer-review
TY - JOUR
T1 - Auto-deconvolution and molecular networking of gas chromatography–mass spectrometry data
AU - Aksenov, Alexander A.
AU - Laponogov, Ivan
AU - Zhang, Zheng
AU - Doran, Sophie L.F.
AU - Belluomo, Ilaria
AU - Veselkov, Dennis
AU - Bittremieux, Wout
AU - Nothias, Louis Felix
AU - Nothias-Esposito, Mélissa
AU - Maloney, Katherine N.
AU - Misra, Biswapriya B.
AU - Melnik, Alexey V.
AU - Smirnov, Aleksandr
AU - Du, Xiuxia
AU - Jones, Kenneth L.
AU - Dorrestein, Kathleen
AU - Panitchpakdi, Morgan
AU - Ernst, Madeleine
AU - van der Hooft, Justin J.J.
AU - Gonzalez, Mabel
AU - Carazzone, Chiara
AU - Amézquita, Adolfo
AU - Callewaert, Chris
AU - Morton, James T.
AU - Quinn, Robert A.
AU - Bouslimani, Amina
AU - Orio, Andrea Albarracín
AU - Petras, Daniel
AU - Smania, Andrea M.
AU - Couvillion, Sneha P.
AU - Burnet, Meagan C.
AU - Nicora, Carrie D.
AU - Zink, Erika
AU - Metz, Thomas O.
AU - Artaev, Viatcheslav
AU - Humston-Fulmer, Elizabeth
AU - Gregor, Rachel
AU - Meijler, Michael M.
AU - Mizrahi, Itzhak
AU - Eyal, Stav
AU - Anderson, Brooke
AU - Dutton, Rachel
AU - Lugan, Raphaël
AU - Boulch, Pauline Le
AU - Guitton, Yann
AU - Prevost, Stephanie
AU - Poirier, Audrey
AU - Dervilly, Gaud
AU - Le Bizec, Bruno
AU - Fait, Aaron
AU - Persi, Noga Sikron
AU - Song, Chao
AU - Gashu, Kelem
AU - Coras, Roxana
AU - Guma, Monica
AU - Manasson, Julia
AU - Scher, Jose U.
AU - Barupal, Dinesh Kumar
AU - Alseekh, Saleh
AU - Fernie, Alisdair R.
AU - Mirnezami, Reza
AU - Vasiliou, Vasilis
AU - Schmid, Robin
AU - Borisov, Roman S.
AU - Kulikova, Larisa N.
AU - Knight, Rob
AU - Wang, Mingxun
AU - Hanna, George B.
AU - Dorrestein, Pieter C.
AU - Veselkov, Kirill
N1 - Funding Information: The conversion of the data from different repositories was supported by grant R03 CA211211 on reuse of metabolomics data, to build enabling chemical analysis tools for the ocean symbiosis program, and the development of a user-friendly interface for GC–MS analysis was supported by the Gordon and Betty Moore Foundation through grant GBMF7622. The University of California, San Diego Center for Microbiome Innovation supported the campus-wide seed grant awards for data collection that enabled the development of some of this infrastructure. P.C.D. was supported by the National Science Foundation (grant no. IOS-1656475) and the National Institutes of Health (NIH) (grant nos. U19 AG063744 01, P41 GM103484, R03 CA211211 and R01 GM107550). K.V. and I.L. are very grateful for the support of the Vodafone Foundation as part of the DRUGS/DreamLab project. The MSHub platform development was supported by NIH/NIAAA grant (R21 AA028432) on integrated machine learning for mass spectrometry data in liver disease, Intelligify Limited and Vodafone Foundation’s DRUGS/CORONA-AI projects on network machine learning for drug repositioning and discovery of hyperfoods with antiviral/anticancer molecules. M.E. was supported by the University of Corsica. L.F.N. was supported by the NIH (R01 GM107550) and the European Union’s Horizon 2020 Research and Innovation Programme (MSCA-GF, 704786). A.B. was supported by the National Institute of Justice Award (2015-DN-BX-K047). Additional support for data acquisition and data storage was provided by the Center for Computational Mass Spectrometry (P41 GM103484). The collection of data from the HomeChem Project was supported by the Sloan Foundation. G.B.H., S.L.F.D., I.L., K.V. and I.B. are grateful for the support of the OG cancer breath analysis study by the National Institute for Health Research London Invitro Diagnostic Co-operative and the NIHR Imperial Biomedical Research Centre, the Rosetrees and Stonegate Trusts and the Imperial College Charity. D.V. acknowledges support from ERC-Consolidator grant 724228 (LEMAN). I.B. acknowledges the contribution of Q. Wen and M. Colavita in the production of the training video. C. Callewaert was supported by the Research Foundation Flanders, with support from the industrial research fund of Ghent University. W.B. was supported by the Research Foundation Flanders. A.A.O. acknowledges the support of the Fulbright Commission and Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET-Argentina). The work of R.L. and P.L.B. on the data set 30 was supported by Metaboscope, part of the ‘Platform 3A’, funded by the European Regional Development Fund, the French Ministry of Research, Higher Education and Innovation, the Provence-Alpes-Côte d’Azur region, the Departmental Council of Vaucluse and the Urban Community of Avignon. S.A. and A.R.F. acknowledge the PlantaSYST project by the European Union’s Horizon 2020 Research and Innovation Programme (SGA-CSA nos. 664621 and 739582 under FPA no. 664620). V.V. acknowledges support from the National Institute on Alcohol Abuse and Alcoholism award R24AA022057. M. Guma and R.C. acknowledge the support of the Krupp Endowed Fund grant. A portion of mass spectra in the public reference library was produced within the framework of the State Task for the Topchiev Institute of Petrochemical Synthesis RAS and with the support of the RUDN University Program 5-100. R.S.B. acknowledges support of the State Task for the Topchiev Institute of Petrochemical Synthesis RAS. L.N.K. acknowledges support of the RUDN University Program 5-100. I.M. acknowledges support of the Israel Science Foundation (project no. 1947/19) and European Research Council under the European Union’s Horizon 2020 Research and Innovation Programme (project no. 640384). J.S. has been supported by NIH/NIAMS R03AR072182, the Colton Center for Autoimmunity, the Rheumatology Research Foundation, the Riley Family Foundation and the Snyder Family Foundation. J. Manasson acknowledges support from the 2017 Group for Research and Assessment of Psoriasis and Psoriatic Arthritis Pilot Research Grant and NIH/NIAMS T32AR069515. R.G. is grateful to the Azrieli Foundation for the award of an Azrieli Fellowship. J.J.J.v.d.H. acknowledges support from an ASDI eScience grant (ASDI.2017.030) from the Netherlands eScience Center-NLeSC. B.A. was supported by the National Science Foundation through the Graduate Research Fellowship Program. GC–MS analyses for collection of the MSV000083743 data set were supported by the Pacific Northwest National Laboratory, Laboratory-Directed Research and Development Program, and were contributed by the Microbiomes in Transition Initiative; data were collected in the Environmental Molecular Sciences Laboratory, a national scientific user facility sponsored by the Department of Energy (DOE) Office of Biological and Environmental Research and located at the Pacific Northwest National Laboratory (PNNL). PNNL is operated by the Battelle Memorial Institute for the DOE under contract DEAC05-76RLO1830. M. Guma and R.C. acknowledge the support of the Krupp Endowed Fund grant. R.C. was also funded by T32AR064194-07. The authors are grateful to R. da Silva for his contribution to developing the first prototype of the EI data network and his continuous assistance with further development and testing of the infrastructure. The authors are also grateful to M. Vance and D. Farmer, who organized the sampling for HomeChem Indoor Chemistry Project (https://indoorchem.org/projects/homechem/) that allowed the collection of samples for the MSV000083598 data set. B. Ross has assisted with collecting data for the MSV000084348 data set. GC–MS analyses for collection of the MSV000084211 and MSV000084212 data sets were supported by N757 Doctorados Nacionales and project EXT-2016-69-1713 from the Departamento Administrativo de Ciencia, Tecnología e Innovación (COLCIENCIAS), the seed project INV-2019-67-1747 and the FAPA project of Chiara Carazzone from the Faculty of Science at Universidad de los Andes and the grant FP80740-064-2016 of COLCIENCIAS. The authors are grateful to L. M. Garzón, P. Palacios, M. Gonzalez and J. Hernandez for their contributions to collecting the samples and to J. Oswaldo Turizo for designing and manufacturing the amphibian electrical stimulator. A.S. and X.D. acknowledge support from National Cancer Institute award U01CA235507. The authors are grateful to S. Neuman for feedback regarding the XCMS deconvolution tool. Publisher Copyright: © 2020, The Author(s), under exclusive licence to Springer Nature America, Inc.
PY - 2021/2/1
Y1 - 2021/2/1
N2 - We engineered a machine learning approach, MSHub, to enable auto-deconvolution of gas chromatography–mass spectrometry (GC–MS) data. We then designed workflows to enable the community to store, process, share, annotate, compare and perform molecular networking of GC–MS data within the Global Natural Product Social (GNPS) Molecular Networking analysis platform. MSHub/GNPS performs auto-deconvolution of compound fragmentation patterns via unsupervised non-negative matrix factorization and quantifies the reproducibility of fragmentation patterns across samples.
UR - http://www.scopus.com/inward/record.url?scp=85095678581&partnerID=8YFLogxK
U2 - 10.1038/s41587-020-0700-3
DO - 10.1038/s41587-020-0700-3
M3 - Article
C2 - 33169034
AN - SCOPUS:85095678581
SN - 1087-0156
VL - 39
SP - 169
EP - 173
JO - Nature Biotechnology
JF - Nature Biotechnology
IS - 2
ER -