TY - JOUR
T1 - Combined network analysis and machine learning allows the prediction of metabolic pathways from tomato metabolomics data
AU - Toubiana, David
AU - Puzis, Rami
AU - Wen, Lingling
AU - Sikron, Noga
AU - Kurmanbayeva, Assylay
AU - Soltabayeva, Aigerim
AU - del Mar Rubio Wilhelmi, Maria
AU - Sade, Nir
AU - Fait, Aaron
AU - Sagi, Moshe
AU - Blumwald, Eduardo
AU - Elovici, Yuval
N1 - Publisher Copyright:
© 2019, The Author(s).
PY - 2019/12/1
Y1 - 2019/12/1
N2 - The identification and understanding of metabolic pathways is a key aspect in crop improvement and drug design. The common approach for their detection is based on gene annotation and ontology. Correlation-based network analysis, where metabolites are arranged into network formation, is used as a complementary tool. Here, we demonstrate the detection of metabolic pathways based on correlation-based network analysis combined with machine-learning techniques. Metabolites of known tomato pathways, non-tomato pathways, and random sets of metabolites were mapped as subgraphs onto metabolite correlation networks of the tomato pericarp. Network features were computed for each subgraph, generating a machine-learning model. The model predicted the presence of the β-alanine-degradation-I, tryptophan-degradation-VII-via-indole-3-pyruvate (yet unknown to plants), the β-alanine-biosynthesis-III, and the melibiose-degradation pathway, although melibiose was not part of the networks. In vivo assays validated the presence of the melibiose-degradation pathway. For the remaining pathways only some of the genes encoding regulatory enzymes were detected.
AB - The identification and understanding of metabolic pathways is a key aspect in crop improvement and drug design. The common approach for their detection is based on gene annotation and ontology. Correlation-based network analysis, where metabolites are arranged into network formation, is used as a complementary tool. Here, we demonstrate the detection of metabolic pathways based on correlation-based network analysis combined with machine-learning techniques. Metabolites of known tomato pathways, non-tomato pathways, and random sets of metabolites were mapped as subgraphs onto metabolite correlation networks of the tomato pericarp. Network features were computed for each subgraph, generating a machine-learning model. The model predicted the presence of the β-alanine-degradation-I, tryptophan-degradation-VII-via-indole-3-pyruvate (yet unknown to plants), the β-alanine-biosynthesis-III, and the melibiose-degradation pathway, although melibiose was not part of the networks. In vivo assays validated the presence of the melibiose-degradation pathway. For the remaining pathways only some of the genes encoding regulatory enzymes were detected.
UR - http://www.scopus.com/inward/record.url?scp=85071177092&partnerID=8YFLogxK
U2 - 10.1038/s42003-019-0440-4
DO - 10.1038/s42003-019-0440-4
M3 - Article
AN - SCOPUS:85071177092
SN - 2399-3642
VL - 2
JO - Communications Biology
JF - Communications Biology
IS - 1
M1 - 214
ER -