TY - JOUR
T1 - DTI-CDF
T2 - A cascade deep forest model towards the prediction of drug-target interactions based on hybrid features
AU - Chu, Yanyi
AU - Kaushik, Aman Chandra
AU - Wang, Xiangeng
AU - Wang, Wei
AU - Zhang, Yufang
AU - Shan, Xiaoqi
AU - Salahub, Dennis Russell
AU - Xiong, Yi
AU - Wei, Dong Qing
N1 - Publisher Copyright:
© 2019 The Author(s).
PY - 2021/1/1
Y1 - 2021/1/1
N2 - Drug-target interactions (DTIs) play a crucial role in target-based drug discovery and development. Computational prediction of DTIs can effectively complement experimental wet-lab techniques for the identification of DTIs, which are typically time- and resource-consuming. However, the performances of the current DTI prediction approaches suffer from a problem of low precision and high false-positive rate. In this study, we aim to develop a novel DTI prediction method for improving the prediction performance based on a cascade deep forest (CDF) model, named DTI-CDF, with multiple similarity-based features between drugs and the similarity-based features between target proteins extracted from the heterogeneous graph, which contains known DTIs. In the experiments, we built five replicates of 10-fold cross-validation under three different experimental settings of data sets, namely, corresponding DTI values of certain drugs (SD), targets (ST), or drug-target pairs (SP) in the training sets are missed but existed in the test sets. The experimental results demonstrate that our proposed approach DTI-CDF achieves a significantly higher performance than that of the traditional ensemble learning-based methods such as random forest and XGBoost, deep neural network, and the state-of-the-art methods such as DDR. Furthermore, there are 1352 newly predicted DTIs which are proved to be correct by KEGG and DrugBank databases. The data sets and source code are freely available at https://github.com//a96123155/DTI-CDF.
AB - Drug-target interactions (DTIs) play a crucial role in target-based drug discovery and development. Computational prediction of DTIs can effectively complement experimental wet-lab techniques for the identification of DTIs, which are typically time- and resource-consuming. However, the performances of the current DTI prediction approaches suffer from a problem of low precision and high false-positive rate. In this study, we aim to develop a novel DTI prediction method for improving the prediction performance based on a cascade deep forest (CDF) model, named DTI-CDF, with multiple similarity-based features between drugs and the similarity-based features between target proteins extracted from the heterogeneous graph, which contains known DTIs. In the experiments, we built five replicates of 10-fold cross-validation under three different experimental settings of data sets, namely, corresponding DTI values of certain drugs (SD), targets (ST), or drug-target pairs (SP) in the training sets are missed but existed in the test sets. The experimental results demonstrate that our proposed approach DTI-CDF achieves a significantly higher performance than that of the traditional ensemble learning-based methods such as random forest and XGBoost, deep neural network, and the state-of-the-art methods such as DDR. Furthermore, there are 1352 newly predicted DTIs which are proved to be correct by KEGG and DrugBank databases. The data sets and source code are freely available at https://github.com//a96123155/DTI-CDF.
KW - Drug-target interaction
KW - cascade deep forest
KW - ensemble learning
KW - machine learning
UR - http://www.scopus.com/inward/record.url?scp=85100280116&partnerID=8YFLogxK
U2 - 10.1093/bib/bbz152
DO - 10.1093/bib/bbz152
M3 - Article
C2 - 31885041
AN - SCOPUS:85100280116
SN - 1467-5463
VL - 22
SP - 451
EP - 462
JO - Briefings in Bioinformatics
JF - Briefings in Bioinformatics
IS - 1
ER -