TY - JOUR
T1 - A deep neural network approach for learning intrinsic protein-RNA binding preferences
AU - Ben-Bassat, Ilan
AU - Chor, Benny
AU - Orenstein, Yaron
N1 - Funding Information:
This work was supported by fellowships from the Edmond J. Safra Center for Bioinformatics at Tel-Aviv University, the Blavatnik Research Fund, and the Blavatnik Interdisciplinary Cyber Research Center at Tel-Aviv University.
Publisher Copyright:
© The Author(s) 2018. Published by Oxford University Press.
PY - 2018/9/1
Y1 - 2018/9/1
N2 - Motivation: The complexes formed by the binding of proteins to RNAs play key roles in many biological processes, such as splicing, gene expression regulation, translation and viral replication. Understanding protein-RNA binding may thus provide important insights into the functionality and dynamics of many cellular processes. This has sparked substantial interest in exploring protein-RNA binding experimentally and predicting it computationally. The key computational challenge is to efficiently and accurately infer protein-RNA binding models that enable prediction of novel protein-RNA interactions with additional transcripts of interest. Results: We developed DLPRB (Deep Learning for Protein-RNA Binding), a new deep neural network (DNN) approach for learning intrinsic protein-RNA binding preferences and predicting novel interactions. We present two different network architectures: a convolutional neural network (CNN) and a recurrent neural network (RNN). The novelty of our approach hinges upon two key aspects: (i) the joint analysis of both RNA sequence and structure, where structure is represented as a probability vector over different RNA structural contexts; (ii) novel features in the architecture of the networks, such as the application of RNNs to RNA-binding prediction and the combination of hundreds of variable-length filters in the CNN. In inferring accurate RNA-binding models from high-throughput in vitro data, our approach shows substantial improvements over all previous approaches for protein-RNA binding prediction (both DNN and non-DNN based). A more modest, yet statistically significant, improvement is achieved for in vivo binding prediction. When experimentally measured RNA structure is incorporated instead of predicted structure, the improvement on in vivo data increases. By visualizing the binding specificities, we can gain biological insights into the mechanism of protein-RNA binding.
Availability and implementation: The source code is publicly available at https://github.com/ilanbb/dlprb. Supplementary information: Supplementary data are available at Bioinformatics online.
UR - http://www.scopus.com/inward/record.url?scp=85054124932&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/bty600
DO - 10.1093/bioinformatics/bty600
M3 - Article
C2 - 30423078
AN - SCOPUS:85054124932
SN - 1367-4803
VL - 34
SP - i638-i646
JO - Bioinformatics
JF - Bioinformatics
IS - 17
ER -