TY - GEN
T1 - rG4detector
T2 - 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, BCB 2022
AU - Turner, Maor
AU - Barshai, Mira
AU - Orenstein, Yaron
N1 - Funding Information:
This research was partially supported by the Israel Cancer Association (grant no. 20221519) and by the Israeli Council for Higher Education (CHE) via Data Science Research Center, Ben-Gurion University of the Negev, Israel.
Publisher Copyright:
© 2022 Owner/Author.
PY - 2022/8/7
Y1 - 2022/8/7
N2 - RNA G-quadruplexes (rG4s) are RNA secondary structures, which are formed by guanine-rich sequences and have important cellular functions. Thus, researchers would like to know where and when rG4s are formed throughout the transcriptome. Measuring rG4s experimentally is a long and laborious process, and hence researchers often rely on computational methods to predict the rG4 propensity of a given RNA sequence. However, existing computational methods for rG4 propensity prediction are sub-optimal since they rely on specific sequence features and/or were trained on small datasets and without considering rG4 stability information. Here, we developed rG4detector, a convolutional neural network to predict the rG4 propensity of any given RNA sequence. We demonstrated that rG4detector outperforms existing methods over various transcriptomic datasets. In addition, we used rG4detector to detect potential rG4s in transcriptomic data, and showed that it improves detection performance compared to existing methods. Last, we interrogated rG4detector for the important features it learned and discovered known and novel molecular principles behind rG4 formation. We expect rG4detector to advance future rG4 research by accurate detection and propensity prediction of rG4s. The code, trained models, and processed datasets are publicly available via github.com/OrensteinLab/rG4detector.
AB - RNA G-quadruplexes (rG4s) are RNA secondary structures, which are formed by guanine-rich sequences and have important cellular functions. Thus, researchers would like to know where and when rG4s are formed throughout the transcriptome. Measuring rG4s experimentally is a long and laborious process, and hence researchers often rely on computational methods to predict the rG4 propensity of a given RNA sequence. However, existing computational methods for rG4 propensity prediction are sub-optimal since they rely on specific sequence features and/or were trained on small datasets and without considering rG4 stability information. Here, we developed rG4detector, a convolutional neural network to predict the rG4 propensity of any given RNA sequence. We demonstrated that rG4detector outperforms existing methods over various transcriptomic datasets. In addition, we used rG4detector to detect potential rG4s in transcriptomic data, and showed that it improves detection performance compared to existing methods. Last, we interrogated rG4detector for the important features it learned and discovered known and novel molecular principles behind rG4 formation. We expect rG4detector to advance future rG4 research by accurate detection and propensity prediction of rG4s. The code, trained models, and processed datasets are publicly available via github.com/OrensteinLab/rG4detector.
KW - Deep neural networks
KW - RNA G-quadruplex
UR - http://www.scopus.com/inward/record.url?scp=85137324238&partnerID=8YFLogxK
U2 - 10.1145/3535508.3545534
DO - 10.1145/3535508.3545534
M3 - Conference contribution
AN - SCOPUS:85137324238
T3 - Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, BCB 2022
BT - Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, BCB 2022
PB - Association for Computing Machinery, Inc
Y2 - 7 August 2022 through 8 August 2022
ER -