Stochastic sampling of structural contexts improves the scalability and accuracy of RNA 3D module identification

Roman Sarrazin-Gendron, Hua Ting Yao, Vladimir Reinharz, Carlos G. Oliver, Yann Ponty, Jérôme Waldispühl

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

5 Scopus citations

Abstract

RNA structures possess multiple levels of structural organization. Secondary structures are made of canonical (i.e. Watson-Crick and wobble) helices, connected by loops whose local conformations are critical determinants of global 3D architectures. Such local 3D structures consist of conserved sets of non-canonical base pairs, called RNA modules. Their prediction from sequence data is thus a milestone toward 3D structure modelling. Unfortunately, the computational efficiency and scope of current 3D module identification methods are still too limited to benefit from all the knowledge accumulated in module databases. Here, we introduce BayesPairing 2, a new sequence search algorithm that leverages secondary structure tree decomposition to reduce computational complexity and improve predictions on new sequences. We benchmarked our methods on 75 modules and 6380 RNA sequences, and report accuracies that are comparable to the state of the art, with considerable running time improvements. When identifying 200 modules on a single sequence, BayesPairing 2 is over 100 times faster than its previous version, opening new doors for genome-wide applications.

Original language: English
Title of host publication: Research in Computational Molecular Biology - 24th Annual International Conference, RECOMB 2020, Proceedings
Editors: Russell Schwartz
Publisher: Springer
Pages: 186-201
Number of pages: 16
ISBN (Print): 9783030452568
State: Published - 1 Jan 2020
Externally published: Yes
Event: 24th Annual Conference on Research in Computational Molecular Biology, RECOMB 2020 - Padua, Italy
Duration: 10 May 2020 to 13 May 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12074 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 24th Annual Conference on Research in Computational Molecular Biology, RECOMB 2020
Country/Territory: Italy
City: Padua
Period: 10/05/20 to 13/05/20

Keywords

  • RNA 3D modules
  • RNA modules identification in sequence
  • RNA structure prediction

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
