Learning heuristics for mining RNA sequence-structure motifs

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

The computational identification of conserved motifs in RNA molecules is a major, yet largely unsolved, problem. Structural conservation serves as strong evidence for important RNA functionality. Thus, comparative structure analysis is the gold standard for the discovery and interpretation of functional RNAs.

In this paper we focus on one type of functional RNA motif: sequence-structure motifs, which mark RNA molecules as targets to be recognized by other molecules.

We present a new approach for the detection of RNA structure (including pseudoknots) that is conserved among a set of unaligned RNA sequences. Our method extends previous approaches to this problem, which first identified conserved stems and then assembled them into complex structural motifs. The novelty of our approach lies in performing the identification and the assembly of these stems simultaneously. We believe this unified approach offers a more informative model for deciphering the evolution of functional RNAs, in which the sets of stems comprising a conserved motif co-evolve as a correlated functional unit.

Since the task of mining RNA sequence-structure motifs can be addressed by solving the maximum weighted clique problem in an n-partite graph, we translate the maximum weighted clique problem into a state graph. Then, we gather and define domain knowledge and low-level heuristics for this domain. Finally, we learn hyper-heuristics for this domain, which can be used with heuristic search algorithms (e.g., A*, IDA*) for the mining task.

The hyper-heuristics are evolved using HH-Evolver, a tool for domain-specific hyper-heuristic evolution. Our approach is designed to overcome the computational limitations of current algorithms and to remove the need for the assumptions that earlier methods used to sparsify the graph.

This is still work in progress, and as yet we have no results to report. However, given the interest in the methodology and its previous success in other domains, we are hopeful that results will be forthcoming soon.
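To make the search formulation concrete, the following is a minimal sketch of how a maximum weighted clique in an n-partite graph could be explored as a state graph with a best-first, A*-style search. It is not the chapter's method and does not involve HH-Evolver or evolved hyper-heuristics. Everything in it is an assumption made for illustration: the function name `max_weight_clique`, placing weights on vertices rather than edges, allowing a partition to be skipped, and the simple hand-written bound (the sum of the heaviest vertex in each remaining partition) standing in for a learned heuristic.

```python
import heapq
from itertools import count


def max_weight_clique(parts, weight, adjacent):
    """Best-first search over partial cliques in an n-partite graph.

    parts:    list of lists of vertex ids, one list per partition
    weight:   dict mapping vertex id -> non-negative weight (assumed on vertices)
    adjacent: set of frozenset({u, v}) edges between vertices of different parts

    A state is (k, chosen): partitions 0..k-1 have been decided and `chosen`
    holds the pairwise-adjacent vertices picked so far.
    """
    n = len(parts)

    # optimistic[k] is an upper bound on the weight still obtainable from
    # partitions k..n-1: take the heaviest vertex of each, ignoring edges.
    optimistic = [0.0] * (n + 1)
    for k in range(n - 1, -1, -1):
        best = max((weight[v] for v in parts[k]), default=0.0)
        optimistic[k] = optimistic[k + 1] + best

    tie = count()  # tiebreaker so the heap never compares tuples of vertices
    # Entries are (-f, tiebreak, g, k, chosen); heapq is a min-heap, so -f
    # makes the state with the largest optimistic total come out first.
    frontier = [(-optimistic[0], next(tie), 0.0, 0, ())]

    while frontier:
        _, _, g, k, chosen = heapq.heappop(frontier)
        if k == n:
            # The bound never underestimates, so the first complete state
            # popped has maximum weight.
            return g, list(chosen)

        # Successor 1: skip partition k entirely.
        heapq.heappush(
            frontier, (-(g + optimistic[k + 1]), next(tie), g, k + 1, chosen)
        )
        # Successor 2: add any vertex of partition k adjacent to all chosen ones.
        for v in parts[k]:
            if all(frozenset((u, v)) in adjacent for u in chosen):
                g2 = g + weight[v]
                heapq.heappush(
                    frontier,
                    (-(g2 + optimistic[k + 1]), next(tie), g2, k + 1, chosen + (v,)),
                )
    return 0.0, []


# Toy instance (hypothetical data): three partitions of candidate vertices.
parts = [["a1", "a2"], ["b1", "b2"], ["c1"]]
weight = {"a1": 3, "a2": 2, "b1": 1, "b2": 4, "c1": 2}
adjacent = {
    frozenset(e)
    for e in [("a1", "b1"), ("a1", "c1"), ("a2", "b2"), ("a2", "c1"), ("b2", "c1")]
}
total, clique = max_weight_clique(parts, weight, adjacent)
print(total, clique)
```

In this sketch the bound plays the role that a heuristic function plays in A* or IDA*: a tighter bound prunes more of the state graph, which is the part a learned hyper-heuristic would presumably aim to improve.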
Original language: English
Title of host publication: Genetic Programming Theory and Practice XIII
Editors: R. Riolo, W. Worzel, M. Kotanchek, A. Kordon
Publisher: Springer
Pages: 21-38
ISBN (Electronic): 978-3-319-34223-8
ISBN (Print): 978-3-319-34221-4
State: Published - 22 Dec 2016

Publication series

Name: Genetic and Evolutionary Computation
ISSN (Print): 1932-0167
ISSN (Electronic): 1932-0175

Keywords

  • Genetic algorithms
  • Genetic programming
  • Hyper heuristic
