TY - GEN
T1 - Regular language constrained sequence alignment revisited
AU - Kucherov, Gregory
AU - Pinhas, Tamar
AU - Ziv-Ukelson, Michal
PY - 2011/4/4
Y1 - 2011/4/4
N2 - Imposing constraints in the form of a finite automaton or a regular expression is an effective way to incorporate additional a priori knowledge into sequence alignment procedures. With this motivation, Arslan [1] introduced the Regular Language Constrained Sequence Alignment Problem and proposed an O(n^2 t^4) time and O(n^2 t^2) space algorithm for solving it, where n is the length of the input strings and t is the number of states in the non-deterministic automaton, which is given as input. Chung et al. [2] proposed a faster O(n^2 t^3) time algorithm for the same problem. In this paper, we further speed up the algorithms for Regular Language Constrained Sequence Alignment by reducing their worst case time complexity bound to O(n^2 t^3 / log t). This is done by establishing an optimal bound on the size of Straight-Line Programs solving the maxima computation subproblem of the basic dynamic programming algorithm. We also study another solution based on a Steiner Tree computation. While it does not improve the run time complexity in the worst case, our simulations show that both approaches are efficient in practice, especially when the input automata are dense.
AB - Imposing constraints in the form of a finite automaton or a regular expression is an effective way to incorporate additional a priori knowledge into sequence alignment procedures. With this motivation, Arslan [1] introduced the Regular Language Constrained Sequence Alignment Problem and proposed an O(n^2 t^4) time and O(n^2 t^2) space algorithm for solving it, where n is the length of the input strings and t is the number of states in the non-deterministic automaton, which is given as input. Chung et al. [2] proposed a faster O(n^2 t^3) time algorithm for the same problem. In this paper, we further speed up the algorithms for Regular Language Constrained Sequence Alignment by reducing their worst case time complexity bound to O(n^2 t^3 / log t). This is done by establishing an optimal bound on the size of Straight-Line Programs solving the maxima computation subproblem of the basic dynamic programming algorithm. We also study another solution based on a Steiner Tree computation. While it does not improve the run time complexity in the worst case, our simulations show that both approaches are efficient in practice, especially when the input automata are dense.
UR - http://www.scopus.com/inward/record.url?scp=79953231638&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-19222-7_39
DO - 10.1007/978-3-642-19222-7_39
M3 - Conference contribution
AN - SCOPUS:79953231638
SN - 9783642192210
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 404
EP - 415
BT - Combinatorial Algorithms - 21st International Workshop, IWOCA 2010, Revised Selected Papers
T2 - 21st International Workshop on Combinatorial Algorithms, IWOCA 2010
Y2 - 26 July 2010 through 28 July 2010
ER -