TY - GEN

T1 - Constrained LCS

T2 - 19th Annual Symposium on Combinatorial Pattern Matching, CPM 2008

AU - Gotthilf, Zvi

AU - Hermelin, Danny

AU - Lewenstein, Moshe

PY - 2008/7/1

Y1 - 2008/7/1

N2 - The problem of finding the longest common subsequence (LCS) of two given strings A_1 and A_2 is a well-studied problem. The constrained longest common subsequence (C-LCS) for three strings A_1, A_2 and B_1 is the longest common subsequence of A_1 and A_2 that contains B_1 as a subsequence. The fastest algorithm solving the C-LCS problem has a time complexity of O(m_1 m_2 n_1), where m_1, m_2 and n_1 are the lengths of A_1, A_2 and B_1, respectively. In this paper we consider two general variants of the C-LCS problem. First, we show that in the case of two input strings and an arbitrary number of constraint strings, it is NP-hard to approximate the C-LCS problem. Moreover, it is easy to see that in the case of an arbitrary number of input strings and a single constraint, the problem of finding the constrained longest common subsequence is NP-hard. Therefore, we propose a linear-time approximation algorithm for this variant. Our algorithm yields a 1/√(m_min |Σ|) approximation factor, where m_min is the length of the shortest input string and |Σ| is the size of the alphabet.

AB - The problem of finding the longest common subsequence (LCS) of two given strings A_1 and A_2 is a well-studied problem. The constrained longest common subsequence (C-LCS) for three strings A_1, A_2 and B_1 is the longest common subsequence of A_1 and A_2 that contains B_1 as a subsequence. The fastest algorithm solving the C-LCS problem has a time complexity of O(m_1 m_2 n_1), where m_1, m_2 and n_1 are the lengths of A_1, A_2 and B_1, respectively. In this paper we consider two general variants of the C-LCS problem. First, we show that in the case of two input strings and an arbitrary number of constraint strings, it is NP-hard to approximate the C-LCS problem. Moreover, it is easy to see that in the case of an arbitrary number of input strings and a single constraint, the problem of finding the constrained longest common subsequence is NP-hard. Therefore, we propose a linear-time approximation algorithm for this variant. Our algorithm yields a 1/√(m_min |Σ|) approximation factor, where m_min is the length of the shortest input string and |Σ| is the size of the alphabet.

UR - http://www.scopus.com/inward/record.url?scp=45849130954&partnerID=8YFLogxK

U2 - 10.1007/978-3-540-69068-9_24

DO - 10.1007/978-3-540-69068-9_24

M3 - Conference contribution

AN - SCOPUS:45849130954

SN - 3540690662

SN - 9783540690665

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 255

EP - 262

BT - Combinatorial Pattern Matching - 19th Annual Symposium, CPM 2008, Proceedings

Y2 - 18 June 2008 through 20 June 2008

ER -