TY - GEN

T1 - A sub-quadratic sequence alignment algorithm for unrestricted cost matrices

AU - Crochemore, Maxime

AU - Landau, Gad M.

AU - Ziv-Ukelson, Michal

PY - 2002/1/1

Y1 - 2002/1/1

N2 - The classical algorithm for computing the similarity between two sequences [36, 39] uses a dynamic programming matrix, and compares two strings of size n in O(n^2) time. We address the challenge of computing the similarity of two strings in sub-quadratic time, for metrics which use a scoring matrix of unrestricted weights. Our algorithm applies to both local and global alignment computations. The speed-up is achieved by dividing the dynamic programming matrix into variable sized blocks, as induced by Lempel-Ziv parsing of both strings, and utilizing the inherent periodic nature of both strings. This leads to an O(n^2/log n) algorithm for an input of constant alphabet size. For most texts, the time complexity is actually O(hn^2/log n) where h ≤ 1 is the entropy of the text.

AB - The classical algorithm for computing the similarity between two sequences [36, 39] uses a dynamic programming matrix, and compares two strings of size n in O(n^2) time. We address the challenge of computing the similarity of two strings in sub-quadratic time, for metrics which use a scoring matrix of unrestricted weights. Our algorithm applies to both local and global alignment computations. The speed-up is achieved by dividing the dynamic programming matrix into variable sized blocks, as induced by Lempel-Ziv parsing of both strings, and utilizing the inherent periodic nature of both strings. This leads to an O(n^2/log n) algorithm for an input of constant alphabet size. For most texts, the time complexity is actually O(hn^2/log n) where h ≤ 1 is the entropy of the text.

UR - http://www.scopus.com/inward/record.url?scp=84968754816&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84968754816

T3 - Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms

SP - 679

EP - 688

BT - Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2002

PB - Association for Computing Machinery

T2 - 13th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2002

Y2 - 6 January 2002 through 8 January 2002

ER -