A sub-quadratic sequence alignment algorithm for unrestricted cost matrices

Maxime Crochemore, Gad M. Landau, Michal Ziv-Ukelson

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

45 Scopus citations

Abstract

The classical algorithm for computing the similarity between two sequences [36, 39] uses a dynamic programming matrix, and compares two strings of size n in 0(n2) time. We address the challenge of computing the similarity of two strings in sub-quadratic time, for metrics which use a scoring matrix of unrestricted weights. Our algorithm applies to both local and global alignment computations. The speed-up is achieved by dividing the dynamic programming matrix into variable sized blocks, as induced by Lempel-Ziv parsing of both strings, and utilizing the inherent periodic nature of both strings. This leads to an O(n2/logn) algorithm for an input of constant alphabet size. For most texts, the time complexity is actually 0(hn21 logn) where h ≤ 1 is the entropy of the text.

Original languageEnglish
Title of host publicationProceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2002
PublisherAssociation for Computing Machinery
Pages679-688
Number of pages10
ISBN (Electronic)089871513X
StatePublished - 1 Jan 2002
Externally publishedYes
Event13th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2002 - San Francisco, United States
Duration: 6 Jan 20028 Jan 2002

Publication series

NameProceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms
Volume06-08-January-2002

Conference

Conference13th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2002
Country/TerritoryUnited States
CitySan Francisco
Period6/01/028/01/02

ASJC Scopus subject areas

  • Software
  • General Mathematics

Fingerprint

Dive into the research topics of 'A sub-quadratic sequence alignment algorithm for unrestricted cost matrices'. Together they form a unique fingerprint.

Cite this