Abstract
We present solutions for the k-mismatch pattern matching problem with don't cares. Given a text t of length n, a pattern p of length m containing don't care symbols, and a bound k, our algorithms find all positions at which the pattern matches the text with at most k mismatches. We first give a Θ(n(k + log m log k) log n) time randomised algorithm which finds the correct answer with high probability. We then present a new deterministic Θ(nk² log² m) time solution that uses tools originally developed for group testing. Taking our derandomisation approach further, we develop an approach based on k-selectors that runs in Θ(nk polylog m) time. In each case, the locations of the mismatches at each alignment are also reported at no extra cost.
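To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not one of the algorithms of the paper: it runs in O(nm) time and only illustrates what is being computed, namely every alignment with at most k mismatches together with the mismatch positions. The function name, the `?` wildcard character, and the choice to allow don't cares in the pattern only are assumptions made for illustration.

```python
def k_mismatch_with_dont_cares(text, pattern, k, wildcard="?"):
    """Naive reference for the problem statement (not the paper's method).

    Returns a dict mapping each alignment i (0-based start in `text`) with
    at most k mismatches to the list of mismatching pattern offsets.
    Assumption: `wildcard` marks a don't care symbol in the pattern only.
    """
    n, m = len(text), len(pattern)
    results = {}
    for i in range(n - m + 1):
        mismatches = []
        for j in range(m):
            # A don't care position matches any text symbol.
            if pattern[j] != wildcard and pattern[j] != text[i + j]:
                mismatches.append(j)
                if len(mismatches) > k:
                    break  # too many mismatches, abandon this alignment
        else:
            # Inner loop finished without exceeding k mismatches.
            results[i] = mismatches
    return results


if __name__ == "__main__":
    print(k_mismatch_with_dont_cares("abcabd", "a?d", 1))
```

A sketch like this can serve as a correctness baseline when testing a faster implementation on small inputs.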
Original language | English |
---|---|
Pages (from-to) | 115-124 |
Number of pages | 10 |
Journal | Journal of Computer and System Sciences |
Volume | 76 |
Issue number | 2 |
DOIs | |
State | Published - 1 Jan 2010 |
Externally published | Yes |
Keywords
- Group testing
- Pattern matching
- Randomised algorithms
- String algorithms
ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science
- Computer Networks and Communications
- Computational Theory and Mathematics
- Applied Mathematics