Optimal Spaced Seeds for Faster Approximate String Matching

Martin Farach-Colton, Gad M. Landau, S. Cenk Sahinalp, Dekel Tsur

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

Filtering is a standard technique for fast approximate string matching in practice. In filtering, a quick first step is used to rule out almost all positions of a text as possible starting positions for a pattern. Typically this step consists of finding the exact matches of small parts of the pattern. In the follow-up step, a slow method is used to verify or eliminate each remaining position. The running time of such a method depends largely on the quality of the filtering step, as measured by its false positive rate. The quality of such a method depends on the number of true matches that it misses, that is, on its false negative rate. A spaced seed is a recently introduced type of filter pattern that allows gaps (i.e., don't cares) in the small sub-pattern to be searched for. Spaced seeds promise to yield a much lower false positive rate, and thus have been extensively studied, though heretofore only heuristically or statistically. In this paper, we show how to optimally design spaced seeds that yield no false negatives.
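
The filter-then-verify scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's construction: the function names, the seed encoding ('1' for a position that must match, '0' for a don't care), the use of Hamming distance as the error measure, and the seed "101" are all assumptions made for illustration. An arbitrary seed like this one can miss true matches; designing seeds that provably miss none is the subject of the paper.

```python
from collections import defaultdict


def project(s, start, care):
    """Read only the 'care' positions of the seed window starting at `start`."""
    return "".join(s[start + i] for i in care)


def spaced_seed_filter(text, pattern, seed):
    """Filtering step: return candidate start positions of `pattern` in `text`.

    A position survives if, for some alignment of the seed inside the
    pattern, text and pattern agree on every '1' position of the seed.
    """
    care = [i for i, c in enumerate(seed) if c == "1"]
    span = len(seed)
    # Index every seed-shaped window of the text by its projected characters.
    index = defaultdict(list)
    for j in range(len(text) - span + 1):
        index[project(text, j, care)].append(j)
    candidates = set()
    for off in range(len(pattern) - span + 1):
        for j in index.get(project(pattern, off, care), ()):
            start = j - off
            if 0 <= start <= len(text) - len(pattern):
                candidates.add(start)
    return candidates


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def search(text, pattern, seed, k):
    """Verification step: keep candidates within Hamming distance k."""
    m = len(pattern)
    return sorted(
        p
        for p in spaced_seed_filter(text, pattern, seed)
        if hamming(text[p:p + m], pattern) <= k
    )


# Hypothetical toy input: the pattern occurs at position 3 with one mismatch.
print(search("TTTACGTTCGTTTT", "ACGTACGT", "101", 1))  # [3]
```

In this sketch the filter leaves a single candidate position, so the slower verification runs once instead of at every text position. That is the trade-off the abstract describes: fewer false positives mean less verification work, while false negatives mean lost matches.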

Original language: English
Title of host publication: Automata, Languages and Programming
Subtitle of host publication: 32nd International Colloquium, ICALP 2005, Proceedings
Editors: Luís Caires, Giuseppe F. Italiano, Luís Monteiro, Catuscia Palamidessi, Moti Yung
Publisher: Springer
Pages: 1251-1262
Number of pages: 12
ISBN (Print): 9783540275800
State: Published - 2005
Externally published: Yes
Event: 32nd International Colloquium on Automata, Languages and Programming, ICALP 2005 - Lisbon, Portugal
Duration: 11 Jul 2005 to 15 Jul 2005

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer Verlag
Volume: 3580
ISSN (Print): 0302-9743

Conference

Conference: 32nd International Colloquium on Automata, Languages and Programming, ICALP 2005
Country/Territory: Portugal
City: Lisbon
Period: 11/07/05 to 15/07/05
