Abstract
We study the problem of finding the sequence of an unknown DNA fragment given the set of its k-long subsequences and a homologous sequence, namely a sequence that is similar to the target sequence. Such a sequence is available in some applications, e.g., when detecting single nucleotide polymorphisms. Pe'er and Shamir studied this problem and presented a heuristic algorithm for it. In this paper, we give an algorithm with provable performance: We show that under some assumptions, the algorithm can reconstruct a random sequence of length O(4k) with high probability. We also show that no algorithm can reconstruct sequences of length Ω(log k· 4k).
Original language | English |
---|---|
Pages (from-to) | 498-511 |
Number of pages | 14 |
Journal | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Volume | 2812 |
DOIs | |
State | Published - 1 Jan 2003 |
Externally published | Yes |
ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science