Reconstruction Codes for DNA Sequences with Uniform Tandem-Duplication Errors

Yonatan Yehezkeally, Moshe Schwartz

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

DNA as a data storage medium has several advantages, including far greater data density compared to electronic media. We propose that schemes for data storage in the DNA of living organisms may benefit from studying the reconstruction problem, which is applicable whenever multiple reads of noisy data are available. This strategy is uniquely suited to the medium, which inherently replicates stored data in multiple distinct ways, caused by mutations. We consider noise introduced solely by uniform tandem-duplication, and utilize the relation to constant-weight integer codes in the Manhattan metric. By bounding the intersection of the cross-polytope with hyperplanes, we prove the existence of reconstruction codes with full rate, as well as suggest a construction for a family of reconstruction codes.
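
As an illustration of the noise model named in the abstract, the following minimal sketch shows one uniform tandem duplication: a substring of fixed length k is copied and inserted directly after itself. The function names, the string representation, and the 0-based window index are assumptions made here for illustration and are not taken from the paper.

```python
def tandem_duplicate(x: str, i: int, k: int) -> str:
    """Return x with the length-k window starting at index i duplicated in tandem.

    Writing x = u v w with len(v) == k, the result is u v v w.
    (Hypothetical helper; indexing convention is an assumption.)
    """
    if k <= 0 or i < 0 or i + k > len(x):
        raise ValueError("duplication window out of range")
    return x[:i + k] + x[i:i + k] + x[i + k:]


def single_duplication_descendants(x: str, k: int) -> set[str]:
    """All distinct strings reachable from x by exactly one length-k tandem duplication."""
    return {tandem_duplicate(x, i, k) for i in range(len(x) - k + 1)}


if __name__ == "__main__":
    # Duplicating the window "CG" (i=1, k=2) in "ACGT".
    print(tandem_duplicate("ACGT", 1, 2))
    # Different window positions can yield different noisy copies of one stored string.
    print(sorted(single_duplication_descendants("ACGT", 2)))
```

Different window positions can produce different noisy versions of the same stored string, which is the setting in which several distinct reads become available to a reconstruction decoder.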

Original language: English
Article number: 8830407
Pages (from-to): 2658-2668
Number of pages: 11
Journal: IEEE Transactions on Information Theory
Volume: 66
Issue number: 5
State: Published - 1 May 2020

Keywords

  • DNA storage
  • reconstruction
  • string-duplication systems
  • tandem-duplication errors

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
