TY - JOUR
T1 - Reconstruction Codes for DNA Sequences with Uniform Tandem-Duplication Errors
AU - Yehezkeally, Yonatan
AU - Schwartz, Moshe
N1 - Funding Information:
Manuscript received October 5, 2018; revised September 4, 2019; accepted September 5, 2019. Date of publication September 10, 2019; date of current version April 21, 2020. This work was supported by the Israel Science Foundation (ISF) under Grant 270/18. This article was presented in part at ISIT’2018.
Publisher Copyright:
© 1963-2012 IEEE.
PY - 2020/5/1
Y1 - 2020/5/1
N2 - DNA as a data storage medium has several advantages, including far greater data density compared to electronic media. We propose that schemes for data storage in the DNA of living organisms may benefit from studying the reconstruction problem, which is applicable whenever multiple reads of noisy data are available. This strategy is uniquely suited to the medium, which inherently replicates stored data in multiple distinct ways, caused by mutations. We consider noise introduced solely by uniform tandem-duplication, and utilize the relation to constant-weight integer codes in the Manhattan metric. By bounding the intersection of the cross-polytope with hyperplanes, we prove the existence of reconstruction codes with full rate, as well as suggest a construction for a family of reconstruction codes.
KW - DNA storage
KW - reconstruction
KW - string-duplication systems
KW - tandem-duplication errors
UR - http://www.scopus.com/inward/record.url?scp=85084108906&partnerID=8YFLogxK
U2 - 10.1109/TIT.2019.2940256
DO - 10.1109/TIT.2019.2940256
M3 - Article
AN - SCOPUS:85084108906
SN - 0018-9448
VL - 66
SP - 2658
EP - 2668
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
IS - 5
M1 - 8830407
ER -