TY - JOUR
T1 - Rank-modulation codes for DNA storage with shotgun sequencing
AU - Raviv, Netanel
AU - Schwartz, Moshe
AU - Yaakobi, Eitan
N1 - Funding Information:
Manuscript received August 7, 2017; revised January 22, 2018; accepted April 12, 2018. Date of publication April 25, 2018; date of current version December 19, 2018. This work was supported by the Israel Science Foundation under Grant 130/14 and Grant 1624/14. This paper was presented in part at the 2017 IEEE International Symposium on Information Theory.
Publisher Copyright:
© 1963-2012 IEEE.
PY - 2019/1/1
Y1 - 2019/1/1
AB - Synthesis of DNA molecules offers unprecedented advances in storage technology. Yet, the microscopic world in which these molecules reside induces error patterns that are fundamentally different from their digital counterparts. Hence, to maintain reliability in reading and writing, new coding schemes must be developed. In a reading technique called shotgun sequencing, a long DNA string is read in a sliding window fashion, and a profile vector is produced. It was recently suggested by Kiah et al. that such a vector can represent the permutation which is induced by its entries, and hence a rank-modulation scheme arises. Although this interpretation suggests high error tolerance, it is unclear which permutations are feasible and how to produce a DNA string whose profile vector induces a given permutation. In this paper, by observing some necessary conditions, an upper bound for the number of feasible permutations is given. Furthermore, a technique for deciding the feasibility of a permutation is devised. By using insights from this technique, an algorithm for producing a considerable number of feasible permutations is given, which applies to any alphabet size and any window length.
KW - DNA storage
KW - DeBruijn graphs
KW - permutation codes
UR - http://www.scopus.com/inward/record.url?scp=85045997214&partnerID=8YFLogxK
U2 - 10.1109/TIT.2018.2829876
DO - 10.1109/TIT.2018.2829876
M3 - Article
AN - SCOPUS:85045997214
SN - 0018-9448
VL - 65
SP - 50
EP - 64
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
IS - 1
M1 - 8347019
ER -