Reconstruction of a Single String From a Part of Its Composition Multiset

Zuo Ye, Ohad Elishco

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

Motivated by applications in polymer-based data storage, we study the problem of reconstructing a string from part of its composition multiset. We give a full description of strings that cannot be uniquely reconstructed up to reversal from their multisets of all the prefix-suffix compositions. Leveraging this description, we prove that for all n ≥ 6, there exists a string of length n that cannot be uniquely reconstructed up to reversal. Moreover, for all n ≥ 6, we explicitly construct the set consisting of all length n strings that can be uniquely reconstructed up to reversal. As a byproduct, we obtain that any binary string can be constructed using Dyck strings and Catalan-Bertrand strings. For any given string s, we provide a method to explicitly construct the set of all strings with the same prefix-suffix composition multiset as s, as well as a formula for the size of this set. Furthermore, we construct two classes of composition codes that can respectively correct composition missing errors and mass-reducing substitution errors. In addition, we raise a new problem: reconstructing a string when only given its compositions of substrings of length at most r. We give suitable codes under some conditions.
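
To make the central object concrete, the following is a small brute-force sketch in Python and not the construction from the paper. It assumes binary strings, takes the composition of a string to be its pair (number of 0s, number of 1s), and takes the prefix-suffix composition multiset to contain the compositions of every non-empty prefix and every non-empty suffix. The function names are hypothetical, and the paper's exact conventions (for example, whether the full string is counted once or twice) may differ.

```python
from collections import Counter
from itertools import product


def composition(sub: str) -> tuple[int, int]:
    """Composition of a binary string: (number of 0s, number of 1s)."""
    return (sub.count("0"), sub.count("1"))


def prefix_suffix_multiset(s: str) -> Counter:
    """Multiset of compositions of all non-empty prefixes and suffixes of s.

    Assumption: the full string contributes twice, once as a prefix and
    once as a suffix. This does not change which strings share a multiset.
    """
    n = len(s)
    comps = Counter()
    for i in range(1, n + 1):
        comps[composition(s[:i])] += 1       # prefix of length i
        comps[composition(s[n - i:])] += 1   # suffix of length i
    return comps


def equivalence_class(s: str) -> list[str]:
    """All binary strings of len(s) with the same multiset as s (exhaustive search)."""
    target = prefix_suffix_multiset(s)
    return sorted(
        t
        for t in map("".join, product("01", repeat=len(s)))
        if prefix_suffix_multiset(t) == target
    )


def uniquely_reconstructible(s: str) -> bool:
    """True if only s and its reversal share the multiset of s."""
    return set(equivalence_class(s)) <= {s, s[::-1]}


if __name__ == "__main__":
    print(equivalence_class("011001"))
    print(uniquely_reconstructible("011001"))
```

Under these assumptions, the length-6 strings 011001 and 010101 have the same prefix-suffix composition multiset although neither is the reversal of the other, which agrees with the n ≥ 6 statement in the abstract. The search is exponential in the string length and is meant only for small experiments.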

Original language: English
Pages (from-to): 3922-3940
Number of pages: 19
Journal: IEEE Transactions on Information Theory
Volume: 70
Issue number: 6
State: Published - 1 Jun 2024

Keywords

  • Dyck strings
  • Polymer-based storage
  • composition codes
  • unique string reconstruction

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
