Reconstruction of a Single String from a Part of its Composition Multiset

Zuo Ye, Ohad Elishco

Research output: Contribution to journal › Article › peer-review

Abstract

Motivated by applications in polymer-based data storage, we study the problem of reconstructing a string from part of its composition multiset. We give a full description of strings that cannot be uniquely reconstructed up to reversal from their multisets of all the prefix-suffix compositions. Leveraging this description, we prove that for all n ≥ 6, there exists a string of length n that cannot be uniquely reconstructed up to reversal. Moreover, for all n ≥ 6, we explicitly construct the set consisting of all length n strings that can be uniquely reconstructed up to reversal. As a byproduct, we obtain that any binary string can be constructed using Dyck strings and Catalan-Bertrand strings. For any given string s, we provide a method to explicitly construct the set of all strings with the same prefix-suffix composition multiset as s, as well as a formula for the size of this set. Furthermore, we construct two classes of composition codes that can respectively correct composition missing errors and mass-reducing substitution errors. In addition, we raise a new problem: reconstructing a string when only given its compositions of substrings of length at most r. We give suitable codes under some conditions.
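
The prefix-suffix composition multiset mentioned above can be illustrated with a small brute-force sketch. It is not the construction from the paper. It assumes a binary alphabet, takes the composition of a string to be its pair of symbol counts (number of 0s, number of 1s), and counts only non-empty prefixes and suffixes. The function names are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import product


def composition(s):
    """Composition of a binary string, assumed here to be (count of '0', count of '1')."""
    return (s.count("0"), s.count("1"))


def prefix_suffix_multiset(s):
    """Multiset of the compositions of all non-empty prefixes and suffixes of s."""
    comps = Counter()
    for k in range(1, len(s) + 1):
        comps[composition(s[:k])] += 1   # prefix of length k
        comps[composition(s[-k:])] += 1  # suffix of length k
    return comps


def same_multiset_class(s):
    """Brute force: all binary strings of len(s) sharing s's prefix-suffix multiset."""
    target = prefix_suffix_multiset(s)
    candidates = ("".join(bits) for bits in product("01", repeat=len(s)))
    return sorted(t for t in candidates if prefix_suffix_multiset(t) == target)


s, t = "101010", "100110"
print(prefix_suffix_multiset(s) == prefix_suffix_multiset(t))  # same multiset?
print(t in (s, s[::-1]))                                       # is t equal to s or its reversal?
print(same_multiset_class(s))
```

Under these assumptions, 101010 and 100110 share a prefix-suffix composition multiset although neither is the reversal of the other, which is consistent with the n ≥ 6 threshold stated in the abstract. The exhaustive search grows as 2^n, so it is only meant for checking small cases by hand.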

Original language: English
Pages (from-to): 1
Number of pages: 1
Journal: IEEE Transactions on Information Theory
State: Accepted/In press - 1 Jan 2023

Keywords

  • Buffer storage
  • Codes
  • Dyck strings
  • Media
  • Memory
  • Polymer-based storage
  • Polymers
  • Redundancy
  • Symbols
  • Composition codes
  • Unique string reconstruction

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
