Constrained Gene Block Discovery and Its Application to Prokaryotic Genomes

Research output: Contribution to journal › Article › peer-review


Recent advances in Next Generation Sequencing techniques, combined with global efforts to study infectious diseases, yield huge and rapidly growing databases of microbial genomes. These big new data statistically empower genomic-context-based approaches to functional analysis: the idea is that groups of genes that are clustered locally together across many genomes (e.g., operons) usually express protein products that interact in the same biological pathway. The problem of finding such conserved "gene blocks" in a given genomic dataset has been studied extensively. In this work, we propose a new variant of the gene block discovery problem: find conserved gene blocks that abide by a user specification of biological functional constraints. We take advantage of the biological constraints to efficiently prune the search space. This is achieved by modeling the new problem as a special constrained variant of the well-studied "Closed Frequent Itemset Mining" problem, generalized here to handle item duplications. We exemplify the application of the tool we developed for this problem with two different case studies related to microbial ATP (adenosine triphosphate)-binding cassette (ABC) transporters.
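To make the itemset-mining framing concrete, the following is a minimal sketch of constrained closed frequent itemset mining on toy data. It is not the authors' algorithm or tool: it assumes each genome is reduced to a plain set of gene-family labels (so gene order, locality, and the item duplications mentioned above are ignored), it uses a simple "must contain these genes" rule as a stand-in for the functional constraints, and it enumerates candidates by brute force. The function name, parameters, and gene labels are all hypothetical.

```python
from itertools import combinations


def closed_frequent_itemsets(transactions, min_support, required=frozenset()):
    """Brute-force closed frequent itemset mining with a 'must contain' constraint.

    transactions: iterable of sets of items (here, toy genomes as sets of labels).
    min_support:  minimum number of transactions that must contain an itemset.
    required:     items every reported itemset must include (assumed constraint).
    Returns a dict mapping each closed frequent itemset to its support.
    """
    transactions = [frozenset(t) for t in transactions]
    required = frozenset(required)
    # Pruning step: only transactions containing every required item can
    # support an itemset that satisfies the constraint.
    candidates = [t for t in transactions if required <= t]

    closed = {}
    # A non-empty itemset is closed exactly when it equals the intersection of
    # the transactions containing it, so intersecting every group of candidate
    # transactions enumerates all closed itemsets (exponential; toy sizes only).
    for k in range(1, len(candidates) + 1):
        for group in combinations(candidates, k):
            itemset = frozenset.intersection(*group)
            support = sum(1 for t in transactions if itemset <= t)
            if itemset and support >= min_support:
                closed[itemset] = support
    return closed


# Hypothetical toy genomes, each reduced to a set of gene-family labels.
genomes = [
    {"atpase", "permease", "sbp", "regA"},
    {"atpase", "permease", "sbp"},
    {"atpase", "permease", "kinB"},
    {"regA", "kinB"},
]

result = closed_frequent_itemsets(genomes, min_support=2, required={"atpase"})
for itemset, support in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

In this toy run the constraint removes the fourth genome from consideration before any candidates are generated, which illustrates in miniature how a user-specified constraint can shrink the search space.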

Original language: English
Pages (from-to): 745-766
Number of pages: 22
Journal: Journal of Computational Biology
Issue number: 7
State: Published - 1 Jul 2019


Keywords

  • ABC transporters
  • conserved gene blocks
  • gene block discovery
  • gene teams
  • itemset mining

ASJC Scopus subject areas

  • Modeling and Simulation
  • Molecular Biology
  • Genetics
  • Computational Mathematics
  • Computational Theory and Mathematics

