TY - JOUR
T1 - CSBFinder
T2 - Discovery of colinear syntenic blocks across thousands of prokaryotic genomes
AU - Svetlitsky, Dina
AU - Dagan, Tal
AU - Chalifa-Caspi, Vered
AU - Ziv-Ukelson, Michal
N1 - Funding Information:
The research of T.D. was partially funded by the European Research Council (Grant No. 281357). The research of D.S. and M.Z.U. was partially funded by the Israel Science Foundation (Grant No. 179/14 and Grant No. 939/18).
Publisher Copyright:
© 2018 The Author.
PY - 2019/5/15
Y1 - 2019/5/15
AB - Motivation: Identification of conserved syntenic blocks across microbial genomes is important for several problems in comparative genomics such as gene annotation, study of genome organization and evolution and prediction of gene interactions. Current tools for syntenic block discovery do not scale up to the large quantity of prokaryotic genomes available today. Results: We present a novel methodology for the discovery, ranking and taxonomic distribution analysis of colinear syntenic blocks (CSBs), groups of genes that are consistently located close to each other, in the same order, across a wide range of taxa. We present an efficient algorithm that identifies CSBs in large genomic datasets. The algorithm is implemented and incorporated in a novel tool with a graphical user interface, denoted CSBFinder, that ranks the discovered CSBs according to a probabilistic score and clusters them to families according to their gene content similarity. We apply CSBFinder to data mine 1487 prokaryotic genomes including chromosomes and plasmids. For post-processing analysis, we generate heatmaps for visualizing the distribution of CSB family members across various taxa. We exemplify the utility of CSBFinder in operon prediction, in deciphering unknown gene function and in taxonomic analysis of colinear syntenic blocks.
UR - http://www.scopus.com/inward/record.url?scp=85066061570&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/bty861
DO - 10.1093/bioinformatics/bty861
M3 - Article
AN - SCOPUS:85066061570
VL - 35
SP - 1634
EP - 1643
JO - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 10
ER -