Abstract
In this paper, we introduce and analyze two graph-based models for assigning orthologs in the presence of whole-genome duplications, using similarity information between pairs of genes. The common feature of our two models is that genes of the first genome may be assigned two orthologs from the second genome, which has undergone a whole-genome duplication. Additionally, our models incorporate the new notion of a duplication bonus, a parameter that reflects how assigning two orthologs to a given gene should be rewarded or penalized. Our work focuses mainly on developing exact algorithms with reasonable running times for these two models: we show that the first one is polynomial-time solvable, while the second is NP-hard. For the latter, we thus design two fixed-parameter algorithms, i.e., exact algorithms whose running times are exponential only with respect to a small and well-chosen input parameter. Finally, for both models, we evaluate our algorithms on pairs of plant genomes. Our experiments show that the NP-hard model yields better cluster quality at the cost of lower coverage, because our algorithms cannot completely solve our instances. Nevertheless, our results are encouraging overall and show that our methods yield biologically significant predictions of orthologs when the duplication bonus value is properly chosen.
Original language | English |
---|---|
Pages (from-to) | 379-390 |
Number of pages | 12 |
Journal | Computational Biology and Chemistry |
Volume | 74 |
State | Published - 1 Jun 2018 |
Keywords
- Comparative genomics
- Graph algorithms
- NP-hard problem
- Plant genomics
- Synteny blocks
ASJC Scopus subject areas
- Structural Biology
- Biochemistry
- Organic Chemistry
- Computational Mathematics