Abstract
The function of a protein is determined by its structure, which creates a need for efficient methods of protein structure determination to advance scientific and medical research. Because current experimental structure determination methods are expensive, computational predictions are highly desirable. Given a protein sequence, computational methods produce numerous 3D structures known as decoys. Selecting the best-quality decoys is both challenging and essential, because end users can handle only a few of them. Scoring functions are therefore central to decoy selection: they combine measurable features into a single numerical indicator of decoy quality. Unfortunately, current scoring functions do not consistently select the best decoys. Machine learning techniques offer great potential to improve decoy scoring. This paper presents two machine-learning-based scoring functions that predict the quality of protein structures, i.e., the similarity between a predicted structure and the experimental one, without knowing the latter. We use different metrics to compare these scoring functions against three state-of-the-art scores. This is a first attempt at comparing different scoring functions using the same non-redundant dataset for training and testing and the same features. The results show that adding informative features may matter more than the choice of method.
Original language | English |
---|---|
Article number | 3370671 |
Pages (from-to) | 1515-1523 |
Number of pages | 9 |
Journal | IEEE/ACM Transactions on Computational Biology and Bioinformatics |
Volume | 16 |
Issue number | 5 |
State | Published - 1 Sep 2019 |
Keywords
- Algorithms
- Protein structure prediction
- decoy quality assessment
- ensemble learning
- machine learning
- performance
- protein features
- scoring functions
ASJC Scopus subject areas
- Biotechnology
- Genetics
- Applied Mathematics