Purely structural protein scoring functions using support vector machine and ensemble learning

Shokoufeh Mirzaei, Tomer Sidi, Chen Keasar, Silvia Crivelli

Research output: Contribution to journal › Article › peer-review

21 Scopus citations

Abstract

The function of a protein is determined by its structure, which creates a need for efficient methods of protein structure determination to advance scientific and medical research. Because current experimental structure determination methods carry a high price tag, computational predictions are highly desirable. Given a protein sequence, computational methods produce numerous 3D structures known as decoys. Selecting the best-quality decoys is both challenging and essential, as end users can handle only a few. Therefore, scoring functions are central to decoy selection. They combine measurable features into a single-number indicator of decoy quality. Unfortunately, current scoring functions do not consistently select the best decoys. Machine learning techniques offer great potential to improve decoy scoring. This paper presents two machine-learning-based scoring functions to predict the quality of protein structures, i.e., the similarity between the predicted structure and the experimental one, without knowing the latter. We use different metrics to compare these scoring functions against three state-of-the-art scores. This is a first attempt at comparing different scoring functions using the same non-redundant dataset for training and testing and the same features. The results show that adding informative features may be more significant than the method used.
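The idea of a scoring function that combines measurable features into a single number and is then used to pick a few decoys can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the feature names, the weights, and the decoy values are hypothetical, and a fixed linear combination stands in for the learned models (support vector machine and ensemble learning) named in the title, which are not reproduced here.

```python
from typing import Dict, List, Tuple

# Hypothetical feature weights (illustrative only). A trained model such as
# an SVM or an ensemble would learn this mapping from data instead.
WEIGHTS: Dict[str, float] = {
    "compactness": 0.5,
    "hbond_fraction": 0.3,
    "clash_penalty": -0.2,
}


def score_decoy(features: Dict[str, float]) -> float:
    """Combine a decoy's features into a single quality estimate."""
    return sum(WEIGHTS[name] * features[name] for name in WEIGHTS)


def select_top(decoys: Dict[str, Dict[str, float]], k: int) -> List[Tuple[str, float]]:
    """Return the k highest-scoring decoys as (name, score) pairs."""
    scored = [(name, score_decoy(feats)) for name, feats in decoys.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Made-up decoys with made-up feature values.
decoys = {
    "decoy_a": {"compactness": 0.8, "hbond_fraction": 0.6, "clash_penalty": 0.1},
    "decoy_b": {"compactness": 0.4, "hbond_fraction": 0.9, "clash_penalty": 0.5},
    "decoy_c": {"compactness": 0.9, "hbond_fraction": 0.7, "clash_penalty": 0.0},
}

top = select_top(decoys, k=2)
for name, score in top:
    print(f"{name}: {score:.2f}")
```

The score is computed from the decoy's own features only, so no experimental structure is needed at selection time, which matches the setting the abstract describes.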

Original language: English
Article number: 3370671
Pages (from-to): 1515-1523
Number of pages: 9
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 16
Issue number: 5
State: Published - 1 Sep 2019

Keywords

  • Algorithms
  • Protein structure prediction
  • decoy quality assessment
  • ensemble learning
  • machine learning
  • performance
  • protein features
  • scoring functions

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics
