On-line hierarchy of general linear models for selecting and ranking the best predicted protein structures

Hani Zakaria Girgis, Jason J. Corso, Daniel Fischer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

To predict the three-dimensional structure of proteins, many computational methods sample the conformational space, generating a large number of candidate structures. Such methods then rank the generated structures using a variety of model quality assessment programs in order to obtain a small set of structures that are most likely to resemble the unknown experimentally determined structure. Model quality assessment programs suffer from two main limitations: (i) the rank-one structure is not always the best predicted structure; that is, the best predicted structure could be ranked 10th; and (ii) no single assessment method can correctly rank the predicted structures for all target proteins. However, because at least some of the methods often achieve a good ranking, a model quality assessment method based on a consensus of several model quality assessment methods is likely to perform better. We have devised the STPdata algorithm, a consensus method based on five model quality assessment programs. We have applied it to build an on-line "custom-trained" hierarchy of general linear models to select and rank the best predicted structures. By "custom-trained", we mean that for each target protein the STPdata algorithm trains a unique model on data related to the input target protein. To evaluate our method, we participated in CASP8 as human predictors. In CASP8, the STPdata algorithm trained 128 hierarchical models, one for each of the 128 target proteins. Based on the official results of CASP8, our method outperformed the best server by 6% and ranked fourth among human predictors. Our CASP results are based purely on computational methods, without any human intervention.
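
The consensus idea in the abstract can be illustrated with a minimal sketch. This is not the STPdata algorithm, whose details are not given here: it fits a single ordinary least-squares linear model (no hierarchy) that maps five quality scores per candidate structure to a known quality value, then ranks new candidates by predicted quality. The function names (`fit_linear`, `rank_candidates`), the score layout, and the assumption that training structures come with a known quality value are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Sequence


def solve(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: scores may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear(scores: Sequence[Sequence[float]], quality: Sequence[float]) -> List[float]:
    """Fit least-squares weights (intercept first) via the normal equations.

    scores:  one row per training structure, one column per assessment program
             (five columns in the setting described above; hypothetical layout).
    quality: assumed known quality value for each training structure.
    """
    rows = [[1.0] + list(s) for s in scores]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * q for r, q in zip(rows, quality)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights: Sequence[float], score_row: Sequence[float]) -> float:
    """Predicted quality of one candidate from its assessment scores."""
    return weights[0] + sum(w * s for w, s in zip(weights[1:], score_row))


def rank_candidates(weights: Sequence[float], candidates: Dict[str, Sequence[float]]) -> List[str]:
    """Return candidate names ordered by predicted quality, best first."""
    scored = [(predict(weights, row), name) for name, row in candidates.items()]
    return [name for _, name in sorted(scored, reverse=True)]
```

In a per-target ("custom-trained") setting, `fit_linear` would be called once for each target protein on training rows related to that target, and `rank_candidates` would then order that target's candidate structures. With five scores plus an intercept, the fit needs at least six linearly independent training rows.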

Original language: English
Title of host publication: Proceedings of the 31st Annual International Conference of the IEEE Engineering in Medicine and Biology Society
Subtitle of host publication: Engineering the Future of Biomedicine, EMBC 2009
Publisher: IEEE Computer Society
Pages: 4949-4953
Number of pages: 5
ISBN (Print): 9781424432967
State: Published - 1 Jan 2009
Externally published: Yes
Event: 31st Annual International Conference of the IEEE Engineering in Medicine and Biology Society: Engineering the Future of Biomedicine, EMBC 2009 - Minneapolis, MN, United States
Duration: 2 Sep 2009 → 6 Sep 2009

Publication series

Name: Proceedings of the 31st Annual International Conference of the IEEE Engineering in Medicine and Biology Society: Engineering the Future of Biomedicine, EMBC 2009

Conference

Conference: 31st Annual International Conference of the IEEE Engineering in Medicine and Biology Society: Engineering the Future of Biomedicine, EMBC 2009
Country/Territory: United States
City: Minneapolis, MN
Period: 2/09/09 → 6/09/09
