Assigning folds to the proteins encoded by the genome of Mycoplasma genitalium

Daniel Fischer, David Eisenberg

Research output: Contribution to journal › Article › peer-review

104 Scopus citations


A crucial step in exploiting the information inherent in genome sequences is to assign to each protein sequence its three-dimensional fold and biological function. Here we describe fold assignment for the proteins encoded by the small genome of Mycoplasma genitalium. The assignment was carried out by our computer server, which assigns folds to amino acid sequences by comparing sequence-derived predictions with known structures. Of the total of 468 protein ORFs, 103 (22%) can be assigned a known protein fold with high confidence, as cross-validated with tests on known structures. Of these sequences, 75 (16%) show enough sequence similarity to proteins of known structure that they can also be detected by traditional sequence-sequence comparison methods. That is, the remaining 28 sequences (6%) are assignable by the sequence-structure method of the server but not by current sequence-sequence methods. Of the remaining 78% of sequences in the genome, 18% belong to membrane proteins and the remaining 60% cannot be assigned, either because these sequences correspond to no presently known fold or because the method is insufficiently sensitive. At the current rate of determination of new folds by X-ray and NMR methods, extrapolation suggests that folds will be assigned to most soluble proteins in the next decade.
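The percentages quoted in the abstract can be checked against the reported counts. The following is a minimal sketch, not part of the original paper: the variable names and the `pct` helper are illustrative assumptions, and the 18% membrane-protein figure is taken directly from the abstract because no count is given for it.

```python
# Counts reported in the abstract.
TOTAL_ORFS = 468
ASSIGNED = 103        # folds assigned with high confidence
BY_SEQUENCE = 75      # also detectable by sequence-sequence comparison

# Assignments found only by the sequence-structure method.
structure_only = ASSIGNED - BY_SEQUENCE


def pct(count, total=TOTAL_ORFS):
    """Return count as a whole-number percentage of total (hypothetical helper)."""
    return round(100 * count / total)


assigned_pct = pct(ASSIGNED)
by_sequence_pct = pct(BY_SEQUENCE)
structure_only_pct = pct(structure_only)

# Percentages for the unassigned remainder.
MEMBRANE_PCT = 18     # quoted in the abstract; no count is given
unassigned_pct = 100 - assigned_pct
no_assignment_pct = unassigned_pct - MEMBRANE_PCT

print(structure_only, assigned_pct, by_sequence_pct, structure_only_pct)
print(unassigned_pct, no_assignment_pct)
```

Each value agrees with the rounded figure in the abstract: 28 sequences, then 22%, 16%, and 6% of the genome, leaving 78% unassigned, of which 60% is neither assigned nor attributed to membrane proteins.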

Original language: English
Pages (from-to): 11929-11934
Number of pages: 6
Journal: Proceedings of the National Academy of Sciences of the United States of America
Issue number: 22
State: Published - 28 Oct 1997
Externally published: Yes


  • Computer analysis of genome sequences
  • Protein fold recognition

