TY - JOUR
T1 - Amino acid pair interchanges at spatially conserved locations
AU - Naor, Dalit
AU - Fischer, Daniel
AU - Jernigan, Robert L.
AU - Wolfson, Haim J.
AU - Nussinov, Ruth
N1 - Funding Information:
We thank C.-J. Tsai for the calculation of the amino acid surface area. D.N. acknowledges support from the Eshkol Post-Doctoral Fellowship, and from the Rothchild Post-Doctoral Fellowship for the Human Genome, administered by the Israeli Ministry of Science. The research of R.N. has been sponsored by the National Cancer Institute, DHHS, under contract no. 1-CO-74102 with SAIC. The contents of this publication do not necessarily reflect the views or policies of the DHHS, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government. The research of R.N. in Tel Aviv University has been supported in part by grant no. 91-00219, with R.J., from the U.S.–Israel Binational Science Foundation (BSF), Jerusalem, Israel. The research of H.J.W. and R.N. in Israel has been supported in part by a grant from the Israel Science Foundation administered by the Israel Academy of Sciences.
PY - 1996/3/15
Y1 - 1996/3/15
N2 - Here we study the pattern of amino acid pair interchanges at spatially, locally conserved regions in globally dissimilar and unrelated proteins. By using a method which completely separates the amino acid sequence from its respective structure, this work addresses the question of which properties of the amino acids are the most crucial for the stability of conserved structural motifs. The proteins are taken from a structurally non-redundant dataset. The spatially conserved substructural motifs are defined as consisting of a 'large enough' number of C(α) atoms found to provide a geometric match between two proteins, regardless of the order of the C(α) atoms in the sequence, or of the sequence composition of the substructures. This approach can apply to proteins with little or no sequence similarity but with sufficient structural similarity, and is unique in its ability to handle local, non-topological matches between pairs of dissimilar proteins. The method uses a computer-vision-based algorithm, Geometric Hashing. Since Geometric Hashing ignores sequence information, it lends itself to answering the question posed above. The interchanges at geometrically similar positions that have been obtained with our method demonstrate the expected behaviour. Yet, a closer inspection reveals some distinct characteristics, as compared with interchanges based upon sequence-order-based techniques, or from energy-contact-based considerations. First, a pronounced division of the amino acids into two classes is displayed: Lys, Glu, Arg, Gln, Asp, Asn, Pro, Gly, Thr, Ser and His on the one hand, and Ile, Val, Leu, Phe, Met, Tyr, Trp, Cys and Ala on the other. These groups further cluster into subgroups: Lys, Glu, Arg, Gln; Asp, Asn; Pro, Gly; Ile, Val, Leu, Phe. The other amino acids stand alone. Analysis of the conservation among amino acids indicates proline to be consistently, by far, the most conserved. Next are Asp, Glu, Lys and Gly. Cys is also highly conserved.
Interestingly, oppositely charged amino acids are interchanged roughly as frequently as those of the same charge. These observations can be explained in terms of the three-dimensional structures of the proteins. Most of all, there is a clear distinction between residues which prefer to be on the protein surfaces, compared to those frequently buried in the interiors. Analysis of the interchanges indicates their low information content. This, together with the separation into two groups, suggests that the predictive value of the spatial positions of the C(α) atoms is not much greater than the sequence alone, aside from their hydrophobicity/hydrophilicity classification.
AB - Here we study the pattern of amino acid pair interchanges at spatially, locally conserved regions in globally dissimilar and unrelated proteins. By using a method which completely separates the amino acid sequence from its respective structure, this work addresses the question of which properties of the amino acids are the most crucial for the stability of conserved structural motifs. The proteins are taken from a structurally non-redundant dataset. The spatially conserved substructural motifs are defined as consisting of a 'large enough' number of C(α) atoms found to provide a geometric match between two proteins, regardless of the order of the C(α) atoms in the sequence, or of the sequence composition of the substructures. This approach can apply to proteins with little or no sequence similarity but with sufficient structural similarity, and is unique in its ability to handle local, non-topological matches between pairs of dissimilar proteins. The method uses a computer-vision-based algorithm, Geometric Hashing. Since Geometric Hashing ignores sequence information, it lends itself to answering the question posed above. The interchanges at geometrically similar positions that have been obtained with our method demonstrate the expected behaviour. Yet, a closer inspection reveals some distinct characteristics, as compared with interchanges based upon sequence-order-based techniques, or from energy-contact-based considerations. First, a pronounced division of the amino acids into two classes is displayed: Lys, Glu, Arg, Gln, Asp, Asn, Pro, Gly, Thr, Ser and His on the one hand, and Ile, Val, Leu, Phe, Met, Tyr, Trp, Cys and Ala on the other. These groups further cluster into subgroups: Lys, Glu, Arg, Gln; Asp, Asn; Pro, Gly; Ile, Val, Leu, Phe. The other amino acids stand alone. Analysis of the conservation among amino acids indicates proline to be consistently, by far, the most conserved. Next are Asp, Glu, Lys and Gly. Cys is also highly conserved.
Interestingly, oppositely charged amino acids are interchanged roughly as frequently as those of the same charge. These observations can be explained in terms of the three-dimensional structures of the proteins. Most of all, there is a clear distinction between residues which prefer to be on the protein surfaces, compared to those frequently buried in the interiors. Analysis of the interchanges indicates their low information content. This, together with the separation into two groups, suggests that the predictive value of the spatial positions of the C(α) atoms is not much greater than the sequence alone, aside from their hydrophobicity/hydrophilicity classification.
KW - Amino acid conservation
KW - Amino acid interchanges
KW - Geometric hashing
KW - Structural motifs
KW - Structure comparison
UR - http://www.scopus.com/inward/record.url?scp=0029986889&partnerID=8YFLogxK
U2 - 10.1006/jmbi.1996.0138
DO - 10.1006/jmbi.1996.0138
M3 - Article
C2 - 8601843
AN - SCOPUS:0029986889
SN - 0022-2836
VL - 256
SP - 924
EP - 938
JO - Journal of Molecular Biology
JF - Journal of Molecular Biology
IS - 5
ER -