Introduction to Protein Structure
Proteins mediate the majority of biological processes. All proteins share the common feature that they are condensation polymers of amino acids whose sequence is specified by the genetic information contained within the genome of the organism. Complete DNA sequences for organisms ranging from Escherichia coli to humans suggest that the total number of proteins necessary for life lies in the range of 4200–50,000, although the number of genes in higher organisms is still under debate. Most of these proteins adopt a well-defined three-dimensional structure in solution that is essential for protein function. Indeed, unfolding or denaturation of a protein typically leads to a loss of biological activity.
The amino acid sequence of a protein contains all of the information necessary to dictate its final three-dimensional structure or fold. In many cases small proteins can be unfolded and refolded in vitro without loss of activity. In more complex proteins, chaperones are frequently necessary to allow a protein to reach its properly folded, or correct, three-dimensional state. Chaperones in these instances recognize an incorrectly folded protein and provide an energetically favorable pathway, through the hydrolysis of ATP, for the protein to unfold and refold to reach its functional state. Even in these cases, the structure of the protein is dictated by its amino acid sequence.
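Because the sequence is the starting point for everything else, it can help to see how little is needed to represent one. The following is a minimal Python sketch, not a standard library routine: it assumes a linear, unbranched chain written in the one-letter codes of the 20 standard amino acids, and the names STANDARD_AMINO_ACIDS and peptide_bond_count are hypothetical, chosen only for illustration. It relies on the fact that a linear condensation polymer of n residues contains n - 1 peptide bonds, each formed with the release of one water molecule.

```python
# One-letter codes of the 20 standard amino acids (assumption: no
# nonstandard residues such as selenocysteine, no ambiguity codes).
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def peptide_bond_count(sequence: str) -> int:
    """Return the number of peptide bonds in a linear chain.

    A linear chain of n residues has n - 1 peptide bonds, which is also
    the number of water molecules released during condensation.
    """
    seq = sequence.upper()
    unknown = set(seq) - STANDARD_AMINO_ACIDS
    if unknown:
        raise ValueError(f"nonstandard residue codes: {sorted(unknown)}")
    # An empty string or a single residue has no peptide bonds.
    return max(len(seq) - 1, 0)


if __name__ == "__main__":
    example = "ACDK"  # hypothetical four-residue peptide
    print(peptide_bond_count(example))
```

The check against a fixed alphabet is a deliberate simplification; real sequence data may contain ambiguity codes or modified residues that this sketch would reject.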
In principle, it should be possible to deduce the structure of a protein from its amino acid sequence. At this time, it is not possible to perform ab initio structure prediction with any great success. As such, protein structure prediction remains one of the major problems in biology. Progress in structure prediction has been made by combining sequence and structural similarities. This offers hope that, with knowledge of sufficient structures across a wide range of organisms, it should be possible to generate the structure of all unknown proteins. Although there is still much to be learned about protein structure, a series of fundamental features, folding rules, and structural motifs have been observed in many of the three-dimensional structures determined to date. These common features arise as a consequence of the amino acids used to build the proteins, the peptide bonds that join the amino acids, and the thermodynamic factors that control protein stability. These common threads in protein structure are described in the following.