Protein Structure Determination
A major requirement for understanding protein structure is a large database of three-dimensional structures. This is particularly important for the comparative method of structure prediction. Although considerable progress has been made in recent years toward establishing a comprehensive structural database, many more protein models are needed before structures can be predicted with a high degree of confidence. There are two methods by which protein structures can be determined: X-ray crystallography and NMR. These techniques are complementary, each having its advantages for providing information about specific aspects of protein structure. A detailed description of these methods is beyond the scope of this summary, but a few comments are noteworthy.
A. X-Ray Crystallography
The first structure of a protein, myoglobin, was determined by X-ray crystallography in 1958 and was followed soon thereafter by the structure of hemoglobin. At that time protein structure determination was a daunting undertaking, and few structures were determined in the ensuing years. Fortunately, continual developments in the fundamental understanding of X-ray crystallographic theory, data collection, and computational methods have made the determination of protein structure routine. The result of this approach is an electron density map, which is interpreted in terms of a molecular model. The strength of this technique is that it can be applied to any macromolecular assembly that can be crystallized. The overwhelming majority of structures in the Protein Data Bank have been determined by X-ray crystallography.
The limiting factor in a successful X-ray structure determination is the growth of high-quality crystals. In general, if suitable crystals can be obtained, a three-dimensional structure will be determined. The final quality of an X-ray structure depends directly on the three-dimensional order of the crystals, since X-ray crystallography is an imaging technique. Crystal order is usually indicated by the “resolution” of the data. Resolution refers to the minimum diffraction spacing included in the structural determination, where a smaller number corresponds to a better structure. Typically a structure at 2.8 Å resolution is satisfactory to determine the path of the polypeptide chain, but data better than 2.5 Å are required to define the hydrogen bonding pattern in a protein with great confidence.
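The relation between diffraction spacing and scattering angle can be made concrete with Bragg's law, d = λ / (2 sin θ). The following is a minimal Python sketch and not part of the original summary; the function names and the copper Kα wavelength of 1.5418 Å are assumptions chosen only for illustration.

```python
import math

def resolution_from_angle(wavelength, two_theta_deg):
    """Bragg spacing d (same units as wavelength) for a reflection at scattering angle 2-theta."""
    theta = math.radians(two_theta_deg / 2.0)
    if not 0.0 < theta <= math.pi / 2.0:
        raise ValueError("2-theta must lie in (0, 180] degrees")
    return wavelength / (2.0 * math.sin(theta))

def angle_from_resolution(wavelength, d_spacing):
    """Scattering angle 2-theta (degrees) at which a spacing d is recorded."""
    s = wavelength / (2.0 * d_spacing)
    if not 0.0 < s <= 1.0:
        raise ValueError("spacing cannot be reached at this wavelength")
    return 2.0 * math.degrees(math.asin(s))

# Assumed wavelength in angstroms (a typical copper laboratory source).
CU_K_ALPHA = 1.5418

# The two resolution limits quoted in the text.
for d in (2.8, 2.5):
    two_theta = angle_from_resolution(CU_K_ALPHA, d)
    print(f"{d:.1f} angstrom data extend to 2-theta = {two_theta:.1f} degrees")
```

Because d decreases as the angle increases, data of better (numerically smaller) resolution must be measured at higher scattering angles, which is where diffraction from poorly ordered crystals fades first.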
The one concern leveled at X-ray structures is the influence of the crystalline lattice on the observed conformation of the protein. Fortunately, it has been demonstrated repeatedly that the structures of proteins observed in a crystalline lattice are consistent with most of the biochemical measurements on the same protein. This arises because protein crystals typically contain about 50% solvent, so that very little of a protein molecule is in contact with its neighbors in the crystal lattice and the packing forces are thermodynamically small. In some cases proteins are enzymatically active in the lattice. In others, conformational changes are observed between the substrate-free and substrate-bound forms of the enzyme. Observing the substrate-bound form typically requires the crystallization of site-directed mutant proteins complexed with the substrate(s) or the study of complexes with substrate analogs. Except for the use of Laue techniques, protein crystallography yields a time-averaged view of the protein structure. Careful analysis of accurate X-ray diffraction data may provide some indication of conformational flexibility, but that aspect of protein structure is best suited to spectroscopic techniques such as NMR.
B. NMR
The use of NMR to determine protein structures is a more recent development than X-ray diffraction. It has the advantage that the analysis can be performed on the protein in solution, which removes any artifacts introduced by crystallization. Its major disadvantage is the size limitation, which restricts most analyses to smaller proteins (< 40 kDa), although it is anticipated that improvements in the technology will extend this limit. Structural studies on proteins became possible with the advent of multidimensional NMR techniques. These rely on isotopic labeling with 13C and 15N and on techniques that provide a facile method for assigning all of the 1H resonances in a protein, which would otherwise be a difficult task. The measurement of nuclear Overhauser effect (NOE) intensities provides much of the distance information necessary to derive a structure, although additional chemical shift information is needed for a high-resolution structural determination.
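How an NOE intensity becomes a distance can be illustrated with the isolated spin-pair approximation, in which the intensity falls off as the inverse sixth power of the interproton distance. The sketch below is an illustration under that assumption and is not taken from the original summary; the function names, the 2.5 Å reference separation, and the tolerance and lower-bound values are all hypothetical.

```python
def noe_distance(intensity, ref_intensity, ref_distance):
    """Estimate an interproton distance from an NOE intensity.

    Assumes intensity is proportional to r**-6 (isolated spin pair) and is
    calibrated against a proton pair whose separation is already known.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)

def to_bounds(distance, tolerance=0.5, lower=1.8):
    """Turn a distance estimate into a loose (lower, upper) restraint.

    The tolerance and the lower bound are illustrative values only.
    """
    return (lower, distance + tolerance)

# Hypothetical calibration: a reference pair 2.5 angstroms apart with intensity 100.
for observed in (100.0, 25.0, 5.0):
    r = noe_distance(observed, ref_intensity=100.0, ref_distance=2.5)
    print(f"intensity {observed:6.1f} -> distance {r:.2f}, bounds {to_bounds(r)}")
```

The sixth-root dependence means that a large error in intensity produces only a modest error in distance, which is one reason restraints are usually expressed as bounds rather than exact values.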
Once a set of distance information has been obtained, a series of models is generated and optimized by energy minimization and molecular dynamics within the restraints imposed by the distance information. The advantage of this approach is that it provides structural information on the protein in solution; the drawback is that surface residues and loops appear less well defined because there are generally fewer distance restraints for these components. The great strength of NMR is that it can yield specific information concerning the pKa of an individual group in a protein, as well as insight into the dynamical properties of the macromolecule.
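The idea of generating several models and optimizing each one within distance restraints can be sketched in a few lines. The example below is a toy under stated assumptions, not the procedure used by any particular NMR program: it replaces the full force field and molecular dynamics with plain steepest descent on a flat-bottomed harmonic penalty, and the four-atom restraint list, the step size, and all function names are hypothetical.

```python
import math
import random

def restraint_energy(coords, restraints):
    """Sum of flat-bottomed harmonic penalties; zero when all distances are within bounds."""
    total = 0.0
    for i, j, lower, upper in restraints:
        d = math.dist(coords[i], coords[j])
        if d > upper:
            total += (d - upper) ** 2
        elif d < lower:
            total += (lower - d) ** 2
    return total

def minimize(coords, restraints, step=0.05, iterations=2000):
    """Steepest-descent refinement of coordinates against the restraint penalty."""
    coords = [list(p) for p in coords]
    for _ in range(iterations):
        grads = [[0.0, 0.0, 0.0] for _ in coords]
        for i, j, lower, upper in restraints:
            d = math.dist(coords[i], coords[j])
            if d == 0.0:
                continue  # direction undefined for coincident atoms
            if d > upper:
                violation = d - upper   # positive: atoms are pulled together
            elif d < lower:
                violation = d - lower   # negative: atoms are pushed apart
            else:
                continue                # inside the flat bottom, no force
            for k in range(3):
                g = 2.0 * violation * (coords[i][k] - coords[j][k]) / d
                grads[i][k] += g
                grads[j][k] -= g
        for p, g in zip(coords, grads):
            for k in range(3):
                p[k] -= step * g[k]
    return coords

def generate_ensemble(n_atoms, restraints, n_models=5, seed=0):
    """Refine several random starting structures against the same restraints."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        start = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(n_atoms)]
        models.append(minimize(start, restraints))
    return models

# Hypothetical restraints: (atom i, atom j, lower bound, upper bound) in angstroms.
restraints = [(0, 1, 1.8, 3.0), (1, 2, 1.8, 3.0), (2, 3, 1.8, 3.0), (0, 3, 1.8, 5.0)]
for n, model in enumerate(generate_ensemble(4, restraints)):
    print(f"model {n}: residual restraint energy {restraint_energy(model, restraints):.3g}")
```

Atoms that take part in few restraints are free to settle in different places in different models, which mirrors the observation that surface residues and loops appear less well defined in NMR structures.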