A major requirement for understanding protein structure
is a large database of three-dimensional structures. This
is particularly important for the comparative method of
structure prediction. Although considerable progress has
been made in recent years toward establishing a comprehensive
structural database, many more protein models
are needed before structures can be predicted with a high
degree of confidence. There are two methods by which
protein structures can be determined: X-ray crystallography
and NMR. These techniques are complementary,
with each having its advantages for providing information
about specific aspects of protein structure. A detailed
description of these methods is beyond the scope of this
summary, but a few comments are noteworthy.

A. X-Ray Crystallography
The first structure of a protein, myoglobin, was determined
by X-ray crystallography in 1958 and was followed soon
thereafter by the structure of hemoglobin. At that time,
protein structure determination was a daunting undertaking,
and few structures were determined in the ensuing
years. Fortunately, continual developments in the fundamental
understanding of X-ray crystallographic theory,
data collection, and computational methods have made
the determination of protein structure routine. The result
of this approach is an electron density map, which is interpreted
in terms of a molecular model. The strength of
this technique is that it can be applied to any macromolecular
assembly that can be crystallized. The overwhelming
majority of structures in the Protein Data Bank have been
determined by X-ray crystallography.

The limiting factor in a successful X-ray structure determination
is the growth of high-quality crystals. In general,
if suitable crystals can be obtained, a three-dimensional
structure can usually be determined. The final quality of an X-ray
structure depends directly on the three-dimensional
order of the crystals, since X-ray crystallography is an
imaging technique. The degree of order is usually indicated by the “resolution”
of the data. Resolution refers to the minimum
diffraction spacing included in the structure determination,
where a smaller number corresponds to a better-defined
structure. Typically, data to 2.8 Å resolution are sufficient
to trace the path of the polypeptide chain, but
data better than 2.5 Å are required to define the hydrogen
bonding pattern in a protein with great confidence.
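
The link between resolution and diffraction geometry follows from
Bragg's law, d = λ / (2 sin θ). The Python sketch below is illustrative
only: the function names are hypothetical, and the Cu Kα wavelength of
1.5418 Å is an assumed, commonly used laboratory value that the text
does not specify.

```python
import math

# Assumed wavelength in Å (Cu K-alpha laboratory source); not taken from the text.
CU_K_ALPHA = 1.5418


def resolution_from_two_theta(two_theta_deg, wavelength=CU_K_ALPHA):
    """Return the Bragg spacing d (Å) for a scattering angle 2θ in degrees."""
    theta = math.radians(two_theta_deg) / 2.0
    return wavelength / (2.0 * math.sin(theta))


def two_theta_from_resolution(d_spacing, wavelength=CU_K_ALPHA):
    """Return the scattering angle 2θ (degrees) needed to record spacing d (Å)."""
    return 2.0 * math.degrees(math.asin(wavelength / (2.0 * d_spacing)))


if __name__ == "__main__":
    # Compare the two resolution limits discussed in the text.
    for d in (2.8, 2.5):
        angle = two_theta_from_resolution(d)
        print(f"{d:.1f} Å data require recording out to 2θ = {angle:.1f}°")
```

Smaller spacings are recorded at larger scattering angles, so a
better resolution limit requires measurable diffraction further from
the direct beam.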

The one concern leveled at X-ray structures is the influence
of the crystalline lattice on the observed conformation
of the protein. Fortunately, it has been demonstrated
repeatedly that the structures of proteins observed in
the crystalline lattice are consistent with most of the biochemical
measurements on the same protein. This consistency arises because
protein crystals typically contain about 50% solvent such
that very little of a protein molecule is in contact with
its neighbors in the crystal lattice and the packing forces
are thermodynamically small. In some cases, proteins are
enzymatically active in the lattice. In others, conformational
changes are observed between the substrate-free
and substrate-bound forms of the enzyme. Observing the
substrate-bound form typically requires the crystallization of
site-directed mutant proteins complexed with the substrate(s) or
the study of complexes with substrate analogs. Except for the use of Laue
techniques, protein crystallography yields a time-averaged
view of the protein structure. Careful analysis of accurate
X-ray diffraction data may provide some indication of conformational
flexibility, but that aspect of protein structure
is better studied by spectroscopic techniques such as NMR.
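
The solvent content of about 50% mentioned above can be estimated
for a particular crystal from the Matthews coefficient, V_M, the crystal
volume per unit of protein mass. The sketch below is a minimal
illustration, not a prescribed procedure: the function names and the
example crystal are hypothetical, and the constant 1.23 Å³/Da assumes
a typical protein partial specific volume of about 0.74 cm³/g.

```python
# Volume occupied by protein per dalton, in Å^3/Da.
# Assumption: partial specific volume of about 0.74 cm^3/g.
PROTEIN_VOLUME_PER_DALTON = 1.23


def matthews_coefficient(cell_volume, molecules_per_cell, molecular_mass):
    """Return V_M in Å^3/Da.

    cell_volume: unit cell volume in Å^3
    molecules_per_cell: number of protein molecules in the unit cell
    molecular_mass: mass of one protein molecule in Da
    """
    return cell_volume / (molecules_per_cell * molecular_mass)


def solvent_fraction(v_m):
    """Estimate the fraction of the crystal volume occupied by solvent."""
    return 1.0 - PROTEIN_VOLUME_PER_DALTON / v_m


if __name__ == "__main__":
    # Hypothetical crystal: 246,000 Å^3 cell holding four 25 kDa molecules.
    v_m = matthews_coefficient(246000.0, 4, 25000.0)
    print(f"V_M = {v_m:.2f} Å^3/Da, solvent fraction = {solvent_fraction(v_m):.2f}")
```

The hypothetical values were chosen so that V_M is 2.46 Å³/Da, which
corresponds to a solvent fraction of one half.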

B. NMR
The use of NMR to determine protein structures is a more
recent development than X-ray diffraction. It has the advantage
that the analysis is performed on the protein in solution,
which removes any artifacts introduced
by crystallization. Its major disadvantage is the size limitation,
which restricts most analyses to smaller proteins
(< 40 kDa), although it is anticipated that improvements
in the technology will raise this limit.
Structural studies on proteins became possible with the
advent of multidimensional NMR techniques. These rely
on isotopic labeling with ¹³C and ¹⁵N and on techniques
that provide a facile method for assigning all of the ¹H
resonances in a protein, which would otherwise be a difficult
task. The measurement of nuclear Overhauser effect
(NOE) intensities provides much of the distance information
necessary to derive a structure, although additional
chemical shift information is needed for a high-resolution
structural determination.
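
NOE intensities carry distance information because, for an isolated
pair of protons, the intensity falls off approximately as the inverse
sixth power of their separation. The sketch below shows the resulting
calibration against a proton pair of known distance. It is an
illustration under that two-spin assumption; the function name and the
1.8 Å reference distance (roughly that of geminal methylene protons) are
assumptions, not values from the text.

```python
def noe_distance(intensity, ref_intensity, ref_distance):
    """Estimate a proton-proton distance (Å) from an NOE intensity.

    Uses the approximate r**-6 dependence for an isolated spin pair:
        intensity / ref_intensity = (ref_distance / r) ** 6
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


if __name__ == "__main__":
    # Assumed reference: a proton pair 1.8 Å apart with intensity 100 (arbitrary units).
    for observed in (100.0, 10.0, 1.0):
        r = noe_distance(observed, 100.0, 1.8)
        print(f"intensity {observed:6.1f} -> about {r:.2f} Å")
```

Because of the steep distance dependence, a 64-fold weaker signal
corresponds to only a doubling of the distance. In practice, such
estimates are usually treated as loose upper bounds rather than exact
distances.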

Once a set of distance restraints has been obtained,
a series of models is generated and optimized by energy
minimization and molecular dynamics within the
restraints imposed by the distance information. The advantage
of this approach is that it provides structural information
on the protein in solution; the drawback is that
surface residues and loops appear less well defined because
there are generally fewer distance restraints for these
components. The great strength of NMR is that it can yield
specific information concerning the pKa of an individual
group in a protein as well as insight into the
dynamical properties of the macromolecule.
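
The restrained optimization described above can be pictured as adding
a penalty term to the energy whenever a model violates a distance
restraint. The sketch below uses a flat-bottomed harmonic penalty,
which is one common form and is assumed here rather than taken from the
text; the function name, atom labels, and force constant are likewise
hypothetical.

```python
import math


def restraint_energy(coords, restraints, force_constant=1.0):
    """Sum flat-bottomed harmonic penalties over a set of distance restraints.

    coords: dict mapping an atom label to its (x, y, z) position in Å
    restraints: iterable of (atom_a, atom_b, lower, upper) bounds in Å
    """
    total = 0.0
    for atom_a, atom_b, lower, upper in restraints:
        d = math.dist(coords[atom_a], coords[atom_b])
        if d > upper:
            # Atoms are farther apart than the restraint allows.
            total += force_constant * (d - upper) ** 2
        elif d < lower:
            # Atoms are closer than the restraint allows.
            total += force_constant * (lower - d) ** 2
        # No penalty when the distance lies within [lower, upper].
    return total


if __name__ == "__main__":
    # Hypothetical two-proton model with one restraint of 1.8 to 5.0 Å.
    model = {"HA_10": (0.0, 0.0, 0.0), "HN_11": (6.0, 0.0, 0.0)}
    bounds = [("HA_10", "HN_11", 1.8, 5.0)]
    print("restraint energy:", restraint_energy(model, bounds))
```

Atoms that appear in few restraints contribute little to this
penalty, which is consistent with surface residues and loops being less
well defined in the resulting models.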