A Crescendo of Protein Structures

By Katharine Miller

A ten-year, $600-million program known as the Protein Structure Initiative (PSI) has already, in its five-year pilot phase, greatly increased the speed at which protein structures can be determined and added 1100 structures to the Protein Data Bank (PDB). Several thousand more may be added over the next five years. Completion of the project should lead to more rapid determination of protein function.

“The key is to make protein structures useful by getting them out there and in the hands of scientists all over,” says John Norvell, director of the PSI at the National Institute of General Medical Sciences (NIGMS), which funds the project. “Lots of interesting science will come from this large collection. It will allow people to think in structural ways when designing experiments or hypotheses. It will permit better attack on protein-folding problems. And it will lead to better and quicker work on target drug designs.”

A few thousand protein structures might not sound like a lot, given that the PDB, a federal repository for structural information about proteins, already contains about 30,000 structures. But the large majority of the banked structures are closely related to one another.

“We really have only a few thousand structures that are relatively unique,” says Jerry Li, MD, PhD, program director at the Center for Bioinformatics & Computational Biology at the NIGMS. “We need a whole lot of structures that are not so homologous to each other.”

That’s why the PSI targets representatives of a wide range of protein families. As a result, the PSI is producing a catalog of structural information not only about a large number of proteins but also about a wider variety of proteins than had previously been examined.

For 50 years, scientists have been determining the structures of proteins in order to better understand their function, but the PSI marks a shift in how structural biology is done. “The PSI is discovery-driven rather than hypothesis-driven,” Norvell says. “We’re systematically sampling the universe of protein structures.”

The PSI’s efforts have also reduced the cost of determining a protein structure, from $420,000 per protein to about $125,000. Norvell hopes to reduce the cost even further, to under $100,000 or even as low as $50,000.

The program is now moving into its second phase, with plans to determine more protein structures in two ways: in the lab and in silico. Under one set of grants, production centers will be established to elucidate 4000 or more additional protein structures over the next five years. Meanwhile, another set of grants will focus on improving methods for computational modeling of protein structures. The shapes of protein family representatives (PSI’s experimental targets) will serve as rough templates for the other structures in the family, which will be determined using computer-based homology modeling.

“In the end,” says Li, “the PSI will generate a few thousand experimental structures, but it will produce tens of thousands of modeled structures.”