Protein Structure

Secondary structure prediction

Linus Pauling [1] already suggested that amino acid chains could assume regular
local structures, namely alpha helices and beta strands. In between these
secondary structure elements there are turns or loops.
There is a long tradition of attempts to predict local secondary structure
based on sequence. State-of-the-art secondary structure prediction generally
relies on the observed frequencies of occurrence of k-tuples in particular secondary structures.
Based on these statistics, predictions can be made for a new sequence.
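
As a minimal sketch of the counting step, the following Python fragment tabulates how often each k-tuple occurs in each secondary structure state. The function name count_ktuples, the one-letter state codes (H, E, C), the toy sequences, and the convention that a k-tuple inherits the state of its first residue are all assumptions made for illustration; they are not taken from any of the cited methods.

```python
from collections import Counter, defaultdict

def count_ktuples(sequences, structures, k=2):
    """Count k-tuples per secondary structure state.

    Assumption for this sketch: each k-tuple is assigned the state
    annotated for its first residue.
    """
    counts = defaultdict(Counter)
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and annotation must have equal length")
        for i in range(len(seq) - k + 1):
            counts[ss[i]][seq[i:i + k]] += 1
    return counts

# Made-up toy data: H = helix, E = strand, C = turn or loop.
toy_sequences = ["AEAAKLV", "VIVTGAE"]
toy_structures = ["HHHHHCC", "EEEECCC"]

counts = count_ktuples(toy_sequences, toy_structures, k=2)
for state in sorted(counts):
    print(state, dict(counts[state]))
```

With a realistic training set, such tables are the raw statistics from which a prediction for a new sequence is derived.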

Chou and Fasman [6] apply a basic log-odds approach to
the occurrences of single amino acid residues in the sequence,
while the GOR method [7], which is based on information theory,
uses all possible pair frequencies within a sliding window.
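
The log-odds idea for single residues can be sketched as follows. This is not the published Chou-Fasman or GOR procedure: it only computes log(P(a, s) / (P(a) P(s))) for residue a and state s from annotated examples and then sums these scores over a small window. The function names, the pseudocount, the window width, and the toy data are assumptions chosen for illustration.

```python
import math
from collections import Counter

def log_odds(sequences, structures, pseudocount=1.0):
    """Return {(residue, state): log-odds score} from annotated sequences."""
    alphabet = sorted(set("".join(sequences)))
    states = sorted(set("".join(structures)))
    joint = Counter()
    for seq, ss in zip(sequences, structures):
        joint.update(zip(seq, ss))
    # A pseudocount in every (residue, state) cell avoids log(0).
    total = sum(joint.values()) + pseudocount * len(alphabet) * len(states)
    scores = {}
    for a in alphabet:
        n_a = sum(joint[(a, s)] for s in states) + pseudocount * len(states)
        for s in states:
            n_s = sum(joint[(b, s)] for b in alphabet) + pseudocount * len(alphabet)
            n_as = joint[(a, s)] + pseudocount
            scores[(a, s)] = math.log(n_as * total / (n_a * n_s))
    return scores

def predict(seq, scores, states, window=3):
    """Assign each position the state with the highest summed window score."""
    half = window // 2
    result = []
    for i in range(len(seq)):
        segment = seq[max(0, i - half):i + half + 1]
        best = max(states, key=lambda s: sum(scores.get((a, s), 0.0) for a in segment))
        result.append(best)
    return "".join(result)

# Made-up toy data: H = helix, E = strand, C = turn or loop.
train_seqs = ["AEAAKLV", "VIVTGAE"]
train_ss = ["HHHHHCC", "EEEECCC"]
scores = log_odds(train_seqs, train_ss)
prediction = predict("AAKVIV", scores, states="HEC")
print(prediction)
```

A positive score means the residue is seen in that state more often than expected by chance, a negative score less often. Residues absent from the training data contribute a score of zero in this sketch.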

As long as one restricts the problem to prediction from a single sequence,
there seems to be an inherent limit in prediction accuracy of around 65%.
Multiply aligned sequences offer a means to surpass this limit.
The PHD method [8]
uses evolutionary information from multiple sequence alignments in
a multi-level system of neural networks.
According to the authors, the average accuracy of the PHD method is greater than 72%.

There is a fairly well-accepted way of validating a new secondary structure prediction method.
Most authors of methods therefore report the success rate of their procedure.
The GOR and PHD methods additionally supply the user with an estimate of how reliable a prediction
in a particular region is. An obvious approach would be to overlay the output from
several secondary structure prediction programs, but it is doubtful whether
this strategy will actually improve the situation. Only if the individual
methods are sufficiently different does one actually gain information through
such an approach.
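
The overlay idea can be illustrated with a simple per-position majority vote. The sketch below is an assumption about how such an overlay might be done, not a description of any published consensus method; the function names, the tie-breaking rule, the per-residue accuracy measure, and the prediction strings are all made up for illustration.

```python
from collections import Counter

def consensus(predictions, tie_state="C"):
    """Majority vote per position; ties fall back to tie_state (an assumption)."""
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("need at least one prediction, all of equal length")
    result = []
    for column in zip(*predictions):
        ranked = Counter(column).most_common()
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            result.append(tie_state)
        else:
            result.append(ranked[0][0])
    return "".join(result)

def accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("strings must have equal length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)

# Made-up example: three hypothetical methods, each wrong at a different position.
observed = "HHHHCCEEEE"
method_a = "HHHCCCEEEE"
method_b = "HHHHCCCEEE"
method_c = "CHHHCCEEEE"

combined = consensus([method_a, method_b, method_c])
print(accuracy(method_a, observed), accuracy(combined, observed))

# Three copies of the same method: the vote cannot add anything.
redundant = consensus([method_a, method_a, method_a])
print(redundant == method_a)
```

In this constructed case the vote helps only because the three methods err at different positions. Overlaying methods that make the same mistakes reproduces those mistakes unchanged.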