Folding time and inaccurate energy estimates: Two bottlenecks for ab initio prediction of protein structures

Abstract

There are two main obstacles for ab initio protein structure prediction: (i) the impossibility to enumerate and sort out all the possible protein folds and (ii) insufficient accuracy of energy estimates used to single out the most stable structure. We show that the first of them is not as crucial as it is often thought to be, and a chain of ~200 residues has to find its lowest-energy fold within minutes, provided it has a sufficient energy gap between the most stable and all the misfolded structures. Thus, finding of the lowest-energy fold is a feasible task, provided there is a sufficient energy gap between the most stable and all the misfolded structures. On the contrary, the insufficient accuracy of energy estimates, which is often assumed as a minor obstacle, is fatal for protein structure prediction: even a rather small error in these estimates destroys (in any computation) the predominant stability of that fold, which actually has the lowest energy, and than this fold of protein chain cannot be singled out even by an exhaustive enumeration. The latter is demonstrated by a drift of protein chain fold from its native structure in numerous molecular dynamics simulations. However, the averaging of each fold's energy over remote homologs having the same native fold can serve as a kind of remedy against errors in energy estimates: it may help to establish the common native fold of homologs even when this is impossible to do for each of them separately.