Given a 100% accurate CA trace, what additional information can a main-chain H-bond give you? I guess only side-chain H-bond prediction is a relevant, challenging problem that CASP needs to address, this time or in the future.

Following the lead of the excellent assessments by several groups (Baker, Grishin, McGuffin, and Zhang), I'd like to share our preliminary evaluation of CASP8 tertiary structure predictions with the community: http://sysbio.rnet.missouri.edu/casp8_eva/index.html

Even if HB is considered in the evaluation, it shouldn't have the same weight as the GDT score or TM-score. If the whole model is wrong, does it make sense to have good HB?

GDT_HA may be a suitable assessment score.

I will leave the job of assigning target proteins to HA-TBM/TBM/FM to the CASP8 assessors. But I ask the assessors to consider more microscopic criteria when evaluating TBM models. Subtle differences are likely to be buried by coarse-grained measurements and will be revealed only by fine-grained criteria.

Well, the definitions of TBM and FM are subjective rather than objective. How would you implement what you suggested without introducing too much artificial bias? In addition, GDT-TL may be too strict and is likely to bury some subtle differences.

In CASP7, GDT-HA was used to assess the quality of the C_alpha trace for both TBM and HA-TBM targets. I would like to see the CASP8 assessors use an even higher-accuracy measure such as GDT-TL (0.25, 0.5, 1.0, 2.0), especially for HA-TBM targets, as was done in CASP6. Nowadays, protein model quality is improving steadily, especially for TBM targets, and CASP should ask/encourage predictors to devise more accurate modeling globally (for FM targets) as well as locally (for TBM targets). I feel that 8A is too large a distance to be meaningful even for FM targets (however, 8A gives us a complacent feeling of good protein modeling).
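To illustrate how much the choice of cutoffs matters, here is a minimal sketch of a GDT-style score. It assumes the per-residue C_alpha distances (in Angstroms) come from one fixed superposition; the real GDT calculation searches over many superpositions for each cutoff, so this is only an approximation. The cutoff sets for GDT-TS (1, 2, 4, 8) and GDT-HA (0.5, 1, 2, 4) are the standard ones, "GDT-TL" uses the values listed in the post above, and the function name `gdt_like` and the example distances are made up for illustration.

```python
from typing import Sequence

# Distance cutoffs in Angstroms.
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)
GDT_HA_CUTOFFS = (0.5, 1.0, 2.0, 4.0)
GDT_TL_CUTOFFS = (0.25, 0.5, 1.0, 2.0)  # values quoted in the post above


def gdt_like(distances: Sequence[float], cutoffs: Sequence[float]) -> float:
    """Average, over the cutoffs, of the percentage of residues whose
    C_alpha deviation is within each cutoff.

    Assumption: `distances` come from a single superposition, whereas
    the real GDT maximizes the residue count separately per cutoff.
    """
    if not distances:
        raise ValueError("need at least one residue distance")
    n = len(distances)
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical per-residue C_alpha deviations for a five-residue model.
example = [0.3, 0.8, 1.5, 3.0, 6.0]
for name, cutoffs in [("GDT-TS", GDT_TS_CUTOFFS),
                      ("GDT-HA", GDT_HA_CUTOFFS),
                      ("GDT-TL", GDT_TL_CUTOFFS)]:
    print(name, round(gdt_like(example, cutoffs), 1))
```

The same set of deviations scores lower and lower as the cutoffs tighten, which is why a stricter measure separates close models better but gives no credit at all above its largest cutoff.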

On the other hand, for the calculation of GDT scores, only the positions of C_alpha atoms matter. Since there are many more non-C_alpha atoms in protein models (CASP8 did not accept C_alpha-only models), the CASP8 assessors should consider including additional measures beyond the HB score used in CASP7. Candidate measures include Chi_1 and Chi_12 for all/TBM targets. One should also consider using the HB score for all targets, not only TBM.

If the whole model is wrong, its HB score will be practically ZERO.

In my opinion, the whole point should be that (1) proteins contain many more atoms than just the C_alpha atoms and (2) GDT scores depend only on the positions of C_alpha atoms. I recommend that you read the CASP7 TBM assessment paper, where you will find a figure illustrating the difference between a good GDT model and a good HB model.

It is not true that "If the whole model is wrong, its HB score will be practically ZERO."

For an alpha-protein, it is easy to construct a model with a completely wrong topology but with a good HB-score or chi1/chi2 score. T0465_LEE-SERVER_TS1 is one such example: the TM-score of this model is 0.199, the GDT-score is 0.180 and RMSD=17.9A, which are all close to random; but this model gets 50% of H-bonds correct, with an HB-score higher than most others. I am not sure this kind of HB-score is very meaningful.

If the HB-score is combined with the GDT/TM-score, it should have a lighter weight, e.g. TM-score + HB-score*10%.
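As a rough illustration of that weighting, here is a minimal sketch. It assumes both scores are on a 0 to 1 scale and that "50% of H-bonds correct" for the T0465 model above corresponds to an HB-score of 0.5, which may not match how the HB-score is actually defined; the function name `combined_score` is made up.

```python
def combined_score(tm_score: float, hb_score: float,
                   hb_weight: float = 0.1) -> float:
    """TM-score plus a lightly weighted HB-score (both assumed in [0, 1])."""
    return tm_score + hb_weight * hb_score


# T0465_LEE-SERVER_TS1 from the post above: TM-score 0.199,
# HB-score taken as 0.5 (an assumption, see the lead-in).
print(round(combined_score(0.199, 0.5), 3))
```

With a 10% weight, even a perfect HB-score can add at most 0.1, so a model with a near-random fold stays near the bottom of the ranking.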

I'm not sure I care for this recent fad of trying to use hydrogen bonds for model assessment.

It's such a comprehensively flawed concept that I'm amazed we are still discussing it - but here are some pertinent comments:

1. As someone has already pointed out, it is only useful for beta sheets - zero usefulness for all-alpha proteins. Even in beta sheets it's no use for simple beta meanders where the same hydrogen bond pattern can be observed across a wide range of sheet curvatures. Why use a method which can only be applied to a subset of protein fold types? The argument should really just finish there, but to continue...

2. Hydrogen bonding is a complex quantum mechanical phenomenon - any purely geometric definition of a hydrogen bond will be a crude approximation. Assuming we are not going to do semi-empirical quantum calculations, for example, which crude approximation of a hydrogen bond do we opt to use? The old distance-based DSSP definition? Baker and Hubbard? Dreiding/CHARMm potential? What cutoff do we set for the minimum energy permissible for a hydrogen bond? What about steric hindrance, bifurcation or competition with surrounding solvent in accessible areas of the model?

3. What's so special about hydrogen bonds anyway? Why not also look at the similarity of accessible atomic surface area and that way take the non-polar parts of the model into account? That could even be applied to all protein fold classes - not that I'm seriously recommending this criterion, I hasten to add!

4. The only reason these hydrogen bond evaluation schemes have any perceived value is that they encompass geometric information beyond the C-alpha trace. It's plainly daft to evaluate high resolution models on just C-alpha positions but why not just address that issue directly rather than adding the fuzziness of hydrogen bond definitions into the mix? Use main chain RMSDs or even all-atom RMSDs if you want more resolution than C-alphas can provide. A main chain atom RMSD of zero will by definition produce exactly the same main chain hydrogen bond list between two models (using simple geometric HB definitions at least). A C-alpha RMSD of zero will not necessarily produce the same main chain hydrogen bond list due to the inaccuracy inherent in building main chain coordinates from C-alpha traces.
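To make point 2 concrete, here is a minimal sketch of the DSSP-style criterion (Kabsch and Sander), which treats a backbone hydrogen bond as an electrostatic interaction between the C=O and N-H groups and calls it a bond when the energy falls below -0.5 kcal/mol. The function names and the example coordinates are made up for illustration; the idealized linear geometry is an assumption, not data from any model.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]

Q1_Q2 = 0.42 * 0.20    # partial charges (in units of e) on C=O and N-H
F = 332.0              # factor giving kcal/mol for distances in Angstroms
HBOND_CUTOFF = -0.5    # kcal/mol; the arbitrary threshold in question


def dssp_hbond_energy(c: Point, o: Point, n: Point, h: Point) -> float:
    """Electrostatic energy between a carbonyl (C, O) and an amide (N, H)."""
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1_Q2 * F * (1 / r_on + 1 / r_ch - 1 / r_oh - 1 / r_cn)


def is_hbond(c: Point, o: Point, n: Point, h: Point,
             cutoff: float = HBOND_CUTOFF) -> bool:
    """True if the energy is below the chosen cutoff."""
    return dssp_hbond_energy(c, o, n, h) < cutoff


# Idealized linear C=O ... H-N arrangement along the x axis
# (C=O 1.23 A, O...H 1.9 A, H-N 1.0 A): hypothetical coordinates.
c, o = (0.0, 0.0, 0.0), (1.23, 0.0, 0.0)
h, n = (3.13, 0.0, 0.0), (4.13, 0.0, 0.0)
print(round(dssp_hbond_energy(c, o, n, h), 2), is_hbond(c, o, n, h))
```

Moving the cutoff, or swapping in a different potential, changes which bonds are counted, which is exactly the arbitrariness being objected to.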

In my view we should be replacing GDT-HA with geometric definitions based on both main chain and side chain atom distances not mixtures of C-alpha metrics combined with arbitrary hydrogen bond definitions.

This would produce a score that gives some credit for basic alignment accuracy (the C-alpha components), some credit for main chain geometry (including main chain hydrogen bonds) and the last bit of credit for putting the side chain atoms in the right places (which will even include side chain hydrogen bonding). Of course the selection of terms and distance-cutoffs is something that could (and no doubt should) be tuned.
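A score of this general kind could be sketched as below. This is purely a hypothetical illustration, not the formula proposed in the post: the three terms, the equal weights, and the pairing of the 2/1/0.5 A cutoffs with C-alpha, main chain and side chain atoms are all assumptions, as are the function names. The per-atom distances are assumed to come from a single superposition of model onto target.

```python
from typing import Sequence, Tuple


def fraction_within(distances: Sequence[float], cutoff: float) -> float:
    """Fraction of atoms whose deviation is within `cutoff` Angstroms."""
    if not distances:
        return 0.0
    return sum(d <= cutoff for d in distances) / len(distances)


def composite_score(
    ca: Sequence[float],
    main_chain: Sequence[float],
    side_chain: Sequence[float],
    cutoffs: Tuple[float, float, float] = (2.0, 1.0, 0.5),   # assumed
    weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3),  # assumed
) -> float:
    """Weighted sum of three terms: C-alpha, main chain, side chain."""
    terms = (
        fraction_within(ca, cutoffs[0]),
        fraction_within(main_chain, cutoffs[1]),
        fraction_within(side_chain, cutoffs[2]),
    )
    return sum(w * t for w, t in zip(weights, terms))


# Hypothetical per-atom deviations in Angstroms.
score = composite_score(
    ca=[0.5, 1.5, 3.0],
    main_chain=[0.4, 0.9, 1.2, 2.5],
    side_chain=[0.3, 0.8],
)
print(round(score, 3))
```

Both the cutoffs and the weights are parameters here precisely because, as the post says, they would need tuning.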

Good idea. But the cutoffs of 2/1/0.5 are too small. Credit should also be given to models with an error of 3A-4A or even 5A for the TBM targets, because they are indeed different from an error of 6A or 8A. In your equation, errors in the region of 0.5A are over-counted. Whether you count C-alpha, main-chain or side-chain atoms, they are highly correlated.

I like Zhang's assessment by TM_score and HB_score (perhaps because it puts my server second, right behind Zhang's). It seems that Zhang fixed the problem in CASP7 of bad models built on good CA traces, if he is now doing best at the Hbonds.

I think HB_score should be used only for a "new category" of prediction, say "H-bond prediction", just like side-chain modeling, not for traditional 3D-structure prediction, which is what most groups focus on.

The CASP7 assessors ranked groups by HB and GDT. It's time for CASPs to set up a somewhat consistent criterion.

I disagree—I think we need new assessment methods that better distinguish good models from great models. GDT is a fine measure for the template-free models and for models that are not so great, but once the models start getting good (GDT > 85%, say) then ranking just based on the CA trace is sort of stupid. Getting the model right in traditional 3D modeling is the goal, and getting it right is not just getting the CA atoms in roughly the right places.

Correctness of hydrogen bonds is one measure that helps distinguish among good models. Other measures (all-atom RMSD, chi1 correctness, ... ) can also be applied. My former student, Firas Khatib, has come up with some new topological measures (based on slip knots) that are almost completely orthogonal to the GDT measure, but which usually distinguish experimental models from CASP models. They measure a property that none of us are getting right yet, but which is invisible to GDT. (Sorry, he hasn't made a program that can be used by any one but him yet—put pressure on David Baker, who has hired Firas as a postdoc.)