I knew that there was a 2007 PNAS paper discussing two proteins similar to T0498 and T0499. It is amazing that there are still four servers (from three different groups) that guessed T0498's fold correctly.

guest wrote: I knew that there was a 2007 PNAS paper discussing two proteins similar to T0498 and T0499. It is amazing that there are still four servers (from three different groups) that guessed T0498's fold correctly.

The structures of these two proteins are clearly shown in this paper, which provides plenty of information for modeling both proteins. This may be unfair to those predictors who did not read the paper. Therefore, these two targets should be removed from the final evaluation.

Guest wrote: The structures of these two proteins are clearly shown in this paper, which provides plenty of information for modeling both proteins. This may be unfair to those predictors who did not read the paper. Therefore, these two targets should be removed from the final evaluation.

Maybe the original poster knows something the rest of us do not, but there have been several other successful redesign experiments on Protein G - all of which involve small numbers of mutations. It was fairly obvious that T0498/T0499 were protein design pairs, but most of the work on this protein has been in designing domain-swapped multimeric variants of the protein. Given that the structures of T0498 and T0499 have not yet been released, as far as I know, we can't know for certain which particular fold variant of Protein G this is. Even with that paper in hand, guessing that T0498 folds like the first structure in the Alexander et al. paper may or may not be correct - it could just as easily be a domain-swapped variant of the wild type protein - or maybe something else entirely!

Nevertheless, I do think it would be more useful for these CASP "trick questions" (there have been others in previous CASPs) to be posed a little more clearly. I guess it's part of the challenge - but more people might have attempted a non-template-based prediction if it had been clearly stated that these proteins had been designed to have different folds, and everyone would have known to look at the literature more closely.

I think it is totally fine to use T0498 and T0499 as human targets, but it is not very fair to use them as server targets.

Guest wrote:

Guest wrote: The structures of these two proteins are clearly shown in this paper, which provides plenty of information for modeling both proteins. This may be unfair to those predictors who did not read the paper. Therefore, these two targets should be removed from the final evaluation.

Maybe the original poster knows something the rest of us do not, but there have been several other successful redesign experiments on Protein G - all of which involve small numbers of mutations. It was fairly obvious that T0498/T0499 were protein design pairs, but most of the work on this protein has been in designing domain-swapped multimeric variants of the protein. Given that the structures of T0498 and T0499 have not yet been released, as far as I know, we can't know for certain which particular fold variant of Protein G this is. Even with that paper in hand, guessing that T0498 folds like the first structure in the Alexander et al. paper may or may not be correct - it could just as easily be a domain-swapped variant of the wild type protein - or maybe something else entirely!

Nevertheless, I do think it would be more useful for these CASP "trick questions" (there have been others in previous CASPs) to be posed a little more clearly. I guess it's part of the challenge - but more people might have attempted a non-template-based prediction if it had been clearly stated that these proteins had been designed to have different folds, and everyone would have known to look at the literature more closely.

guest wrote: I think it is totally fine to use T0498 and T0499 as human targets, but it is not very fair to use them as server targets.

See your point, but in theory a really clever server could have selected the correct template (assuming the particular fold is in PDB) or worked out that the mutations were incompatible with the "obvious" fold. I suspect there are no current servers with that level of sophistication, but I see no harm in giving it a try. It'll be up to the assessors whether they include this data when assessing the server predictions - I suspect it won't matter much either way.

The structures of T0498/T0499 have actually been released (2jws/2jwu, with 2/4 residues different). The structure of T0498 is exactly the same as the first structure in Figure 2 of the paper and T0499 as the second.

At the current stage, I do not think (hope I am wrong) there is any algorithm that can fold T0498 correctly without reading the paper or having other information (given that there are a number of 'wrong' templates with 60% sequence identity to this target). It is fine to put them in CASP. But if the structural pictures were published before CASP and not all the predictors read the paper while making their predictions, it results in unfairness here. I suggest T0498 should be dropped, but T0499 may be kept because at this level of homology and accuracy, the picture does not help anyway.

Last year or earlier this year, I did an experiment on the two proteins described in the paper by running my threading program on them. It turned out that the two different folds both appeared among the top 20 templates. However, it is very difficult to pick the correct top-1 model, even when I used some energy functions or model quality assessment programs to help. T0498 and T0499 differ in only three positions, and my program just generated similar folds as their top-1 models.

Guest wrote: The structures of T0498/T0499 have actually been released (2jws/2jwu, with 2/4 residues different). The structure of T0498 is exactly the same as the first structure in Figure 2 of the paper and T0499 as the second.

At the current stage, I do not think (hope I am wrong) there is any algorithm that can fold T0498 correctly without reading the paper or having other information (given that there are a number of 'wrong' templates with 60% sequence identity to this target). It is fine to put them in CASP. But if the structural pictures were published before CASP and not all the predictors read the paper while making their predictions, it results in unfairness here. I suggest T0498 should be dropped, but T0499 may be kept because at this level of homology and accuracy, the picture does not help anyway.

Guest wrote: The structures of T0498/T0499 have actually been released (2jws/2jwu, with 2/4 residues different). The structure of T0498 is exactly the same as the first structure in Figure 2 of the paper and T0499 as the second.

Well, it's no surprise that they have the same structures as those seen in the paper - those ARE the structures from the paper! They are not T0498 and T0499, however. Maybe T0498 and T0499 do adopt those same structures, but as it only takes 2-4 residues to change the overall fold of these proteins, we can't say for sure until we see the structure solutions for the exact T0498 and T0499 sequences.