Abstract

Predicting functional sites in proteins is important in structural biology, both for understanding protein function and for structure-based drug design. Here we report a new binding site prediction method, PocketDepth, which is geometry based and uses depth-based clustering. Depth is an important parameter in protein structure visualisation and analysis, but it has been used more often intuitively than systematically. Our current implementation of depth reflects how central a given subspace is to a putative pocket. We tested the algorithm against PDBbind, a large curated set of 1091 proteins. A prediction was considered a true positive if the predicted pocket had at least 10% overlap with the actual ligand. Two different parameter sets, ‘deeper’ and ‘surface’, were used for wider coverage of the different types of binding sites in proteins. With the ‘deeper’ parameters, true positives were observed for 841 proteins, giving a prediction accuracy of 77% for any ranked prediction. Of these, 55.2% were first-ranked predictions, whereas 91.2% and 97.4% fell within the top 5 and top 10 ranks, respectively. With the ‘surface’ parameters, a prediction rate of 95.8% was observed, albeit with much poorer ranks. The ‘deeper’ set identified pocket boundaries more precisely and yielded better ranks, while the ‘surface’ set missed fewer predictions and hence had better coverage. The two parameter sets were therefore algorithmically combined, resulting in a prediction accuracy of 96.5% for any ranked prediction. About 41.8% of these were in the first rank, and 82% and 94% were in the top 5 and top 10 ranks, respectively.
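
The evaluation criterion described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, purely for illustration, that each predicted pocket and each ligand is represented as a set of grid-cell identifiers, and that "overlap" means the fraction of ligand cells falling inside the pocket; the abstract does not specify either. The function names `first_hit_rank` and `success_rate` are hypothetical.

```python
from typing import Optional, Sequence, Set


def first_hit_rank(
    ranked_pockets: Sequence[Set[int]],
    ligand_cells: Set[int],
    min_overlap: float = 0.10,
) -> Optional[int]:
    """Return the 1-based rank of the first pocket that covers at least
    `min_overlap` of the ligand, or None if no pocket qualifies.

    Assumption: overlap = |pocket ∩ ligand| / |ligand|, with both given
    as sets of grid-cell identifiers (a hypothetical representation).
    """
    if not ligand_cells:
        return None
    for rank, pocket in enumerate(ranked_pockets, start=1):
        overlap = len(pocket & ligand_cells) / len(ligand_cells)
        if overlap >= min_overlap:
            return rank
    return None


def success_rate(
    hit_ranks: Sequence[Optional[int]],
    top_k: Optional[int] = None,
) -> float:
    """Fraction of proteins with a true positive at any rank
    (top_k=None) or within the first `top_k` ranks."""
    if not hit_ranks:
        return 0.0
    hits = sum(
        1
        for r in hit_ranks
        if r is not None and (top_k is None or r <= top_k)
    )
    return hits / len(hit_ranks)


# Toy usage with made-up data: one protein, two ranked pockets.
pockets = [{1, 2, 3}, {10, 11, 12, 13}]
ligand = {11, 12, 20, 21}
rank = first_hit_rank(pockets, ligand)  # second pocket covers 2 of 4 ligand cells
```

With this convention, the 77% figure corresponds to 841 hits out of 1091 proteins at any rank. The per-rank percentages in the abstract are stated relative to the true positives ("of these"), so reproducing them would use the number of hits, rather than the number of proteins, as the denominator.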