Ligand binding site superposition and comparison based on Atomic Property Fields: identification of distant homologues, convergent evolution and PDB-wide clustering of binding sites.

Abstract

A new binding site comparison algorithm based on optimal superposition of continuous pharmacophoric property distributions is reported. The method demonstrates high sensitivity in detecting both distantly homologous and convergent binding sites. High-quality superposition is also observed in multiple examples. Using the new approach, a measure of site similarity is derived and applied to the clustering of ligand binding pockets in the PDB.

Example of tight superposition of similar ligands resulting from APF superposition of their binding sites: thiamine diphosphate in the binding sites of pyruvate dehydrogenase (1rp7, magenta) and pyruvate decarboxylase (1pvd, green). The resulting RMSD for the ligand in this example is only 0.52 Å. At the same time, even the superimposable segments of the receptors' secondary structure (transparent ribbons) experience much larger displacements. Sequence identity between the two proteins is 19.2%.
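The ligand RMSD quoted in this caption is the root-mean-square distance between equivalent ligand atoms after the binding sites have been superimposed. The following is a minimal sketch of that calculation, not the authors' implementation: the function name `rmsd` and the toy coordinates are hypothetical, and the atoms are assumed to be listed in the same order in both structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the two equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates in angstroms; NOT taken from 1rp7 or 1pvd.
ligand_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
ligand_b = [(0.3, 0.0, 0.0), (1.5, 0.4, 0.0), (1.5, 1.5, 0.0)]
print(round(rmsd(ligand_a, ligand_b), 3))
```

No fitting step is performed here: the coordinates are compared in the frame produced by the site superposition, so the value reflects how well the site alignment carries one ligand onto the other.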

(a) APF clustering sub-tree containing aspartic proteases. Branches containing only multiple structures of the same protein are collapsed and the number of structures is indicated in square brackets. (b) Superposition of HIV protease and endothiapepsin. Closeup of the binding site reveals correct superposition of the catalytic aspartic acid pair. (c,d) Comparison of binding site pockets in two renin structures (1bil, green, and 2bks, blue), (c); and in chymosin (1czi, magenta) versus renin (1bil, green), (d). Due to alternative side-chain conformations and some backbone movement, very different binding pockets are seen in the two renin structures. The pockets in the chymosin/renin pair overlay much better, which explains why in the clustering tree 1bil and 1czi are adjacent while 2bks is on a relatively remote branch. Pocket blobs were generated using icmPocketFinder[13] and visualized in ICM.
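A clustering tree like the one in panel (a) can be built from pairwise site distances by agglomerative clustering. The sketch below uses average linkage as an assumption, since the linkage rule is not stated here; the function `average_linkage`, the site labels and the distance values are all hypothetical and serve only to show why two sites with a small mutual distance end up on adjacent branches.

```python
def average_linkage(labels, dist):
    """Agglomerative clustering with average linkage (an assumed linkage rule).

    dist maps frozenset({a, b}) to the distance between sites a and b.
    Returns the merges in order as (cluster_1, cluster_2, distance).
    """
    clusters = [(label,) for label in labels]
    merges = []

    def cluster_distance(c1, c2):
        # Mean of all pairwise distances between members of the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical distances between three made-up sites (not real APF values).
toy_dist = {
    frozenset(("site_A", "site_B")): 0.2,
    frozenset(("site_A", "site_C")): 0.7,
    frozenset(("site_B", "site_C")): 0.9,
}
merges = average_linkage(["site_A", "site_B", "site_C"], toy_dist)
for left, right, d in merges:
    print(left, right, round(d, 3))
```

In this toy input, site_A and site_B merge first and site_C joins later at a larger distance, which mirrors the situation described in the caption where two similar pockets are adjacent and a dissimilar one sits on a remote branch.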

Examples of APF superposition of distantly homologous binding sites: (a) tryptophanyl-tRNA synthetase (1i6m, magenta) and pantothenate synthetase (1n2g, green). Despite divergent functions, the substrates of both enzymes contain an adenosyl moiety recognized by relatively conserved motifs. Also of note is the functional mimicry of certain side chains belonging to different segments of the structure, such as K192 in 1i6m playing the role of K160 in 1n2g, both providing a hydrogen bond to the same nitrogen in the adenyl moiety. Overall sequence identity of the two enzymes is 18%. (b) NAD binding site in UDP-galactose 4-epimerase (1ek5, green) and FAD binding site in D-amino acid oxidase (1ve9, magenta). The two enzymes share similar Rossmann fold sub-domains that bind the adenosyl moiety, while their other sub-domains are very different. Parts of the well-superimposed β-α-β-α-β structure can be seen at the bottom of the figure (transparent ribbons).

Examples of convergent binding sites on apparently unrelated enzymes, tightly superimposed by the APF method: (a) GDP bound to GDP-mannose mannosyl hydrolase (1rya, magenta) and to calcium-dependent endoplasmic reticulum nucleoside diphosphatase (1s1d, green). The residues coordinating the guanine moiety in 1rya (the side chains of K161, W163 and Y237 and the backbone of T164) align well in space with R52, F3 and F9 and the backbone of L4 in 1s1d, and play the same role. 1rya belongs to the NUDIX hydrolase superfamily and the alpha and beta fold class, while 1s1d is classified as an apyrase and a 5-bladed beta-propeller, according to CDD[27] and SCOP[28]. (b) Binding sites of concanavalin A (1cjp, magenta) and agglutinin (1jot, green). Despite the lack of any overall homology, the two proteins bind the central sugar moieties of their ligands (glucose in the concanavalin A complex and galactose in the agglutinin complex) in a remarkably similar manner: beta-hairpins G98-L99-Y100 (concanavalin A) and G121-Y122-W123 (agglutinin) coordinate the O5’ and O6 atoms via backbone hydrogen bonds; Y12 (concanavalin A) and Y78 (agglutinin) engage aliphatic carbons on the opposite face of the sugar ring in hydrophobic interactions; D208 and D125 coordinate the hydrogens of the O4 and O6 hydroxyl groups. Parts of the ligands other than the central sugar moiety are shown in wire representation for clarity.