The Delaunay tessellation of a protein allows the identification of all sets of four nearest-neighbor residues (quadruplets),
where each residue is represented by the position of its center of mass (determined from the 3D coordinates extracted from
a PDB structure file). Delaunay tessellations represent an objective and unambiguous definition of nearest residue neighbors
and provide a framework for calculating empirical potentials.
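
The tessellation step can be sketched as follows. This is a minimal illustration, not the pipeline used in the cited studies: it assumes NumPy and SciPy are available, the function names center_of_mass and delaunay_quadruplets are hypothetical, and the coordinates are toy values rather than data from a PDB file.

```python
import numpy as np
from scipy.spatial import Delaunay


def center_of_mass(coords, masses):
    """Mass-weighted mean position of one residue's atoms."""
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    return (coords * masses[:, None]).sum(axis=0) / masses.sum()


def delaunay_quadruplets(centers):
    """Return one sorted 4-tuple of residue indices per Delaunay simplex.

    `centers` holds one 3D point per residue, in sequence order.
    """
    tess = Delaunay(np.asarray(centers, dtype=float))
    # In 3D every simplex is a tetrahedron, i.e. four mutually neighboring points.
    return sorted({tuple(sorted(int(i) for i in simplex))
                   for simplex in tess.simplices})


# Toy input (assumption): four "residues" at the corners of a tetrahedron
# and a fifth one in its interior.
centers = [
    (0.0, 0.0, 0.0),
    (4.0, 0.0, 0.0),
    (0.0, 4.0, 0.0),
    (0.0, 0.0, 4.0),
    (1.0, 1.0, 1.0),
]
quads = delaunay_quadruplets(centers)
```

Each tuple in quads lists the sequence positions of four residues that share a tetrahedron, which is the quadruplet definition used in the text.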

The amino acid composition of quadruplets is evaluated to analyze which clusters of four amino acids tend to be close together
in folded protein structures. The log-likelihoods of the quadruplets derived from a Delaunay tessellation were calculated from a
training set of 1417 protein structures solved by X-ray crystallography (extracted from the PDB), with low sequence homology
and high resolution. Previous analyses of the distribution of
these log-likelihood scores showed that it is non-random. Quadruplets with the highest log-likelihoods contain cysteines, which are
structurally important because they form disulfide bridges and are involved in metal-binding motifs [Vaisman II, 1998].
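
The text does not spell out how the log-likelihoods are computed. The sketch below assumes the form commonly used for four-body statistical potentials, q = log10(f_observed / p_expected), where p_expected is the product of the four residue frequencies multiplied by the number of distinct orderings of the quadruplet. That formula, the function name log_likelihood_table, and the toy counts are assumptions made for illustration only.

```python
import math
from collections import Counter


def log_likelihood_table(observed_quads, residue_freq):
    """Map each quadruplet composition to log10(observed / expected).

    observed_quads: iterable of 4-letter strings, one per tetrahedron,
                    pooled over the training set.
    residue_freq:   dict of amino acid letter -> frequency in the training set.
    """
    # Order within a quadruplet is irrelevant, so sort the letters.
    counts = Counter("".join(sorted(q)) for q in observed_quads)
    total = sum(counts.values())
    table = {}
    for key, n in counts.items():
        # Multinomial coefficient: 4! / (t_1! t_2! ...), where t_i is the
        # number of times residue type i occurs in the quadruplet.
        orderings = math.factorial(4)
        for t in Counter(key).values():
            orderings //= math.factorial(t)
        expected = orderings * math.prod(residue_freq[a] for a in key)
        table[key] = math.log10((n / total) / expected)
    return table


# Toy two-letter alphabet (assumption), not real training data.
freq = {"A": 0.5, "C": 0.5}
table = log_likelihood_table(["CCCC", "CCCC", "AACC", "ACAC"], freq)
```

In this toy set the all-cysteine composition is observed far more often than its residue frequencies predict, so it receives the larger score.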

Potential scores for each residue are calculated by summing up the log-likelihoods of the quadruplets in which that residue
participates, while potential scores for the entire protein are computed by taking the sum of the log-likelihoods of all
quadruplets found in the protein. These scores were used for the identification of tertiary packing motifs and functional
signature motifs common to structures belonging to the same protein family [Tropsha, et al., 2003]. Furthermore, the distribution
of the residue potential scores indicates that low scores are associated with surface residues, which have fewer structural
neighbors, and high scores are associated with residues in the hydrophobic core, which are structurally important for maintaining
the conformation of a protein [Carter, et al., 2001; Masso and Vaisman, 2003]. Potential score profiles for a protein are
also derived by constructing a vector with N elements, where N is the number of amino acids in a given protein. Each element in
the vector represents the potential score of the corresponding residue.
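
The two summations described above can be sketched as follows. The function name potential_profile, the toy sequence, and the lookup-table values are hypothetical, and treating compositions missing from the table as contributing zero is an assumption not taken from the text.

```python
def potential_profile(sequence, quads, table):
    """Return (profile, total) for one protein.

    sequence: one-letter amino acid string of length N.
    quads:    4-tuples of residue indices from the tessellation.
    table:    dict of sorted 4-letter composition -> log-likelihood.
    """
    profile = [0.0] * len(sequence)
    total = 0.0
    for quad in quads:
        key = "".join(sorted(sequence[i] for i in quad))
        q = table.get(key, 0.0)  # assumption: unseen compositions score 0
        total += q               # protein score: sum over all quadruplets
        for i in quad:
            profile[i] += q      # residue score: sum over its quadruplets
    return profile, total


# Toy example (assumption): five residues, two tetrahedra.
sequence = "ACDEF"
quads = [(0, 1, 2, 3), (1, 2, 3, 4)]
table = {"ACDE": 1.0, "CDEF": -0.5}
profile, total = potential_profile(sequence, quads, table)
```

Here profile is the N-element potential score profile and total is the score for the whole protein.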

To simulate the introduction of a missense mutation, the amino acid letter code is modified at the appropriate residue position,
which changes the amino acid composition of the quadruplets in which it participates. The potential score for the mutant is
calculated using the Delaunay tessellation of the wild-type structure with the modified primary sequence. Potential score
differences, called Residual Scores (RS), are then calculated, for each residue or for the entire protein, by subtracting
the potential score of the mutant from that of the wild type. RS has been shown to correlate well with the stability of mutants
[Carter, et al., 2001] and has also been successfully applied to classify non-synonymous SNPs as disease-associated
or not [Barenboim, et al., 2005]. Similar to potential profiles, Residual Score Profiles (RSP)
are vectors of N elements (N being the number of amino acids in the protein), each representing the residual score of the
corresponding residue in the protein under study.
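
A minimal sketch of the mutation procedure, following the sign convention stated above (wild type minus mutant). The helper names, the toy sequence, and the table values are hypothetical, and compositions missing from the table are assumed to contribute zero.

```python
def score_profile(sequence, quads, table):
    """Per-residue and whole-protein sums of quadruplet log-likelihoods."""
    profile = [0.0] * len(sequence)
    total = 0.0
    for quad in quads:
        key = "".join(sorted(sequence[i] for i in quad))
        q = table.get(key, 0.0)  # assumption: unseen compositions score 0
        total += q
        for i in quad:
            profile[i] += q
    return profile, total


def residual_scores(sequence, quads, table, position, new_residue):
    """Return (RSP, RS) for a single substitution.

    The wild-type tessellation (quads) is reused unchanged; only the
    letter at `position` (0-based) is replaced.
    """
    mutant = sequence[:position] + new_residue + sequence[position + 1:]
    wt_profile, wt_total = score_profile(sequence, quads, table)
    mt_profile, mt_total = score_profile(mutant, quads, table)
    rsp = [w - m for w, m in zip(wt_profile, mt_profile)]
    return rsp, wt_total - mt_total


# Toy example (assumption): substitute D with G at index 2.
sequence = "ACDEF"
quads = [(0, 1, 2, 3), (1, 2, 3, 4)]
table = {"ACDE": 1.0, "CDEF": -0.5, "ACEG": 0.25, "CEFG": 0.5}
rsp, rs = residual_scores(sequence, quads, table, 2, "G")
```

Residues that share a quadruplet with the mutated position also change score, so the profile can be non-zero at positions other than the substitution site.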