Wikiomics:Protein volume

From OpenWetWare

Because a protein structure can be represented in several ways, its volume can also be defined in several ways, depending on the intended use. Every definition requires an envelope derived from the 3D atomic structure of the protein.

Definitions

SAS and SES

The solvent-accessible surface (SAS) and the solvent-excluded surface (SES) are related envelopes that can be defined for a set of spheres of various radii. These spheres usually represent the non-hydrogen atoms of the protein. Water, ions and ligands present in the structure can be ignored, provided they are considered part of the solvent.

SAS [1] and SES [2, 3, 4] are both defined by all the possible contacts between a spherical probe of a given radius and the set of spheres that represent the protein atoms. The probe is usually given a radius of 1.4 angstroms and is meant to approximate a water molecule rolling over the protein. The SES delimits the volume that cannot be reached by any part of the probe, while the SAS is the set of positions of the center of the probe. The SES is the surface most commonly used in molecular graphics, since it conveys the shape complementarity between the protein and other molecules well. The SAS is equivalent to a van der Waals surface in which the van der Waals radius of each atom has been increased by 1.4 angstroms, or by whatever the radius of the probe is.

Other names for the SES include molecular surface and Connolly surface, the latter named after Michael Connolly, who first proposed an algorithm [3] to compute it.
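The view of the SAS as a van der Waals surface with enlarged radii lends itself to a simple numerical estimate of its area. The following is a minimal sketch of a Shrake-Rupley-style point-sampling approach, which is not an algorithm taken from the references above. The function names, the number of sample points and the example radii are assumptions made for illustration.

```python
import math

def sphere_points(n):
    """Return n roughly evenly spaced unit vectors (golden-spiral layout)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        rho = math.sqrt(max(0.0, 1.0 - z * z))
        phi = k * golden_angle
        points.append((rho * math.cos(phi), rho * math.sin(phi), z))
    return points

def sas_area(atoms, probe=1.4, n_points=960):
    """Estimate the total SAS area (square angstroms).

    atoms: list of (x, y, z, vdw_radius) tuples, in angstroms.
    Each atom sphere is enlarged by the probe radius; a sample point on
    an enlarged sphere counts as exposed if it lies inside no other
    enlarged sphere.
    """
    unit = sphere_points(n_points)
    total = 0.0
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_ri = ri + probe
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_ri * ux, yi + big_ri * uy, zi + big_ri * uz
            buried = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                big_rj = rj + probe
                d2 = (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2
                if d2 < big_rj * big_rj:
                    buried = True
                    break
            if not buried:
                exposed += 1
        total += 4.0 * math.pi * big_ri * big_ri * exposed / n_points
    return total

# Two hypothetical carbon-like atoms (radius 1.7) placed 1.5 angstroms apart.
pair = [(0.0, 0.0, 0.0, 1.7), (0.0, 0.0, 1.5, 1.7)]
print(round(sas_area(pair), 1))
```

This brute-force version compares every sample point with every atom, so it is only practical for small inputs. A real implementation would use a neighbor list or a spatial grid.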

Contact surface

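The contact surface should not be confused with the SES (molecular surface). The contact surface is the part of the SES that touches the atoms directly, that is, the part that coincides with the van der Waals surface of the atoms. It is therefore not continuous and does not enclose a volume: there is no such thing as a "contact volume".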

Comparison of SES, SAS and contact surface

The volume enclosed by the SAS is larger than the volume enclosed by the SES.

The contact surface is a subset of the SES, so its area cannot exceed the area of the SES.
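As a sanity check of the first relation, consider the simplest possible case: a single isolated atom of van der Waals radius r, probed with a sphere of radius p. The values r = 1.7 angstroms (a typical carbon radius) and p = 1.4 angstroms are illustrative assumptions. The SES then coincides with the sphere of the atom itself, while the SAS is a concentric sphere of radius r + p:

```latex
\begin{align*}
V_{\mathrm{SES}} &= \tfrac{4}{3}\pi r^{3} = \tfrac{4}{3}\pi (1.7)^{3} \approx 20.6\ \text{\AA}^{3} \\
V_{\mathrm{SAS}} &= \tfrac{4}{3}\pi (r+p)^{3} = \tfrac{4}{3}\pi (3.1)^{3} \approx 124.8\ \text{\AA}^{3}
\end{align*}
```

For a whole protein the relative difference is much smaller, because the extra shell of thickness p only surrounds the outer layer of atoms, but the ordering is the same.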

Cavities

Some proteins contain internal volume that is not filled by protein atoms. Whether this comes from a problem in the protein model or from a genuine pocket filled with water, the surface detection algorithm may take the surface of such a cavity into account. This may or may not be wanted, depending on the situation. It is therefore important to know how a given program actually treats cavities.

Tessellations

A tessellation consists of dividing space into cells, with each cell containing one input point. Voronoi tessellations [5] and their variants allow a "fair" assignment of space around points. The volume of these cells can be used to describe how much volume each atom occupies.

However, a tessellation necessarily produces some infinite cells, and some others may be very large. It is usable, though, if the solvent around the protein model is fully represented. In this case, every protein atom sits in a meaningful environment, and the sum of the volumes of the protein atoms' cells gives an elegant estimate of the protein volume (see for example [6]). The bad news is that experimental structures usually don't come with a layer of water molecules that fully immerses the protein. Water molecules must therefore be simulated, which is somewhat less reliable.

Without such a solvent layer, this kind of global volume computation is meaningless, because the cells of the atoms at the surface of the protein are infinite or arbitrarily large.
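The problem of unbounded cells can be illustrated with a small numerical experiment. The sketch below is a grid-based approximation and not a true Voronoi construction: it assigns each voxel of a cubic box to its nearest input point and adds up the voxel volumes. The function name, the box sizes and the seven-point test arrangement are assumptions chosen for illustration.

```python
def grid_voronoi_volumes(points, half_width, step):
    """Approximate Voronoi cell volumes inside a cubic box.

    Voxel centers in [-half_width, half_width]^3 are each assigned to
    the nearest input point (ties go to the lowest index). The result
    is the list of accumulated voxel volumes, one per input point.
    """
    n = int(round(2.0 * half_width / step))
    voxel_volume = step ** 3
    volumes = [0.0] * len(points)
    for i in range(n):
        x = -half_width + (i + 0.5) * step
        for j in range(n):
            y = -half_width + (j + 0.5) * step
            for k in range(n):
                z = -half_width + (k + 0.5) * step
                best, best_d2 = 0, float("inf")
                for idx, (px, py, pz) in enumerate(points):
                    d2 = (x - px) ** 2 + (y - py) ** 2 + (z - pz) ** 2
                    if d2 < best_d2:
                        best, best_d2 = idx, d2
                volumes[best] += voxel_volume
    return volumes

# Hypothetical arrangement: one central point with six neighbors
# placed 2 angstroms away along the coordinate axes.
points = [(0.0, 0.0, 0.0),
          (2.0, 0.0, 0.0), (-2.0, 0.0, 0.0),
          (0.0, 2.0, 0.0), (0.0, -2.0, 0.0),
          (0.0, 0.0, 2.0), (0.0, 0.0, -2.0)]

small_box = grid_voronoi_volumes(points, half_width=4.0, step=0.25)
large_box = grid_voronoi_volumes(points, half_width=6.0, step=0.25)
print("central cell:", small_box[0], large_box[0])
print("outer cell:  ", small_box[1], large_box[1])
```

The central point is fully surrounded, so its cell is the cube of side 2 angstroms (volume 8 cubic angstroms) whatever the box size. The cells of the six outer points simply grow with the box, which is the discrete counterpart of an infinite Voronoi cell.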

Tools

Note: algorithms that compute SAS areas and volumes are much simpler than those for the SES. However, the SES is what most people actually want (see the definitions above).

Libraries

Web applications

Standalone programs

MSMS is a command-line executable which performs pure geometric computations of molecular surfaces (SES). It returns area, volume, and full triangulation if wanted. The position and radius of each atom must be specified.
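Since the position and radius of each atom must be supplied, some preprocessing is usually needed. The sketch below assumes the plain-text layout commonly used with MSMS, in which each line holds the x, y and z coordinates of an atom followed by its radius (often stored in a file with the extension .xyzr). The radius table, the helper name and the coordinates are illustrative assumptions and are not taken from MSMS; the exact input format and command-line options should be checked against the MSMS documentation.

```python
# Illustrative van der Waals radii in angstroms (assumed values,
# not the radii shipped with MSMS).
ILLUSTRATIVE_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}

def to_xyzr(atoms, radii=ILLUSTRATIVE_RADII, default_radius=1.70):
    """Format atoms as 'x y z r' lines, one atom per line.

    atoms: iterable of (element, x, y, z) tuples, in angstroms.
    Elements missing from the radius table get default_radius.
    """
    lines = []
    for element, x, y, z in atoms:
        r = radii.get(element, default_radius)
        lines.append(f"{x:.3f} {y:.3f} {z:.3f} {r:.2f}")
    return "\n".join(lines) + "\n"

# Hypothetical coordinates, for illustration only.
atoms = [("N", 0.0, 0.0, 0.0), ("C", 1.458, 0.0, 0.0), ("O", 2.4, 1.1, 0.0)]
print(to_xyzr(atoms), end="")
```

The returned text can be written to a file and passed to the program. Hydrogen atoms, water, ions and ligands would typically be filtered out beforehand if they are to be treated as solvent.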