
Function

Draw a White-Wimley protein hydropathy plot

Description

octanol draws a hydropathy plot for an input protein sequence. For windows over the protein sequence, it plots the calculated free energy difference between the residues in water and in two lipid environments:

i. Octanol (equivalent to inside a lipid bilayer).

ii. The interface of a synthetic lipid bilayer.

Free energy differences are calculated for each position in a window of 19 residues by default, about the size of a membrane-spanning alpha-helix. The energy values for the residues are summed to give two values for each window, one per environment.

By default, the value plotted is the free energy difference between the interface and octanol environments, which is the best indicator of the location of probable transmembrane regions. Command line options allow the display of the octanol and interface values, or hiding the difference values.

The experimental free energy values for the water-interface and water-octanol transitions are read from a data file (Ewhite-wimley.dat).
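The window calculation described above can be sketched in Python. This is a minimal illustration, not octanol's source code: the function names are hypothetical, the two scales are toy numbers rather than the values in Ewhite-wimley.dat, and the sign convention (interface sum minus octanol sum) is an assumption.

```python
from typing import Dict, List, Tuple

# Toy per-residue free energies for four residues only.
# These are made-up numbers, NOT the values in Ewhite-wimley.dat.
TOY_INTERFACE: Dict[str, float] = {"A": 0.25, "L": -0.5, "K": 1.0, "W": -2.0}
TOY_OCTANOL: Dict[str, float] = {"A": 0.5, "L": -1.25, "K": 2.75, "W": -2.0}


def window_sums(seq: str, scale: Dict[str, float], width: int = 19) -> List[float]:
    """Sum the per-residue values over every full window of `width` residues."""
    if width < 1:
        raise ValueError("width must be at least 1")
    values = [scale[residue] for residue in seq]
    # One sum per window start; empty if the sequence is shorter than the window.
    return [sum(values[i:i + width]) for i in range(len(values) - width + 1)]


def profiles(
    seq: str,
    interface: Dict[str, float],
    octanol: Dict[str, float],
    width: int = 19,
) -> Tuple[List[float], List[float], List[float]]:
    """Return the interface sums, the octanol sums and their difference."""
    iface = window_sums(seq, interface, width)
    octa = window_sums(seq, octanol, width)
    # Assumed convention: difference = interface sum - octanol sum.
    diff = [i - o for i, o in zip(iface, octa)]
    return iface, octa, diff


iface, octa, diff = profiles("ALKW", TOY_INTERFACE, TOY_OCTANOL, width=2)
print(diff)
```

A window of 2 is used here only to keep the toy example short; the default of 19 matches the window size described above.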

Qualifier  Type     Description                                              Allowed values        Default
---------  -------  -------------------------------------------------------  --------------------  -------
                    Read first file from standard input, write first file    Boolean value Yes/No  N
                    to standard output
-options   boolean  Prompt for standard and additional values                Boolean value Yes/No  N
-debug     boolean  Write debug output to program.dbg                        Boolean value Yes/No  N
-verbose   boolean  Report some/full command line options                    Boolean value Yes/No  Y
-help      boolean  Report command line options and exit. More information   Boolean value Yes/No  N
                    on associated and general qualifiers can be found with
                    -help -verbose
-warning   boolean  Report warnings                                          Boolean value Yes/No  Y
-error     boolean  Report errors                                            Boolean value Yes/No  Y
-fatal     boolean  Report fatal errors                                      Boolean value Yes/No  Y
-die       boolean  Report dying program messages                            Boolean value Yes/No  Y
-version   boolean  Report version number and exit                           Boolean value Yes/No  N

Input file format

octanol reads a single protein sequence.

The input is a standard EMBOSS sequence query (also known as a 'USA').

Major sequence database sources defined as standard in EMBOSS
installations include srs:embl, srs:uniprot and ensembl.

Data can also be read from sequence output in any supported format
written by an EMBOSS or third-party application.

The input format can be specified by using the
command-line qualifier -sformat xxx, where 'xxx' is replaced
by the name of the required format. The available format names are:
gff (gff3), gff2, embl (em), genbank (gb, refseq), ddbj, refseqp, pir
(nbrf), swissprot (swiss, sw), dasgff and debug.

Output file format

The output is to the specified graphics device.

The results can be output in one of several formats by using the
command-line qualifier -graph xxx, where 'xxx' is replaced by
the name of the required device. Support depends on the availability
of third-party software packages.
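As an illustration of how the -sformat and -graph qualifiers fit together, the sketch below assembles an octanol command line as a Python argument list. The helper name is hypothetical, and passing the input through a -sequence qualifier is an assumption not stated on this page; only -sformat and -graph are taken from the text above.

```python
from typing import List, Optional


def octanol_command(usa: str, graph: str, sformat: Optional[str] = None) -> List[str]:
    """Build an argument list for running octanol (hypothetical helper).

    usa     -- the sequence query (USA) to read
    graph   -- graphics device name for -graph
    sformat -- optional input format name for -sformat
    """
    # Assumption: the input sequence is given with a -sequence qualifier.
    cmd = ["octanol", "-sequence", usa, "-graph", graph]
    if sformat is not None:
        cmd += ["-sformat", sformat]
    return cmd


# "opsin.sw" is a placeholder file name for illustration.
cmd = octanol_command("opsin.sw", graph="ps", sformat="swissprot")
print(" ".join(cmd))

# To run it, EMBOSS must be installed and on the PATH:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building the list separately from running it keeps the example usable on machines without EMBOSS installed.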

octanol draws a graph showing the
free energy calculated over a sliding window.

Output files for usage example

Graphics File: octanol.ps

The line on the default plot is the difference between the interface
and octanol free energy calculations. Command line options allow the
display of the interface and octanol values, or hiding the difference
values.

In the example, the human opsin protein has 7 transmembrane
regions: 37-61, 74-98, 114-133, 153-176, 203-230, 253-276 and 285-309.
Each is 20 to 28 residues long, roughly the gap between tick
marks on the sequence axis. All have energetic preferences for being
in the lipid (octanol) environment, shown as being above the zero
line, or have at least no clear preference.
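The region lengths can be checked with a few lines of Python. This sketch assumes the coordinates quoted above are 1-based and inclusive at both ends.

```python
# Transmembrane regions of human opsin as listed in the text
# (assumed 1-based, inclusive start and end positions).
regions = [(37, 61), (74, 98), (114, 133), (153, 176),
           (203, 230), (253, 276), (285, 309)]

# Inclusive length of each region.
lengths = [end - start + 1 for start, end in regions]

print(lengths)                     # [25, 25, 20, 24, 28, 24, 25]
print(min(lengths), max(lengths))  # 20 28
```

Under that assumption, the regions are all close to the default 19-residue window, which is why a transmembrane helix shows up as a single broad feature in the plot.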

For those regions where the difference plot is close to zero, both of
the other two plots are above the line, showing a preference for either
the octanol or the interface membrane environment rather than water.

Data files

EMBOSS data files are distributed with the application and stored
in the standard EMBOSS data directory, which is defined
by the EMBOSS environment variable EMBOSS_DATA.

To see the available EMBOSS data files, run:

% embossdata -showall

To fetch one of the data files (for example 'Exxx.dat') into your
current directory for you to inspect or modify, run:

% embossdata -fetch -file Exxx.dat

Users can provide their own data files in their own directories.
Project-specific files can be put in the current directory or, for
tidier directory listings, in a subdirectory called
".embossdata". Files for all EMBOSS runs can be put in the user's home
directory, or again in a subdirectory called ".embossdata".

The directories are searched in the following order:

. (your current directory)

.embossdata (under your current directory)

~/ (your home directory)

~/.embossdata
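The search order above can be expressed as a short Python sketch. The function name is hypothetical and this is not how EMBOSS itself is implemented; it only mirrors the four user directories listed, and returns nothing where EMBOSS would fall back to the standard data directory.

```python
from pathlib import Path
from typing import Optional


def find_data_file(name: str, cwd: Path, home: Path) -> Optional[Path]:
    """Return the first copy of `name` found in the documented search order.

    cwd and home are passed in explicitly so the lookup is easy to test;
    in normal use they would be Path.cwd() and Path.home().
    """
    search_order = [
        cwd,                   # . (your current directory)
        cwd / ".embossdata",   # .embossdata (under your current directory)
        home,                  # ~/ (your home directory)
        home / ".embossdata",  # ~/.embossdata
    ]
    for directory in search_order:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    # Not found in any user directory.
    return None


print(find_data_file("Ewhite-wimley.dat", Path.cwd(), Path.home()))
```

A project-level copy therefore takes precedence over a copy in the home directory, which is what makes per-project overrides of a data file possible.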

Notes

Protein sequences that form transmembrane regions are assumed to have a thermodynamic preference for a hydrophobic environment (inside the membrane lipid bilayer), rather than an aqueous environment. The free energy change for each amino acid residue between a lipid and a water environment can be measured experimentally, and the values for peptides can be shown to be additive (White and Wimley 1999).

For each amino acid residue in the protein, the free energy difference of the residue in lipid and water environments is measured in two ways. The first is the free energy difference between the protein in water and the protein associated with the interface (glycerol group) of a POPC (palmitoyloleoylphosphocholine) bilayer. The second is the free energy difference of the protein in water and the protein in octanol, equivalent to the environment inside a lipid bilayer.

Residues which can be buried inside a lipid bilayer must be in a region of the peptide where most residues show a free energy difference in favour of being in an octanol environment or at least being in the lipid/water interface region. White and Wimley (1999) showed that a sliding window of either free energy difference will indicate the location of probable transmembrane regions, but that the best indicator is the difference between the two values, which is the free energy difference between the interface and octanol environments.
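octanol itself only draws the plot, but a windowed profile can also be scanned for candidate regions. The sketch below is one simple way to do that, assuming a profile in which values above zero indicate a preference for the lipid environment; the function name, the zero threshold and the minimum run length are illustrative choices, not part of octanol.

```python
from typing import List, Tuple


def runs_above(
    profile: List[float], threshold: float = 0.0, min_len: int = 1
) -> List[Tuple[int, int]]:
    """Return (start, end) pairs of consecutive windows above `threshold`.

    Indices are 0-based window indices, inclusive at both ends,
    not residue positions in the sequence.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(profile):
        if value > threshold:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the profile.
    if start is not None and len(profile) - start >= min_len:
        runs.append((start, len(profile) - 1))
    return runs


toy_profile = [-1.0, 0.5, 2.0, 0.0, -3.0, 1.0, 1.0, 1.0]
print(runs_above(toy_profile))             # [(1, 2), (5, 7)]
print(runs_above(toy_profile, min_len=3))  # [(5, 7)]
```

Mapping a run back to residue positions depends on whether each window value is assigned to the first or the central residue of the window, which this sketch leaves open.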