How do I Reference ProFit?

No paper has been published describing ProFit itself since it is simply
a convenient program (I hope) to let you use a standard fitting
algorithm; consequently, it is a little difficult to reference. The
exact wording is up to you and dependent on the context, but I suggest
something similar to:

Fitting was performed using the McLachlan algorithm (McLachlan, A.D.,
1982 ``Rapid Comparison of Protein Structures'', Acta Cryst A38,
871-873) as implemented in the program ProFit (Martin, A.C.R.,
http://www.bioinf.org.uk/software/profit/).

It is not possible to compare regions of different length! The
definition of an RMSD requires that there be a 1:1 equivalence between
atoms. A least-squares fitting program like ProFit minimizes the RMSD
between two structures. Therefore, you must somehow decide which
residues are equivalent to one another.
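
To see why the 1:1 equivalence matters, here is a minimal sketch of an
RMSD calculation in Python. It is not ProFit's code: it performs no
fitting and simply measures the deviation between two coordinate lists
as given. The function name rmsd and the tuple-based coordinate format
are assumptions made for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two lists of (x, y, z) tuples, paired by position.

    Hypothetical helper for illustration; no superposition is done.
    """
    if len(coords_a) != len(coords_b):
        # Without a 1:1 pairing of atoms the sum below is undefined.
        raise ValueError(
            "RMSD needs a 1:1 atom equivalence: got %d and %d atoms"
            % (len(coords_a), len(coords_b))
        )
    if not coords_a:
        raise ValueError("at least one atom pair is required")

    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two atoms of this pair.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Passing lists of different length raises an error rather than
silently dropping atoms, which mirrors the point above: the
equivalences have to be decided before an RMSD can be calculated.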

You can do this using multiple ZONE commands to specify which part
of one structure is equivalent to which part of the other
structure.

Alternatively, ProFit allows you to use a sequence alignment to
specify the equivalent zones. You can do this within ProFit using the
ALIGN command, or you can read a pre-calculated alignment using the
READALIGNMENT command.
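
The idea of turning an alignment into equivalent zones can be sketched
as follows. This is only an illustration of the principle, not how
ProFit does it internally: the function name alignment_to_zones is
hypothetical, gaps are assumed to be written as '-', and residues are
numbered sequentially from 1 rather than by their PDB residue numbers.

```python
def alignment_to_zones(aligned_a, aligned_b, gap="-"):
    """Return ungapped blocks of a pairwise alignment as index ranges.

    Each zone is ((start_a, end_a), (start_b, end_b)), using 1-based,
    inclusive, sequential residue positions (an assumption; real PDB
    numbering may differ).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    zones = []
    pos_a = pos_b = 0   # residues consumed so far in each sequence
    start = None        # start positions of the block being built

    for ch_a, ch_b in zip(aligned_a, aligned_b):
        in_a = ch_a != gap
        in_b = ch_b != gap
        if in_a:
            pos_a += 1
        if in_b:
            pos_b += 1

        if in_a and in_b:
            if start is None:
                start = (pos_a, pos_b)
        elif start is not None:
            # A gap closes the block at the previous column.
            end_a = pos_a - (1 if in_a else 0)
            end_b = pos_b - (1 if in_b else 0)
            zones.append(((start[0], end_a), (start[1], end_b)))
            start = None

    if start is not None:
        zones.append(((start[0], pos_a), (start[1], pos_b)))
    return zones
```

Each returned block has the same length in both sequences, so every
zone satisfies the 1:1 requirement for an RMSD.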

Ideally (and especially with more diverged sequences) you need to
perform a structural alignment (using a program such as SSAP)
rather than a sequence alignment to define the zones for least-squares
fitting. You could use such a program to generate a structure-based
sequence alignment and read that into ProFit.

However, ProFit also has the ability to work out the best
equivalences once you have given it a seed set. Typically you start
with a sequence alignment between the two structures to obtain a
starting point for equivalent pairs of atoms. You can then use the
ITERATE command to get ProFit to refine the equivalences, which it does
using a dynamic programming algorithm. Note that you may need to
experiment with the cutoff distance (given as a parameter to the
ITERATE command). We intend to optimize this, but as a general guide,
use a value that is 1-2 Angstroms larger than the RMSD that you obtain
from the initial fit based on the sequence alignment alone.
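
As a small worked example of that rule of thumb, the sketch below
returns the range of cutoff values worth trying for a given initial
RMSD. The function name suggested_cutoff_range is hypothetical and
the result is only a starting point for experimentation.

```python
def suggested_cutoff_range(initial_rmsd):
    """Return (low, high) cutoff distances in Angstroms.

    Applies the rule of thumb of 1-2 Angstroms above the RMSD from
    the initial sequence-alignment-based fit. Hypothetical helper.
    """
    if initial_rmsd < 0:
        raise ValueError("an RMSD cannot be negative")
    return (initial_rmsd + 1.0, initial_rmsd + 2.0)
```

For an initial RMSD of 2.0 Angstroms this suggests trying cutoffs
between 3.0 and 4.0 Angstroms.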

One of the most common questions I get about ProFit is something
along the lines of 'I have lots of pairs of proteins I need to fit or
one protein that needs to be fitted to lots of others. How can I get
ProFit to process all of these?'

As of ProFit V3.0, there are two ways to do this.

The first method is the SCRIPT command, introduced in
ProFit V3.0, which allows you to read in and execute a script.

The second method has always been possible and relies on just using
Unix-style redirection. Just place all the commands you wish to run in
a text file and then run ProFit, redirecting standard input to this
file.

For example, suppose you wanted to use a.pdb as a reference
structure and wanted to fit it with b.pdb, c.pdb, d.pdb, e.pdb
and f.pdb. Just create a text file as follows:
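
When there are many structures, a command file of this kind can also
be generated by a short script. The Python sketch below is one way to
do so. It assumes that the ProFit commands REFERENCE, MOBILE and FIT
behave as described in the manual, so please check the command names
against your ProFit version. The function name build_fit_script is
hypothetical.

```python
def build_fit_script(reference, mobiles):
    """Build the text of a ProFit command file.

    Assumes (check the manual) that REFERENCE loads the reference
    structure, MOBILE loads a structure to be fitted and FIT performs
    the fit. Hypothetical helper for illustration.
    """
    lines = ["REFERENCE %s" % reference]
    for mobile in mobiles:
        lines.append("MOBILE %s" % mobile)
        lines.append("FIT")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the commands; save them to a text file to use them.
    mobiles = ["b.pdb", "c.pdb", "d.pdb", "e.pdb", "f.pdb"]
    print(build_fit_script("a.pdb", mobiles), end="")
```

Save the printed commands in a text file and run ProFit with standard
input redirected to that file.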

As of ProFit V3.1, Windows is officially supported. We now provide
a precompiled Windows version, and the source code should compile
cleanly with Windows compilers (we use MinGW, the 'Minimalist GNU for
Windows' environment).

Under Windows, ProFit has no graphical interface: you must still
learn to use the commands as described in the manual. With Windows XP
(and maybe earlier versions), double-clicking the ProFit icon will
open a command window where you can type commands. You can also start
ProFit from an MS-DOS command shell. Go to the directory where you
unpacked ProFit and type the command:

This is a known weird-and-wonderful bug in the McLachlan fitting
algorithm. Our analysis of this problem shows that the fitting of
identical structures can hit a saddle point during the minimisation,
which the algorithm mistakes for convergence, leading to the structures
being fitted 180 degrees away from the correct position. We have spent
a lot of time trying to find a proper fix for this, so far without
success.

As explained in the INSTALL file, with some versions of GCC,
compiling with optimization on (-O3) seems to hide the bug.
Alternatively, a workaround is provided by editing the Makefile and
uncommenting the line:

ROTATEREFIT = -DROTATE_REFIT

While this should sort the problem, it will slow the program down
as every fit has to be performed twice.