Preprocessing

Motif search

Others

Here you can find detailed
explanations of how to use our web service to search for spatial
residue patterns in proteins. Click on the respective
category to show the instructions. Brief answers to most
usage-related questions are given in the
FAQ section.

How does it work?

The search algorithm behind Fit3D is a combinatorial
search that evaluates two constraints for each match candidate:
residue composition and geometric similarity, measured as
root-mean-square deviation (RMSD). The basic procedure is as
follows:

determination of the maximal spatial extent of the
query motif

determination of the residue composition of the
query motif

for all search target structures, do

eliminate residues whose type is not part of the query motif
residue composition

combine the remaining residues into sets that have the size of
the query motif, a similar spatial extent and a matching residue
composition (match candidates)

determine RMSD for match candidates

Pairwise distance filtering can be used to eliminate match candidates whose inter-residue distances are incompatible with the query motif.
For detailed information on how the algorithm works and why it is a valuable structural motif search method, please refer to Kaiser et al. 2015.
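
The filtering steps listed above can be sketched in a few lines of Python. This is a minimal illustration and not the Fit3D implementation: each residue is reduced to a single representative point, the names `find_candidates` and `extent_tolerance` are hypothetical, and using the query extent plus a tolerance as an upper bound is an assumption of this sketch. The final step (RMSD of each candidate after superposition) is left out.

```python
from collections import Counter
from itertools import combinations
from math import dist

# A residue is modelled as (residue_type, (x, y, z)).
# One point per residue is a simplification made for this sketch.


def extent(residues):
    """Largest distance between any two residues of a set."""
    distances = [dist(a[1], b[1]) for a, b in combinations(residues, 2)]
    return max(distances) if distances else 0.0


def find_candidates(query, target, extent_tolerance=1.0):
    """Yield residue sets of the target that pass both filters.

    extent_tolerance is a hypothetical parameter (same unit as the
    coordinates); how strictly the extent is compared is an assumption.
    """
    composition = Counter(r[0] for r in query)
    max_extent = extent(query) + extent_tolerance

    # Discard residues whose type does not occur in the query motif.
    pool = [r for r in target if r[0] in composition]

    # Enumerate all sets that have the size of the query motif.
    for combo in combinations(pool, len(query)):
        if Counter(r[0] for r in combo) != composition:
            continue  # residue composition does not match
        if extent(combo) > max_extent:
            continue  # set is spatially larger than the query motif
        yield combo
```

The number of enumerated sets grows quickly with the size of the residue pool, which is why the composition and extent filters are applied before any RMSD is calculated.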

Prerequisites

Before you can start the large-scale screening for a
structural motif you will need:

the query structural motif in PDB format (you can
also use our extraction wizard or our API
to extract a query motif)

(optional) a list of search targets as a text file
(you can also use our predefined lists)
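
A target list can be checked before upload with a small script. The sketch below assumes one classic four-character PDB-ID per line (a digit followed by three letters or digits). Whether the server also accepts chain suffixes or other ID styles is not stated here, so such lines are reported rather than silently dropped. The function name `read_target_list` is hypothetical, and converting IDs to lower case is a choice of this sketch.

```python
import re

# Classic PDB-ID: one digit followed by three alphanumeric characters.
PDB_ID = re.compile(r"^[0-9][A-Za-z0-9]{3}$")


def read_target_list(text):
    """Split a target list into valid PDB-IDs and rejected lines."""
    ids, rejected = [], []
    for line in text.splitlines():
        entry = line.strip()
        if not entry:
            continue  # ignore blank lines
        if PDB_ID.match(entry):
            pdb_id = entry.lower()  # lower case is a choice of this sketch
            if pdb_id not in ids:   # drop duplicates, keep the first one
                ids.append(pdb_id)
        else:
            rejected.append(entry)
    return ids, rejected
```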

To define a new structural motif you can use our
extraction wizard
that allows convenient and intuitive motif definition. You can
either access a PDB structure by its PDB-ID or you can upload a
structure in PDB format. The video below shows the
extraction process step-by-step using a known PDB structure:

Choose "Extract motif" in the navigation menu on
the left side.

Enter a valid PDB-ID or upload your own
structure.

On the next wizard page: select residues that
should be part of the query motif using the drop-down menu.

After clicking the button "Extract", the motif
structure is visualized and the toolbox shows further details.

Directly submit a new search job by clicking "Submit",
or download the motif structure for offline usage (click "Download").

Note: If the extracted motif is rated as too complex to be
processed by the web server, the
command line version
of Fit3D can be used.
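
For offline work, a motif can also be cut out of a PDB file with a short script. The following sketch is not the extraction wizard or the Fit3D API. It only relies on the fixed column layout of ATOM and HETATM records in the PDB format (chain identifier in column 22, residue number in columns 23 to 26). The function name `extract_motif` is hypothetical, and insertion codes and alternate locations are ignored for brevity.

```python
def extract_motif(pdb_text, selection):
    """Return the atom records of the selected residues as PDB text.

    selection is an iterable of (chain_id, residue_number) pairs.
    """
    wanted = set(selection)
    kept = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip header, remarks and other record types
        if len(line) < 26:
            continue  # too short to hold chain and residue number
        chain_id = line[21]            # column 22
        residue_number = int(line[22:26])  # columns 23-26
        if (chain_id, residue_number) in wanted:
            kept.append(line)
    return "\n".join(kept + ["END"]) + "\n"
```

A call such as `extract_motif(open("structure.pdb").read(), [("A", 57), ("A", 102)])` would return the records of residues 57 and 102 of chain A. The file name and the residue numbers are placeholders.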

Submitting a job

To submit a new job, click "Submit job" in the left
navigation menu. A motif file in PDB format is mandatory and must
be provided by the user. If you do not have one yet, the
extraction wizard
can be used to obtain a query structure. The target search space
can be customized by uploading a list of PDB-IDs, one per line.
Alternatively, one of several predefined target lists, which
were computed based on sequence or structure homology analysis,
can be selected. The video below shows the submission
process in detail:

Position-specific exchanges (PSEs) for the motif search can be
defined in the advanced settings dialog. This can be
useful for the analysis of diverse protein families or for mutagenesis
experiments. PSEs are residue substitutions that are allowed
during motif matching. As the name suggests, Fit3D handles
residue substitutions per position, e.g. a
single residue in the query motif can be defined to also match other
residue types. The video below shows how to allow matching
of different amino acid types:

Activate PSEs for your query motif by clicking the "Define
exchangeDefinitions" button (important: the query motif must be
already known).

Select possible substitutions for single residues from
the drop-down menu.

Click "Set exchangeDefinitions" to apply your selection.

Note: The total number of position-specific exchanges
(PSEs) is restricted to three.
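
The effect of PSEs on matching can be illustrated with a small sketch. It is not Fit3D's code: the names `build_allowed` and `matches` are hypothetical, residue types are plain three-letter strings, and the limit of three is interpreted here as the total number of additional residue types over all positions, which is an assumption.

```python
MAX_PSES = 3  # limit stated in the note above


def build_allowed(query_types, exchanges):
    """Return, for every motif position, the set of accepted residue types.

    exchanges maps a position index to the additional types allowed there.
    Counting every additional type towards the limit is an assumption.
    """
    total = sum(len(types) for types in exchanges.values())
    if total > MAX_PSES:
        raise ValueError(f"at most {MAX_PSES} PSEs are allowed, got {total}")
    return [
        {residue_type} | set(exchanges.get(position, ()))
        for position, residue_type in enumerate(query_types)
    ]


def matches(candidate_types, allowed):
    """True if every candidate residue has a type accepted at its position."""
    return len(candidate_types) == len(allowed) and all(
        residue_type in accepted
        for residue_type, accepted in zip(candidate_types, allowed)
    )
```

With `build_allowed(["HIS", "ASP", "SER"], {1: {"GLU"}})`, a candidate with the types HIS, GLU, SER is accepted, while HIS, ASN, SER is rejected.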

View results

To view the results of your job click "View results" in the left
navigation menu. You can now select a corresponding job to show
all detailed results. Watch the video below to see the
whole process:

All jobs submitted during the active browser session are
accessible under the "View results" section.

If you used email notification, you can recover all
results of the session later. Important: the results of
each job are deleted from our server after 72 hours.

Result page layout

When viewing the results of your job, motif matches are ranked
by ascending root-mean-square deviation (RMSD), so the best match
is listed first. Hints on how to understand and interpret the RMSD
are given in the
FAQ section. If you want to investigate a single match in detail,
click on "Show" or "Show in structure". Furthermore, you can
directly access the corresponding Protein Data Bank (PDB) entry
of the structure. Note that you can download all results or the
RMSD distribution plot from the "Action" drop-down
menu on the toolbar at the very top of the result page.

p-value theory

Besides geometric similarity, matches are rated according to their
statistical significance. Every reported match to a query motif
is labeled with a p-value so that the user can assess
likely true positives at a glance. However, a significant p-value
does not necessarily indicate a true positive match
because the underlying models are based solely on the
distribution of RMSD values. No biological features are
represented by those models. Hints on how to understand and
interpret the p-value are given in the
FAQ section.
To estimate statistical significance, Fit3D implements two
statistical models:

Illustration of a statistical model to estimate the significance of local similarities in protein structures according to Fofanov et al. 2008. A kernel density estimation of the match RMSD distribution is calculated using the Sheather-Jones bandwidth selection algorithm. Additionally, a point-weight correction (pwt) is applied to estimate matches beyond the RMSD limit by a maximum likelihood approach.
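
The idea of deriving a p-value from a kernel density estimate can be sketched as follows. This is a simplified illustration under stated assumptions, not the model used by Fit3D: it uses a Gaussian kernel with a bandwidth supplied by the caller instead of the Sheather-Jones selection, it omits the point-weight correction, and it takes the p-value to be the estimated probability of observing an RMSD at least as low as the given one. The function name `kde_p_value` is hypothetical.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def kde_p_value(rmsd, background_rmsds, bandwidth):
    """Probability mass of a Gaussian KDE at or below the given RMSD.

    The cumulative distribution of a Gaussian KDE is the mean of the
    normal CDFs centred on the sample points. The bandwidth is passed in
    by the caller (assumption: no automatic bandwidth selection here).
    """
    if not background_rmsds:
        raise ValueError("background_rmsds must not be empty")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    total = sum(normal_cdf((rmsd - x) / bandwidth) for x in background_rmsds)
    return total / len(background_rmsds)
```

A match whose RMSD lies far below the bulk of the background distribution receives a small value, while a match in the middle of the distribution receives a value near 0.5.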