1. Program Description and Goals

With the human genome sequenced and many more genomes now readily available in
databases, the focus in the life sciences has shifted from pure sequencing towards
proteomics: understanding the structure and cellular functions of proteins, their
interaction partners, and their putative involvement in disease.

Technological advances have contributed to high-throughput generation of a large body
of experimental data that exists in heterogeneous databases world-wide. However,
determining the structure, function, and interactions of proteins with high reliability
has proven to be a difficult task. Bioinformatics tools provide means for efficient
analysis of existing data and give biologists a plethora of possibilities to determine
function and properties of proteins through database comparisons and de novo prediction
methods. Most of these technologies are freely accessible on-line.

Working with these services has become an essential part of every experimental
biologist's research. Using the often rough-hewn interfaces and performing the necessary
adjustments of parameters and data to fit the input format is tedious and
time-consuming. The different outputs typically require manual editing before
the results can be compared and conclusions drawn.

CoMPAS Pro integrates useful web services for biological research, so input is only
needed once. The results obtained from the integrated services are homogeneously
combined, formatted, and presented in an intuitive fashion, which makes it easier to
draw conclusions, design further experiments, and prepare publications.

CoMPAS Pro runs on a local computer and communicates with the individual web
services it uses. It distributes requests across several servers to reduce
processing time and balance the computational load.

For creators and developers of existing and future web services, we offer an
interface with which they can easily integrate their service into our framework. We use
the Web Services Description Language (WSDL), a proposal for a World Wide Web Consortium
standard, which describes the capabilities of a web service and its input/output formats
in a variant of the Extensible Markup Language (XML).

Through WSDL, communication between the services and our stand-alone client is
standardized and takes place over the Internet using the SOAP protocol.
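
To illustrate what a SOAP exchange involves, the following minimal Python sketch builds
a SOAP 1.1 request envelope using only the standard library. The operation name
"predict", the namespace "http://example.org/compas", and the "sequence" parameter are
hypothetical placeholders; the real names are defined by each service's WSDL
description.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/compas"  # hypothetical service namespace


def build_soap_request(operation, params):
    """Return a SOAP 1.1 envelope (as a string) that calls `operation` with `params`."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    # Each parameter becomes one child element of the operation element.
    for name, value in params.items():
        child = ET.SubElement(call, f"{{{SERVICE_NS}}}{name}")
        child.text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical call: one operation with a single sequence argument.
request_xml = build_soap_request("predict", {"sequence": "MKTAYIAKQR"})
```

In practice, a client would derive the operation and parameter names from the WSDL
document rather than hard-coding them as done here.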

The source code and binaries are publicly available for extension and modification,
under the Apache License, Version 2.0, in order to assist convenient integration of
services into CoMPAS Pro.

CoMPAS Pro has an intuitive graphical user interface (GUI), including easily
adjustable prediction parameters that are pre-set to scientifically sound defaults for
an out-of-the-box start. It is possible to enter several sequences at once, compare
their results, and display a synopsis. The output of different services for one or more
sequences can be displayed in a comprehensive overview, which makes it easier to draw
conclusions from their combination.

2. Projects

Main Window

This is the main window of CoMPAS Pro. On the left side you can see the project
tree. This is much like a directory tree which lets you navigate through all steps
of your project. In the first position you can see all your sequences. By clicking
on the sequence folder itself, you can add more sequences to this folder. This
screen can be seen on the right and will be described in the next section in detail.

Under the service folder, you can see all the services that are enabled for this
project. They can be expanded for access to their parameters and results. The
services and their settings are described in the next chapter.

Above the project tree, you can see the toolbar, where you can load and save
project files. Only one project can be open at one time. The "Run" button starts all
the active services with the parameters that are currently set. Results will be
calculated for each sequence and can be viewed on each sequence's settings pane
and/or on the service's result pane. That way you can either compare all results for
a certain service or all the results for a certain sequence.
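
The two views described above amount to grouping the same set of results by different
keys. A minimal Python sketch, assuming each result is a (sequence name, service name,
output) record; the record layout and the placeholder values are illustrative only and
do not reflect CoMPAS Pro's internal format:

```python
from collections import defaultdict

# Hypothetical result records: (sequence name, service name, output text).
results = [
    ("seq1", "MultiLoc", "output-1"),
    ("seq1", "SVMHC", "output-2"),
    ("seq2", "MultiLoc", "output-3"),
]


def group_results(records, key_index):
    """Group records by the field at key_index (0 = sequence, 1 = service)."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[key_index]].append(record)
    return dict(grouped)


by_sequence = group_results(results, 0)  # all results for a given sequence
by_service = group_results(results, 1)   # all results for a given service
```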

Sequences

A screenshot of the sequence input screen, where you can paste your
sequence data or load a FASTA file. This is the screen you will see when
starting a new project from scratch.

To start a project, it is a good idea to add sequence data first. The options are
raw sequence data, FASTA-formatted files or text. All sequence-relevant functions
can be accessed via the project tree, by clicking on "sequences".

Adding raw sequence data

To add raw sequence data, you can just paste or type your sequence directly.
When you are done, click on the button "+ Add sequences" and a new item called
"New Sequence" will appear under the "sequences" folder in the project
tree.

Adding FASTA formatted sequences

If your sequences are already in FASTA format, you can paste them in the text
area as with raw sequences, or open a FASTA file by clicking on "Browse for
sequence files...". Sequence names and sequence data will be read from the FASTA
data and added to the project tree. Changes to the sequences can be made as
described below.
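
For readers unfamiliar with the format: a FASTA record is a header line starting with
">" followed by one or more lines of sequence data. The Python sketch below shows one
way such input could be split into names and sequences. It is not CoMPAS Pro's own
parser; the fallback name "New Sequence" for input without a header simply mirrors the
behaviour described above for raw sequences, and the example sequences are made up.

```python
def parse_fasta(text, default_name="New Sequence"):
    """Parse FASTA-formatted or raw sequence text into (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the record collected so far, if any.
            if name is not None or chunks:
                records.append((name or default_name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    # Close the last record.
    if name is not None or chunks:
        records.append((name or default_name, "".join(chunks)))
    return records


example = ">protein_A demo\nMKTAY\nIAKQR\n>protein_B\nGSHMA\n"
parsed = parse_fasta(example)
```

Sequence data that spans several lines is joined into one string, so line breaks inside
a record do not matter.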

After you have added some sequences, they will appear in the project
tree on the left.

Changing the name and data of a sequence

To change the name or the sequence data, click on the sequence name as it
appears in the project tree on the left. A new screen appears on the right that
displays your sequence and its name: either the name found in the FASTA header,
or the words "New Sequence" if you added raw sequence data. Modify them as you
like and click on the "Save changes" button.

Here you can modify the sequence header and data. Once results are
available, they are shown here as well.

Deleting sequences

If you want to delete a sequence that is displayed in the current project,
select it in the project tree and press the "Del" (delete) key on your
keyboard. The sequence will be removed from the project.

Results

After a run, results for this sequence from all active services are displayed
in the lower part of the sequence screen. By clicking on the "Export Results"
button, you can export the results and view them in an XHTML-enabled browser
such as Mozilla Firefox. The exported files can easily be used for further
analysis or publications.

Once results are available, you can view them sequence-wise by
clicking on the sequence name in the project tree and scrolling through
the results that are available for this sequence.

Services

All services that are available for this project are listed in the "Services"
folder in your project tree. If you expand a specific service by clicking on the
small "+" box or arrow next to its name, you will see "input parameters" and
"results" as selectable pages. To activate or deactivate a service, click on the
small checkbox icon or the service name instead of the expand symbol.
When the box is checked, the service is active and will be queried the
next time you press the "Run" button. If it is not checked, the service will not be
queried until you activate it. This setting is saved in the project file.

Setting Input Parameters

When you select the "input parameters" page directly under a service, a new
screen appears on the right displaying the service name, a description and the
parameters that can be adjusted before each run. A short description of each
parameter is next to its edit field. Each field is usually a drop-down box
where you can select one of several options, a spin box where you can set a
number for a certain parameter such as a score threshold, or a checkbox for a
parameter that can be either on or off.
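
The three kinds of edit field correspond to three kinds of parameter: a choice from a
fixed list, a bounded number, and an on/off flag. As a rough Python sketch of how such
parameters could be modelled and validated (the class names, parameter names, and
ranges are hypothetical and are not taken from CoMPAS Pro or any of its services):

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ChoiceParam:  # corresponds to a drop-down box
    name: str
    options: Sequence[str]
    value: str

    def validate(self) -> bool:
        return self.value in self.options


@dataclass
class NumberParam:  # corresponds to a spin box
    name: str
    minimum: float
    maximum: float
    value: float

    def validate(self) -> bool:
        return self.minimum <= self.value <= self.maximum


@dataclass
class FlagParam:  # corresponds to a checkbox
    name: str
    value: bool = False

    def validate(self) -> bool:
        return isinstance(self.value, bool)


# Hypothetical parameters for an imaginary service.
params = [
    ChoiceParam("matrix", ["A", "B"], "A"),
    NumberParam("score_threshold", 0.0, 1.0, 0.5),
    FlagParam("include_details", True),
]
all_valid = all(p.validate() for p in params)
```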

Service input parameter screen of the service "SVMHC". Description
and layout depend on the service you are looking at.

Note

The settings you select here take effect immediately and are used for the next
run. If you want to preserve them between different sessions of using this
project file, you still need to save the file.

Results

By clicking on "results", the results received in the last run for this
service are displayed. They are either textual or graphical, depending
on the service. If you have many sequences, graphical results may take some time
to load since they are computed on the fly for flexibility. By clicking on the
"Export Results" button, you can export the results and view them in an
XHTML-enabled browser such as Mozilla Firefox.

An example of a service that returns a result in text form, "MultiLoc".

Example of a service that returns graphical (SVG) results, "SVMHC".

Running Services

When you have added all the sequences you want and set all the parameters for the
services, all you need to do is check that all services you want to use are active.
Click on the "Run" button in the toolbar to send the sequences to the services. The
results will be computed.

Note that the cursor changes into an hourglass until the run is complete. You can
set the timeout on the preferences screen. When all services have responded or timed
out, you will be notified with a dialog box that lists the services that have
failed, if any.
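
The behaviour described here (query every active service, wait up to a timeout, then
report which services failed) can be sketched in Python with a thread pool. This
illustrates the general pattern only and is not CoMPAS Pro's implementation; the three
service functions are stand-ins that simulate a fast service, a slow one, and one that
raises an error.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def run_services(services, sequence, timeout):
    """Call every service with `sequence`; return (results, sorted failed names)."""
    results, failed = {}, []
    pool = ThreadPoolExecutor(max_workers=len(services) or 1)
    futures = {pool.submit(func, sequence): name for name, func in services.items()}
    # Wait until all services have answered or the timeout has elapsed.
    done, not_done = wait(list(futures), timeout=timeout)
    for future in done:
        name = futures[future]
        try:
            results[name] = future.result()
        except Exception:
            failed.append(name)  # the service raised an error
    failed.extend(futures[f] for f in not_done)  # the service timed out
    pool.shutdown(wait=False)  # do not block on services that are still running
    return results, sorted(failed)


# Stand-in services (hypothetical): each just inspects the sequence.
def fast(seq):
    return len(seq)


def slow(seq):
    time.sleep(1.0)  # longer than the timeout used below
    return len(seq)


def broken(seq):
    raise RuntimeError("service unavailable")


results, failed = run_services(
    {"fast": fast, "slow": slow, "broken": broken}, "MKTAY", timeout=0.2
)
```

With these stand-ins, only "fast" produces a result; "broken" and "slow" end up in the
list of failed services, which is the information a completion dialog would show.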

Dialog box that will appear when the run is complete and no errors have
occurred.

Note

Especially when first starting CoMPAS Pro, establishing a connection to the
services may take a long time. If a service fails initially, it may be a good
idea to re-run it and see if results are now forthcoming.

You can view the results by clicking on either a sequence name or the "results"
entry under a service name.

Preferences

By clicking either on "project" in the project tree or the "Preferences" button on
the toolbar, you can see the preferences screen. Here, you can set your HTTP proxy
server and port if this is necessary for your network environment. You can ignore
this setting if your firewall allows outgoing port 80 (HTTP) traffic or if you have
a transparent proxy. A good way of finding out is running all services without a
proxy. If they all fail, there is a good chance you may need to adjust this setting.
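
For comparison, this is roughly how an HTTP proxy is configured for a client written
with Python's standard library. The host "proxy.example.org" and port 8080 are
placeholders; use the values supplied by your network administrator.

```python
import urllib.request

PROXY_HOST = "proxy.example.org"  # placeholder host
PROXY_PORT = 8080  # placeholder port


def make_opener(host=None, port=None):
    """Return an opener that sends HTTP traffic via host:port, or directly if no host."""
    if host:
        proxies = {"http": f"http://{host}:{port}"}
    else:
        proxies = {}  # empty mapping: connect directly, ignoring environment proxies
    return urllib.request.build_opener(urllib.request.ProxyHandler(proxies))


opener = make_opener(PROXY_HOST, PROXY_PORT)
# Requests made with opener.open(url, timeout=60) would now go through the proxy.
```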

The global preferences screen with a proxy entered.

Define a timeout for the services by adjusting the timeout setting. If you are
running with a lot of sequences it may be necessary to set this to a value higher
than 60 seconds.