Abstract

We have developed a software platform called Osprey for visualization and manipulation
of complex interaction networks. Osprey builds data-rich graphical representations
that are color-coded for gene function and experimental interaction data. Mouse-over
functions allow rapid elaboration and organization of network diagrams in a spoke
model format. User-defined large-scale datasets can be readily loaded into Osprey
and combined for comparison of different experimental methods.

Rationale

The rapidly expanding biological datasets of physical, genetic and functional interactions
present a daunting task for data visualization and evaluation [1]. Existing applications such as Pajek allow the user to visualize networks in a simple
graphical format [2], but lack the features needed for functional assessment and comparative
analysis between datasets. Typically, interaction networks are viewed within a graphing
application, but data is manipulated in other contexts, often manually.

To address these shortfalls, we developed a network visualization system called Osprey
that not only represents interactions in a flexible and rapidly expandable graphical
format, but also provides options for functional comparisons between datasets. Osprey
was developed with the Sun Microsystems Java Standard Development Kit version 1.4.0_02
[3], which allows it to be used both in stand-alone form and as an add-on viewer for
online interaction databases.

Network visualization

Osprey represents genes as nodes and interactions as edges between nodes (Figure 1). Unlike other applications, Osprey is fully customizable and allows the user to
define personal settings for generation of interaction networks, as described below.
Any interaction dataset can be loaded into Osprey using one of several standard file
formats, or by upload from an underlying interaction database. By default, Osprey
uses the General Repository for Interaction Datasets (The GRID [4]) as its database, from which the user can rapidly build out interaction networks. User-defined interactions
are added or removed through mouse-over pop-up windows that link to the database.
Networks can be saved as tab-delimited text files for future manipulation or exported
as JPEG (JPG) graphics, portable network graphics (PNG), and scalable vector graphics
(SVG) [5]. The SVG image format allows the user to produce high-quality images that can be
opened in applications such as Adobe Illustrator [6] for further manipulation.
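
The node-and-edge representation and the tab-delimited storage described above can be illustrated with a minimal sketch. The four-column layout (gene A, gene B, experimental system, source), the function name load_network and the gene names are assumptions made for illustration only; they are not Osprey's actual file format or code.

```python
import csv
import io
from collections import defaultdict


def load_network(handle):
    """Read tab-delimited rows of (gene_a, gene_b, system, source).

    The column order is a hypothetical layout, not Osprey's file format.
    Returns an adjacency map (gene -> set of partners) and the edge list.
    """
    adjacency = defaultdict(set)
    edges = []
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank, short or comment rows.
        if len(row) < 4 or row[0].startswith("#"):
            continue
        gene_a, gene_b, system, source = row[:4]
        edges.append({"a": gene_a, "b": gene_b, "system": system, "source": source})
        # Interactions are stored as undirected edges between two gene nodes.
        adjacency[gene_a].add(gene_b)
        adjacency[gene_b].add(gene_a)
    return dict(adjacency), edges


# Made-up sample rows with placeholder gene names.
SAMPLE = (
    "# gene_a\tgene_b\tsystem\tsource\n"
    "GENE_A\tGENE_B\tTwo-hybrid\tstudy_1\n"
    "GENE_A\tGENE_C\tAffinity precipitation\tstudy_2\n"
)

adjacency, edges = load_network(io.StringIO(SAMPLE))
print(sorted(adjacency["GENE_A"]))
```

Keeping the experimental system and source on each edge, rather than on the nodes, is one simple way to let later steps filter or color interactions by those attributes.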

Searches and filters

A drawback of current network visualization systems is the inability to search the
network for an individual gene in the context of large graphs. To overcome this problem,
Osprey allows text-search queries by gene names. A further difficulty with visualization
systems is the absence of functional information within the graphical interface. This
problem is remedied by Osprey, which provides a one-click link to all database fields
for all displayed nodes including open reading frame (ORF) name, gene aliases, and
a description of gene function. By default, this information is obtained from The
GRID, which in turn compiles gene annotations provided by the Saccharomyces Genome Database (SGD [7]). Various filters have been developed that allow the user to query the network. For
example, an interaction network can be parsed for interactions derived from a particular
experimental method. Current Osprey filters include source, function, experimental
system and connectivity (Figure 2).
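
To make the search and filter operations concrete, the following sketch applies a gene-name search, an experimental-system filter and a connectivity filter to a small edge list. The function names, the placeholder data and the interpretation of connectivity as a minimum number of interactions per node are assumptions; the source does not specify how Osprey implements its filters.

```python
from collections import Counter

# Made-up edges: (gene_a, gene_b, experimental_system).
EDGES = [
    ("GENE_A", "GENE_B", "Two-hybrid"),
    ("GENE_A", "GENE_C", "Affinity precipitation"),
    ("GENE_C", "GENE_D", "Two-hybrid"),
]


def search_genes(edges, query):
    """Case-insensitive substring search over all gene names in the network."""
    names = {name for a, b, _ in edges for name in (a, b)}
    return sorted(name for name in names if query.lower() in name.lower())


def filter_by_system(edges, system):
    """Keep only interactions derived from one experimental system."""
    return [edge for edge in edges if edge[2] == system]


def filter_by_connectivity(edges, min_degree):
    """Keep edges whose two endpoints each have at least min_degree interactions."""
    degree = Counter()
    for a, b, _ in edges:
        degree[a] += 1
        degree[b] += 1
    return [
        edge for edge in edges
        if degree[edge[0]] >= min_degree and degree[edge[1]] >= min_degree
    ]


print(search_genes(EDGES, "gene_c"))
print(filter_by_system(EDGES, "Two-hybrid"))
print(filter_by_connectivity(EDGES, 2))
```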

Network layout

As network complexity increases, graphical representations become cluttered and difficult
to interpret. Osprey simplifies network layouts through user-implemented node relaxation,
which disperses nodes and edges according to any one of a number of layout options.
Any given node or set of nodes can be locked into place in order to anchor the network.
Osprey also provides several default network layouts, including circular, concentric
circles, spoke and dual ring (Figure 3). Finally, for comparison of large-scale datasets, Osprey can superimpose two or
more datasets in an additive manner. In conjunction with filter
options, this feature allows interactions specific to any given approach to be identified.
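
Two of the ideas above, a circular layout with locked nodes and the additive superimposition of datasets, can be sketched as follows. This is an illustrative approximation under stated assumptions, not Osprey's layout algorithm: the function names, the even angular spacing and the use of unordered gene pairs as edge keys are all hypothetical choices.

```python
import math


def circular_layout(nodes, radius=1.0, locked=None):
    """Place unlocked nodes evenly on a circle; locked nodes keep their positions."""
    locked = locked or {}
    free = [node for node in nodes if node not in locked]
    positions = dict(locked)
    for i, node in enumerate(free):
        angle = 2 * math.pi * i / len(free)
        positions[node] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions


def superimpose(datasets):
    """Merge datasets additively, recording which datasets support each edge.

    datasets maps a dataset name to an iterable of (gene_a, gene_b) pairs.
    """
    merged = {}
    for name, pairs in datasets.items():
        for a, b in pairs:
            # frozenset makes (a, b) and (b, a) the same undirected edge.
            merged.setdefault(frozenset((a, b)), set()).add(name)
    return merged


def specific_to(merged, name):
    """Return edges supported by the named dataset and no other."""
    return {edge for edge, names in merged.items() if names == {name}}


# Made-up datasets with placeholder gene names.
merged = superimpose({
    "method_1": [("GENE_A", "GENE_B"), ("GENE_A", "GENE_C")],
    "method_2": [("GENE_B", "GENE_A"), ("GENE_C", "GENE_D")],
})
print(specific_to(merged, "method_2"))
print(circular_layout(["GENE_A", "GENE_B", "GENE_C"], locked={"GENE_A": (0.0, 0.0)}))
```

Recording the supporting datasets on each merged edge is what allows method-specific interactions to be picked out afterwards with a simple filter.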

Color representations

Osprey allows user-defined colors to indicate gene function, experimental systems
and data sources. Genes are colored by their biological process as defined by standardized
Gene Ontology (GO) annotations. Genes that have been assigned more than one process
are represented as multicolored pie charts. Osprey currently recognizes 29 biological
processes derived from the categories maintained by the GO Consortium [8]. Interactions are colored by experimental system along the entire length of the edge
between two nodes. If a given interaction is supported by multiple experimental systems,
the edges are segmented into multiple colors to reflect each system. Alternatively,
interactions can be colored by data source, again with multiple colors if more than
one source supports the interaction. When combined with filter options, a network
can be rapidly visualized according to any number of experimental parameters.
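
The multicolored pie charts for nodes and the segmented edges can be approximated with the sketch below. The palette, the process and system names used as keys, the equal-sized slices and segments, and the function names are all assumptions for illustration; the source does not state how Osprey sizes or orders the colored portions.

```python
# Hypothetical palette; Osprey lets the user define these colors.
PALETTE = {
    "cell cycle": "#1f77b4",
    "DNA repair": "#d62728",
    "Two-hybrid": "#2ca02c",
    "Affinity precipitation": "#ff7f0e",
}


def pie_slices(processes, palette, default="#999999"):
    """Return (color, start_degrees, end_degrees) per process, in equal slices."""
    if not processes:
        return [(default, 0.0, 360.0)]
    step = 360.0 / len(processes)
    return [
        (palette.get(process, default), i * step, (i + 1) * step)
        for i, process in enumerate(processes)
    ]


def edge_segments(start, end, systems, palette, default="#999999"):
    """Split the edge from start to end into equal colored segments, one per system."""
    (x0, y0), (x1, y1) = start, end
    count = len(systems)
    segments = []
    for i, system in enumerate(systems):
        t0, t1 = i / count, (i + 1) / count
        segments.append((
            palette.get(system, default),
            (x0 + (x1 - x0) * t0, y0 + (y1 - y0) * t0),
            (x0 + (x1 - x0) * t1, y0 + (y1 - y0) * t1),
        ))
    return segments


print(pie_slices(["cell cycle", "DNA repair"], PALETTE))
print(edge_segments((0, 0), (10, 0), ["Two-hybrid", "Affinity precipitation"], PALETTE))
```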

Osprey download

A personal copy of the Osprey network visualization system version 0.9.9 for use in
not-for-profit organizations can be downloaded from the Osprey webpage at [9]. Registration is required for the sole purpose of enabling notification of software
fixes and updates. A limited version of Osprey used for online interaction viewing
can be used at The GRID website [4]. For implementation of Osprey as an online viewer for other online interaction databases,
please contact the authors.

Acknowledgements

We thank Hosam Abdulrrazek for contributions to our layout algorithms, and Lorrie
Boucher, Ashton Breitkreutz and Paul Jorgensen for suggestions on Osprey features.
Development of Osprey was supported by the Canadian Institutes of Health Research.
M.T. is a Canada Research Chair in Biochemistry.