Displaying Datasets at External Display Applications / Websites

About Display Applications

Many services allow users to upload their own data for viewing, e.g. as 'custom tracks' in a genome browser. Examples include the UCSC genome browser, GeneTrack and GBrowse. Adding a new external display application requires two steps:

define the display application (create a new XML definition file) and

instruct Galaxy to load the display application (edit datatypes_conf.xml with the location of the new XML file).

Citation

If you add a new external display application to Galaxy in your published work, please cite: Blankenberg D, et al. In preparation.

Topics

Basic Topics - Simple Display Applications

Advanced Topics - Dynamic Display Applications

Basic Topics

Simple Display Applications

Display applications are defined using XML.

Example 1

Let's suppose we want to write a display application that displays a BAM file at the UCSC genome browser. After familiarizing ourselves with the UCSC genome browser, we find that several pieces of information are needed to display user data:

The data to be displayed is provided by giving a public URL to the UCSC genome browser.

Three data files must be provided by URL: 1) a custom track definition, 2) the BAM file and 3) the BAM index. One requirement applies: the index must have the same name as the BAM file, with the additional suffix '.bai'.

The URL to send data to UCSC is of the form: http://genome.ucsc.edu/cgi-bin/hgTracks?db=UCSC_GENOME_BUILD&hgt.customText=URL_OF_CUSTOM_TRACK
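As an illustration of the URL form listed above, here is a minimal Python sketch that fills in the two placeholders. The function name build_ucsc_url and the example build 'hg19' are assumptions made for this sketch; Galaxy itself performs this substitution from the display application's XML definition, not from code like this.

```python
from urllib.parse import quote_plus

# Base address taken from the URL form described in the text.
UCSC_HGTRACKS = "http://genome.ucsc.edu/cgi-bin/hgTracks"


def build_ucsc_url(genome_build, custom_track_url):
    """Return an hgTracks URL that points UCSC at a public custom track URL.

    Both values are escaped so they are safe inside a query string.
    """
    return "{base}?db={db}&hgt.customText={track}".format(
        base=UCSC_HGTRACKS,
        db=quote_plus(genome_build),
        track=quote_plus(custom_track_url),
    )


# Hypothetical values for illustration only.
print(build_ucsc_url("hg19", "http://some/url/path/track"))
```

Escaping the custom track URL matters because it is itself a URL nested inside another URL's query string.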

This display application will require 3 parameters, corresponding to the 3 data files that are needed.

The custom track file is defined by a dynamically generated template-style parameter. It is referred to as 'track', and is available at http://some/url/path/track. The contents of this file are created by replacing the '$' placeholders with the values indicated.

The BAM file, the actual dataset being used for the display, is internally referred to as 'bam_file', but will be available as http://some/url/path/galaxy.bam

The index file, which is available as a metadata attribute of the BAM file, is internally referred to as 'bai_file', but will be available as http://some/url/path/galaxy.bam.bai
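The relationship between the three parameters described above can be sketched in Python. This is only an approximation: string.Template stands in for whatever template mechanism Galaxy uses, the track line and the names render_track and index_url_for are hypothetical, and the exact placeholders in the real display application may differ.

```python
from string import Template

# Hypothetical custom track template; '$' placeholders are replaced with
# values taken from the BAM dataset when the track file is generated.
TRACK_TEMPLATE = Template(
    'track type=bam name="$name" bigDataUrl=$bam_url db=$dbkey'
)


def render_track(name, bam_url, dbkey):
    """Fill the '$' placeholders to produce the custom track contents."""
    return TRACK_TEMPLATE.substitute(name=name, bam_url=bam_url, dbkey=dbkey)


def index_url_for(bam_url):
    """The index must share the BAM file's name plus the '.bai' suffix."""
    return bam_url + ".bai"


bam_url = "http://some/url/path/galaxy.bam"
print(render_track("My BAM dataset", bam_url, "hg19"))
print(index_url_for(bam_url))
```

Exposing the index next to the BAM file under a predictable name is what lets the genome browser locate it without a separate parameter in the track definition.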

This display application can then be defined like this example (display_applications/ucsc/bam.xml/):

Example 2

Now let's create a display application that can view interval (BED-like) files at the UCSC genome browser. The UCSC genome browser natively handles BED files, but Galaxy allows a looser definition of BED files than UCSC accepts, so we have defined a new datatype called 'bedstrict' whose data meets the strict definition of BED. To enable all types of intervals to be viewed at UCSC, we have created an interval to bedstrict datatype converter and defined it in the usual manner (LINK TO THIS DOCUMENTATION).

Items needed for this display:

The data to be displayed is provided by giving a public URL to the UCSC genome browser.

One data file must be provided by URL: the result of converting the interval file to bedstrict.

The URL to send data to UCSC is of the form: http://genome.ucsc.edu/cgi-bin/hgTracks?db=UCSC_GENOME_BUILD&position=CHR:START-END&hgt.customText=URL_OF_STRICTBED_FILE

The 'position' (viewport) takes the form CHR:START-END and is calculated by looking at the first 10 lines of the BED file. It is not available as a URL (viewable is not set to True); instead, it is substituted into the URL that users are directed to for display.

Here, the 'strip' attribute of 'position' indicates that whitespace should be removed from around the determined position before it is substituted into the URL. The same result can be achieved using ordinary string operations such as: position=${qp(str( position).strip())}. 'qp' is a shortcut for 'quote_plus', which escapes text for use as part of a URL.
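The viewport calculation and the strip-then-escape step described above can be sketched in Python. The text only says that the position is derived from the first 10 lines of the BED file; the specific rule used here (take the chromosome of the first data line and span the smallest start to the largest end seen for it) is an assumption, as is the function name viewport_from_bed. The sketch also makes no adjustment between BED and browser coordinate conventions.

```python
from itertools import islice
from urllib.parse import quote_plus


def viewport_from_bed(lines, max_lines=10):
    """Return 'CHR:START-END' from the first max_lines lines, or None."""
    chrom = start = end = None
    for line in islice(lines, max_lines):
        # Skip comments and UCSC header lines.
        if line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        try:
            s, e = int(fields[1]), int(fields[2])
        except ValueError:
            continue
        if chrom is None:
            chrom, start, end = fields[0], s, e
        elif fields[0] == chrom:
            # Widen the window to cover every interval on this chromosome.
            start, end = min(start, s), max(end, e)
    if chrom is None:
        return None
    return "{}:{}-{}".format(chrom, start, end)


# Made-up BED lines for illustration.
bed_lines = ["chr1\t100\t200\n", "chr1\t150\t400\n", "chr2\t5\t10\n"]
position = " " + viewport_from_bed(bed_lines) + "\n"  # simulate stray whitespace

# Equivalent of the 'strip' attribute followed by 'qp' (quote_plus).
print(quote_plus(str(position).strip()))
```

Stripping before escaping avoids encoding the surrounding whitespace into the URL as '+' or '%0A'.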

To instruct Galaxy to use this display app, we modify datatypes_conf.xml and add the following to the interval datatype:

Here, our XML file was saved as /display_applications/ucsc/interval_as_bed.xml and inherit is set to True, so that datatypes subclassing interval (e.g. bed, bedstrict) also have the display application available.

Example 3

Create a display application that can view interval (BED-like) files at GeneTrack. GeneTrack is a browser and peak-calling program that runs as a server distinct from Galaxy, but which is able to directly access the file system used for storing Galaxy datasets; this direct file system access requires each Galaxy instance to have a (local) instance of GeneTrack installed.

Here, our XML file was saved as /display_applications/genetrack.xml and inherit is set to True, so that datatypes subclassing interval (e.g. bed, bedstrict) also have the display application available. Note that a BED to GeneTrack converter has been defined for the BED datatype and is used to create the GeneTrack file to be displayed.

Advanced Topics

Dynamic Display Applications

Sometimes it is desirable for the links and parameters of a display application to come from an external file. In the following examples we will modify the examples above so that they are populated using data found in external files.

The file tool-data/shared/ucsc/ucsc_build_sites.txt contains tab-delimited data, with one line per display site, consisting of the following columns:

Site Name

Site URL

Comma-separated Genome Builds Available at this site

The display will also be filtered based upon a Galaxy application configuration variable, ucsc_display_sites, which is used to restrict the available sites to a list specified in universe_wsgi.ini; alternatively, unwanted sites can be removed by commenting them out or deleting them in tool-data/shared/ucsc/ucsc_build_sites.txt.
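To make the file layout and the filtering concrete, here is a minimal Python sketch that reads data in the three-column format described above and keeps only the sites that support a given genome build and appear in an allowed list. The function names, the sample rows, the example.org URLs, and the assumption that commented-out lines start with '#' are all hypothetical; the sketch does not reproduce Galaxy's own loading code or the exact syntax of ucsc_display_sites.

```python
def parse_build_sites(lines):
    """Parse tab-delimited lines of: site name, site URL, comma-separated builds."""
    sites = []
    for line in lines:
        line = line.rstrip("\n")
        # Assumption: commented-out sites start with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue
        name, url, builds = fields[0], fields[1], fields[2]
        sites.append({
            "name": name,
            "url": url,
            "builds": [b.strip() for b in builds.split(",") if b.strip()],
        })
    return sites


def sites_for_build(sites, dbkey, allowed_names=None):
    """Keep sites offering dbkey, optionally restricted to allowed_names."""
    return [
        site for site in sites
        if dbkey in site["builds"]
        and (allowed_names is None or site["name"] in allowed_names)
    ]


# Made-up sample content in the described format.
sample = [
    "main\thttp://example.org/main/hgTracks?\thg18,hg19,mm9\n",
    "#archive\thttp://example.org/archive/hgTracks?\thg18\n",
    "test\thttp://example.org/test/hgTracks?\thg19\n",
]

sites = parse_build_sites(sample)
for site in sites_for_build(sites, "hg19", allowed_names={"main"}):
    print(site["name"], site["url"])
```

Passing allowed_names=None mirrors the case where no restriction is configured, so every site that offers the build is returned.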