Biological Clustergram Server

Exploratory Clustering

Paste the data you want to cluster here, and select the options you want below. You should probably keep the number of genes below 500 or so.
If you hit the 'Sample' button on the right of the input area, it will put in
a sample data file which you can experiment with.

Clustering can sometimes seem complicated - there are many ways to cluster,
and how you should do it often depends on what you're trying to look at. This
page was designed to help you look quickly through the results of changing
different parameters. For a description of these parameters I refer you to
the documentation for Mike Eisen's Cluster and TreeView programs: http://rana.lbl.gov/manuals/ClusterTreeView.pdf.
Another page of some interest which shows the variation in results you can get
with different clustering metrics is http://ep.ebi.ac.uk/Docs/dist_clust.

Thanks to Gavin Sherlock for allowing me to use XCluster as the clustering
engine for this page. This also means that your data needs to be in a format
that is readable by XCluster. A description of the file format can be found at
http://genome-www.stanford.edu/~sherlock/cluster.html#formats.
Some simple file format adjustments are made automatically to make the input
slightly more flexible: you don't actually need a 'GWEIGHT' column, and blank
lines get filtered out.
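
As a rough illustration of that kind of adjustment, here is a minimal Python
sketch. It is not the code this page runs. The function name normalize_input
is hypothetical, and it assumes tab-delimited input whose first two columns
are an ID and a NAME column, with a default gene weight of 1.

```python
def normalize_input(text):
    """Drop blank lines and add a GWEIGHT column when the header lacks one."""
    # Keep only lines that contain something other than whitespace.
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    if not rows:
        return ""
    header = rows[0]
    if "GWEIGHT" not in header:
        # Assumed layout: ID column, NAME column, then the data columns.
        rows[0] = header[:2] + ["GWEIGHT"] + header[2:]
        for i in range(1, len(rows)):
            row = rows[i]
            # An EWEIGHT row holds array weights, so its GWEIGHT cell stays empty.
            filler = "" if row[0] == "EWEIGHT" else "1"
            rows[i] = row[:2] + [filler] + row[2:]
    return "\n".join("\t".join(row) for row in rows)
```

Input that already has a 'GWEIGHT' column passes through with only the blank
lines removed.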

Output is generated by slcview.

I do not keep track of who uses this page nor what they cluster. All files
generated by clustering your data are (usually) deleted shortly after they are
generated. If you still have concerns about the privacy of your data I
suggest you not use this page.

If you select 'Test' for an option, your data will be clustered both with and
without that option. For example, if you select 'Test' for log transform, the
data is clustered 2 ways: once after a log transform, and once without it.
'Always' and 'Never' apply to every clustering that is run. For example,
selecting 'Never' for Gene Centering means that whatever other options are
selected, none of the resulting data sets will have been gene-centered.
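
To see how quickly 'Test' multiplies the work, here is a small Python sketch
of how the settings could expand into individual clustering runs. The option
names and the expand_settings function are assumptions made for illustration,
not the names this page uses internally.

```python
from itertools import product

# Assumed option names, taken from the five preprocessing steps.
OPTIONS = ["log transform", "center genes", "normalize genes",
           "center arrays", "normalize arrays"]

def expand_settings(settings):
    """Return one dict of on/off flags per clustering run implied by settings."""
    choices = {"Always": (True,), "Never": (False,), "Test": (True, False)}
    per_option = [choices[settings[name]] for name in OPTIONS]
    # Every 'Test' doubles the number of combinations.
    return [dict(zip(OPTIONS, combo)) for combo in product(*per_option)]

settings = {name: "Always" for name in OPTIONS}
settings["log transform"] = "Test"
settings["center genes"] = "Never"
runs = expand_settings(settings)  # two runs, neither of them gene-centered
```

With 'Test' selected for all five of these options, the same function returns
32 runs.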

The order of operations for preprocessing data is the same as in Mike Eisen's
Cluster program - log transform, center genes, normalize genes, center arrays,
then normalize arrays.
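
The order above can be sketched in plain Python as follows. This is only an
illustration under stated assumptions, not the code behind this page: it
assumes a base-2 log, centering on the mean, normalizing so that the sum of
squares of each row or column is 1, and no missing values. The function names
are hypothetical.

```python
import math

def center(values):
    """Subtract the mean so the values average to zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

def normalize(values):
    """Scale so the sum of squares is 1; an all-zero vector is left alone."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values] if norm else list(values)

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def preprocess(matrix, log=False, center_genes=False, normalize_genes=False,
               center_arrays=False, normalize_arrays=False):
    """Rows are genes, columns are arrays; steps always run in this order."""
    if log:
        matrix = [[math.log2(v) for v in row] for row in matrix]
    if center_genes:
        matrix = [center(row) for row in matrix]
    if normalize_genes:
        matrix = [normalize(row) for row in matrix]
    if center_arrays:
        matrix = transpose([center(col) for col in transpose(matrix)])
    if normalize_arrays:
        matrix = transpose([normalize(col) for col in transpose(matrix)])
    return matrix
```

Because the order is fixed, later steps can undo part of earlier ones. For
example, centering the arrays after centering the genes generally leaves the
genes no longer exactly centered.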

Please try not to select 'Test' for all the available options, in the interest
of server load. For the same reason, please do not put in massive data sets.
You are allowed to use your own definition of massive, but please be patient:
you will sometimes have to wait for 32 or 64 data sets to be clustered and
clustergrams to be created for all of them before anything shows up on your
screen. Several hundred genes is probably ok, but above 500 or so the web
server seems to hit some limitation and does not generate the pictures. I'm
not sure why, but if you see neither images nor error messages after you hit
submit and the page loads, it's likely your data set was too large.

This page was designed so that you could take the options you like and use their
corresponding options in Mike Eisen's Cluster program and get relatively similar
output. If you use Gavin Sherlock's XCluster you should definitely get the same
output, but you will probably have to preprocess your file first to do the
gene/array centering/normalization, etc. (XCluster, which does the clustering
for this web page, uses Average Linkage Hierarchical Clustering.)
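
For readers unfamiliar with the term, here is a naive Python sketch of the
textbook form of average linkage, where the distance between two clusters is
the mean distance over all pairs of their members. It is meant only to show
the idea. XCluster's actual computation and distance metric may differ, and
the function name and the distance argument are assumptions.

```python
def average_linkage(points, distance):
    """Return the merges as (members_a, members_b, distance), in merge order."""
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average linkage: mean distance over all cross-cluster pairs.
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(distance(points[i], points[j]) for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        # Replace the two closest clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges

# Three one-dimensional points, compared by absolute difference.
merges = average_linkage([0.0, 1.0, 10.0], lambda x, y: abs(x - y))
```

Here the first two points are joined first, and the third point is then joined
to that pair at the average of its distances to each member.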