Click here to access the tool for getting clusters of similar genomes as defined at different GSS or DNA signature thresholds.

This page provides data on genomic similarity scores (GSS) and their use for reducing the genomes in our database to a non-redundant set for comparative genomics, or for any other analysis that works on groups (clusters) of similar genomes. The similarity measure is explained in several articles by members of our group. We also provide distances between genomes based on di-, tri- and tetra-nucleotide signatures. The R scripts require three libraries: cluster, MCMCpack and ape. The scripts produce groups of redundant genomes, transform the score tables into matrices, and write dendrogram files in Newick format (a common format for phylogenetic trees).
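To illustrate two of the steps described above (turning a table of pairwise scores into a matrix and grouping redundant genomes at a threshold), here is a minimal Python sketch. It is not the R script provided on this page. The row layout (genome, genome, score), the genome names, the example scores, the 0.95 threshold and the single-linkage grouping rule are all assumptions chosen for illustration; the sketch also assumes that a higher score means more similar genomes.

```python
from collections import defaultdict


def scores_to_matrix(rows):
    """Turn (genome_a, genome_b, score) rows into a symmetric nested dict."""
    matrix = defaultdict(dict)
    for a, b, score in rows:
        matrix[a][b] = score
        matrix[b][a] = score
    return matrix


def redundant_groups(matrix, threshold):
    """Group genomes linked by any chain of scores >= threshold (single linkage)."""
    parent = {genome: genome for genome in matrix}

    def find(genome):
        # Follow parent links to the group representative, shortening the path.
        while parent[genome] != genome:
            parent[genome] = parent[parent[genome]]
            genome = parent[genome]
        return genome

    for a in matrix:
        for b, score in matrix[a].items():
            if score >= threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for genome in matrix:
        groups[find(genome)].append(genome)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical scores; real values come from the all-against-all GSS file.
rows = [
    ("gA", "gB", 0.98),
    ("gA", "gC", 0.40),
    ("gB", "gC", 0.42),
    ("gC", "gD", 0.97),
]
matrix = scores_to_matrix(rows)
groups = redundant_groups(matrix, 0.95)  # 0.95 is an assumed threshold
print(groups)
```

Keeping one genome per group would then give a non-redundant set at that threshold. Raising the threshold splits groups apart, and lowering it merges them.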

Click here to get the file containing all-against-all GSS as used in the article and in the R-script provided below.

Click here for the R-script used to cluster the genomes using the GSS from the file above.

Agglomerative clusters in Newick format can be accessed here for GSSa, GSSb and GSSc.

Divisive clusters in Newick format can be accessed here for GSSa, GSSb and GSSc.
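For readers unfamiliar with how an agglomerative clustering ends up as a Newick string, the following Python sketch may help. It is not the method used to produce the files above. It assumes average linkage, assumes that distances have already been derived from the scores (for example as 1 minus the score, which is itself an assumption), writes topology only without branch lengths, and uses hypothetical genome names that contain no characters with special meaning in Newick. Divisive clustering is not covered.

```python
from itertools import combinations


def average_linkage_newick(dist):
    """Build a topology-only Newick string by average-linkage agglomeration.

    dist maps frozenset({a, b}) -> distance for every pair of labels.
    """
    labels = sorted({genome for pair in dist for genome in pair})
    # Each active cluster is keyed by its Newick text and lists its members.
    clusters = {label: [label] for label in labels}

    def cluster_distance(x, y):
        pairs = [(a, b) for a in clusters[x] for b in clusters[y]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average pairwise distance.
        x, y = min(combinations(sorted(clusters), 2),
                   key=lambda xy: cluster_distance(*xy))
        merged = clusters.pop(x) + clusters.pop(y)
        clusters["(" + x + "," + y + ")"] = merged

    (tree,) = clusters
    return tree + ";"


# Hypothetical distances between four genomes.
dist = {
    frozenset({"gA", "gB"}): 0.02,
    frozenset({"gA", "gC"}): 0.60,
    frozenset({"gA", "gD"}): 0.62,
    frozenset({"gB", "gC"}): 0.58,
    frozenset({"gB", "gD"}): 0.61,
    frozenset({"gC", "gD"}): 0.03,
}
newick = average_linkage_newick(dist)
print(newick)  # ((gA,gB),(gC,gD));
```

The nested parentheses record the order of merges: gA and gB are joined first, then gC and gD, and finally the two pairs are joined at the root.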