ITEP: Integrated Toolkit for Exploration of Pan-genomes

ITEP, the Integrated Toolkit for Exploration of microbial Pan-genomes, is a suite of scripts and Python libraries for the comparison of microbial genomes. It includes tools for de novo protein family prediction by clustering, ortholog detection, analysis of functional domains, identification of core and variable genes and gene regions, sequence alignments and tree generation, cluster curation, and the integration of cross-genome analysis with metabolic networks for the study of metabolic network evolution.

Virtual machines

Virtual machines (.ova format) for running the software are available for download at http://goo.gl/nM4DWr (64-bit) or http://goo.gl/gqh6BG (32-bit). The virtual machines include all required dependencies, plus those optional dependencies whose licenses permit free distribution. They can be opened on any operating system using tools such as VirtualBox (https://www.virtualbox.org/). Be warned that some analyses (particularly RPSBlast) do not work with the 32-bit version due to memory limitations.

Source code and documentation

The code and documentation on the virtual machine and on this website (below) are provided as published, so they will not include features and bug fixes added since publication. You can get the latest code and documentation from the project’s GitHub repository, using either your own Linux machine or one of the virtual machines linked above. The latest version of the code can always be found at: http://github.com/mattb112885/clusterDbAnalysis

Release documentation and software files

To use these you will first need to unpack them. On Mac or Linux use the unzip command:

$ unzip "name_of_zip_file"

The code won’t run natively on Windows (only inside the virtual machine), but you can still unzip the files there using tools such as 7-Zip. The tutorials are in Markdown format, which is most easily read with a browser plugin or another Markdown viewer.
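If no unzip tool is at hand, the unpacking step can also be sketched with Python's standard zipfile module, which behaves the same on Linux, Mac, and Windows. This is a minimal sketch and not part of ITEP: the helper name unpack_release, the default destination directory, and the assumption that the tutorials carry a .md extension are all illustrative.

```python
import zipfile
from pathlib import Path


def unpack_release(zip_path, dest_dir="itep_release"):
    """Extract a release zip into dest_dir and list the Markdown files in it.

    Hypothetical helper for illustration only; not part of ITEP.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    # Extract every member of the archive under the destination directory.
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)

    # Assumption: tutorial files use the .md extension.
    return sorted(dest.rglob("*.md"))
```

Calling unpack_release("name_of_zip_file") would extract the archive into ./itep_release and return the paths of any Markdown files it contains, which can then be opened in a Markdown viewer.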