Biological data warehouse

InterMine is an open source data warehouse built specifically for
the integration and analysis of complex biological data. Developed by the Micklem
lab at the University of Cambridge, InterMine enables the creation of biological
databases accessed by sophisticated web query tools. Parsers are provided for
integrating data from many common biological data sources and formats, and there is
a framework for adding your own data. InterMine includes an attractive, user-friendly
web interface that works 'out of the box' and can be easily customised for your
specific needs, as well as a powerful, scriptable web-service API to allow
programmatic access to your data.

Complex data integration

InterMine was developed with the complexity of biological data in mind. The data model is
flexible and extensible, and a range of data parsers is provided to facilitate data
loading. A sophisticated identifier resolution system updates all identifiers to the most
current version using a priority system, and multiple post-processing checks ensure the
consistency of the data integration.

Fast and flexible querying

Complex queries can be constructed flexibly to mine across the integrated datasets,
enabling researchers to answer sophisticated biological questions. The query optimisation
method is constructed around the use of precomputed tables, meaning that the data schema
does not need to be denormalised to optimise query speed. A user's query workflow can also
be automated using InterMine web services.

Existing mines

A number of different data warehouses powered by InterMine already exist. These include:

Setting up your own InterMine instance is easy. Check out our step-by-step guide to creating your own Mine, loading and processing datasets, and using the web app to query and export the data. You can also try out an existing InterMine instance on the Amazon cloud.

The InterMine web services expose our complete API and can be used to automate workflows and to build external applications. Any language that can make HTTP requests can be used, and we provide client libraries in five programming languages: Python, Perl, Java, JavaScript and Ruby.
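
As a rough illustration of calling the web services from plain HTTP tooling, the Python sketch below builds a query document and the URL that would request its results. It uses only the standard library and does not contact any server. Several details are assumptions to check against the web-service documentation of the mine you are using: the service root https://mine.example.org/service is a placeholder, the /query/results path, the "query" and "format" parameter names and the XML layout are assumed, and the helper names build_path_query and build_results_url are hypothetical.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_path_query(root_class, views, constraints, model="genomic"):
    """Build a query as an XML string (layout assumed, see the note above).

    views       -- field names to return, relative to root_class
    constraints -- (field, operator, value) tuples, relative to root_class
    """
    query = ET.Element("query", {
        "model": model,
        "view": " ".join(f"{root_class}.{field}" for field in views),
    })
    for field, op, value in constraints:
        ET.SubElement(query, "constraint", {
            "path": f"{root_class}.{field}",
            "op": op,
            "value": value,
        })
    return ET.tostring(query, encoding="unicode")


def build_results_url(service_root, query_xml, fmt="json"):
    """Return the GET URL for the query results (endpoint path assumed)."""
    params = urlencode({"query": query_xml, "format": fmt})
    return f"{service_root.rstrip('/')}/query/results?{params}"


# Hypothetical example: symbol and primary identifier of the gene "zen".
query_xml = build_path_query(
    "Gene",
    views=["symbol", "primaryIdentifier"],
    constraints=[("symbol", "=", "zen")],
)
url = build_results_url("https://mine.example.org/service", query_xml)
print(url)
```

The printed URL can be fetched with any HTTP client. The client libraries listed above wrap these steps, so in practice they are usually the more convenient route.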