Scientific platforms

Science at scale

Sequencing

The Wellcome Sanger Institute has one of the largest DNA sequencing facilities in the world and is currently capable of producing more than 1700 terabases (1700 × 10^12 bp) of DNA sequence per year. Thanks to the latest Illumina hardware and bespoke software that was developed in-house, this is one of the most accurate and efficient sequencing facilities in the world.
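To put the yearly throughput figure above in more familiar units, the short sketch below converts it to base pairs and to an average daily rate. It is only illustrative arithmetic: the function name is hypothetical, and it assumes a 365-day year with output spread evenly across it, which the source does not state.

```python
# Illustrative arithmetic only; the even spread over 365 days is an assumption.
TERABASE_BP = 10**12  # 1 terabase = 10^12 base pairs


def yearly_to_daily_bp(terabases_per_year: float, days_per_year: int = 365) -> float:
    """Average base pairs per day, assuming output is spread evenly over the year."""
    return terabases_per_year * TERABASE_BP / days_per_year


yearly_bp = 1700 * TERABASE_BP       # figure quoted in the text
daily_bp = yearly_to_daily_bp(1700)  # average daily output under the assumption above

print(f"{yearly_bp:.2e} bp per year")
print(f"{daily_bp / TERABASE_BP:.2f} terabases per day on average")
```

On these assumptions, 1700 terabases per year corresponds to 1.7 × 10^15 bp per year, or a little under 5 terabases per day.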

Big data processing and analysis: EMBL-EBI

EMBL-EBI makes open-access biological research datasets available. These are used extensively across the world by more than five million researchers in academia and industry. Some 18.5 million requests for data are made to EMBL-EBI’s websites every day. Analysing this data has become a bottleneck for life-science research, and EMBL-EBI provides facilities to enable this work.

The Embassy Cloud provides private, secure, virtual machine-based workspaces within the EMBL-EBI infrastructure, in which clients can make optimal use of their own customised workflows, applications and datasets.

Embassy Cloud partners have direct access to the EMBL-EBI data, services and compute. This is a practical and cost-effective alternative to replicating services and downloading vast public datasets locally. The Cloud’s partner companies can access their workspace from anywhere in the world, reducing the need for capital investments in hardware and related operational costs.

Big data processing and analysis: Wellcome Sanger Institute

The data output from the Wellcome Sanger Institute is increasing all the time and the Institute has developed new technologies for storing and accessing the data. iRODS (Integrated Rule-Oriented Data System) is a tool that is accessible to all for the management and distribution of sequence data.

The Institute has also developed more efficient data-storage formats that, like all the Institute’s software tools, are made available to the research community on an open-access basis.

Data storage

The Wellcome Genome Campus has a world-class supercomputing environment, providing the best production platforms and services. Between the Wellcome Sanger Institute and EMBL-EBI, there are several high-performance compute clusters with many thousands of CPU cores. Data is served to researchers around the world every day.

Storage is around 42 petabytes at the Wellcome Sanger Institute and around 100 petabytes at EMBL-EBI. As genomics research expands, so do the data storage and access requirements of researchers. We are adding more storage and greater processing power all the time, to meet these demands.

Single cell genomics

The single cell genomics facility can deliver thousands of single cell genomes, transcriptomes and epigenomes every day. This gives hugely valuable information about the state of a particular cell at a particular time, which supports research into general cell biology as well as applied cancer, immunology and infectious disease research.

Cellular generation and phenotyping

The Cellular Generation and Phenotyping core facility provides central cell biology support, in particular to scale up and automate existing protocols. The facility has expertise in cell derivation from primary tissue, induced pluripotent stem cell derivation, cellular differentiation, phenotypic assays and end point analysis. In addition to providing enhanced skills to research groups, the facility attracts funding for research in its own right and also carries out contract work.

Stem cell informatics

The team has also developed WGE, a highly interactive, web-based visual tool that employs an embedded genome browser and database to assist scientists in designing genome editing strategies using the CRISPR/Cas9 system.

Cytometry

The Cytometry Core Facility provides state-of-the-art instrumentation for measuring cell characteristics, together with assistance in running samples, data analysis and experimental design. Sorting is also provided as a service application. The facility supports a range of flow cytometric techniques and currently has six cytometers with a variety of possible applications.

Animal model pipelines

This facility provides and characterises knock-out mice for large-scale research projects. It also provides and cares for mice, zebrafish, rats and frogs that are used in research studies by scientists all over the world.

5,400bn

155 combined petabytes of storage between Wellcome Sanger Institute and EMBL-EBI

547 genomes of different species read in 2017

26,000 compute cores in total in the Data Centre

Achievements and uniqueness

The Wellcome Genome Campus is unique in the world as the largest concentration of genomics knowledge and facilities. In this environment, ideas flourish to become world-changing discoveries that are applied to real-world problems.