UK Biobank Whole Genome Sequencing project

UK Biobank is a health resource with unparalleled research opportunities. UK Biobank data will enable the scientific community to understand, diagnose, treat and prevent life-changing diseases, including cancer, heart diseases, stroke, diabetes, arthritis, osteoporosis, eye disorders, depression and forms of dementia. UK Biobank is following the health and well-being of 500,000 volunteer participants and provides anonymised health information to approved researchers in the UK and overseas, from academia and industry.

About the Partnership

Together with deCODE in Iceland, the Sanger Institute will read and assemble the whole genome sequences of 500,000 UK Biobank volunteers. The Sanger Institute will sequence 225,000 whole human genomes.

This is one of the most ambitious whole human genome sequencing efforts ever undertaken. Sequencing will take place over 27 months, starting in September 2019. This builds on the ongoing success of the pilot programme, known as the Vanguard project, in which Sanger is sequencing 10 per cent of the cohort (50,000 genomes of UK Biobank volunteers).

A dataset of this magnitude will be incredibly powerful for understanding the genetic architecture that contributes to diseases such as cancer, cardiovascular diseases, depression and dementia.

The £200m project is being funded by a collaboration between the government’s research and innovation agency, UK Research and Innovation (UKRI), Wellcome, Amgen, AstraZeneca, GlaxoSmithKline (GSK) and Johnson & Johnson*.

The first tranche of data is expected to comprise up to 125,000 sequences and is anticipated to be accessible to all in spring 2021, at the same time as the 50,000 Vanguard sequences become available.

Sequence data for the entire cohort of UK Biobank participants is expected to become generally accessible by early 2023.

Sanger Institute People

Langford, Cordelia

Sanger Team Partners

The Illumina High Throughput Sequencing team runs the Institute's large-scale sequencing facility, which is unique in scale and processing capability. For more than 20 years, this team has produced most of the Institute's sequencing data across a variety of platforms.

Our Faculty work closely with our Scientific Operations teams, which are responsible for all data production pipelines at the Institute. There are three main facilities: DNA Operations, Animal and Mouse Pipelines, and Cellular Operations. We also have smaller core facilities for Protein Mass Spectrometry, Cytometry, Cytogenetics and Single Cell Genomics.