UK Biobank genotyping and imputation data release

This document is now deprecated. If you downloaded v2 of the genotyping and imputed data from EGA (with both UKB and EGA encryption), you may still need it; otherwise it is no longer required.

What genetic data are available?

Genotype data are available for all 500,000 participants in the UK Biobank cohort. Genotyping was performed using the Affymetrix UK BiLEVE Axiom array on an initial 50,000 participants; the remaining 450,000 participants were genotyped using the Affymetrix UK Biobank Axiom® array. The two arrays are extremely similar (with over 95% common content).

If you wish to find information on specific genetic loci measured on the Axiom array, please use the Genomic Search facility.

Timelines of data availability

The first batch of genetic data, which includes genotyping and imputed data on approximately 150,000 participants, was made publicly available at the end of May 2015. This includes the 50,000 participants genotyped using the UK BiLEVE array and about 100,000 participants genotyped on the UK Biobank array. The rest of the data are expected to be available very soon.

Applications for genetic data

UK Biobank would like to make the following comments on the scope of applications.

All applications require a clearly stated hypothesis, so that UK Biobank can judge whether the application is from a bona fide researcher for health-related research that is in the public interest.

It is ultimately for researchers to select the associations they wish to study from the genotyped and imputed SNPs and the UK Biobank phenotypes. UK Biobank may have a view on whether such associations (in particular, the selected phenotypes) would be appropriate, but there is no restriction in principle that would limit the scope of the associations a researcher might choose to study. As a matter of principle, therefore, UK Biobank would consider approving a suitable GWAS / PheWAS study.

UK Biobank would highlight that although some UK Biobank phenotypes are readily available, others may not be well-ascertained or may not be appropriately validated (at this time). By way of illustration, self-reported outcomes collected during the participant baseline visit are readily available. However, other phenotypes, such as validated outcomes for incident and prevalent disease, depend on the availability of the health record linkage data (over which UK Biobank inevitably has less direct control).

In order to provide a level playing field, UK Biobank requests that applications be submitted prior to 30 November 2015; the relevant datasets will then be made available as soon as reasonably possible thereafter. Further, UK Biobank will also accept amendments to existing applications, as long as the application still satisfies the criteria set out in paragraph 1 above, and these will be dealt with in the same time frame.

UK Biobank intends to make available in due course a set of all (or at least the great majority) of such associations through the Showcase database. This summary data will be available to registered researchers, and a specific application to UK Biobank will not be required to access it.

Use of a single genetic dataset

We have received a number of requests from institutions that would like to store a single central genetic dataset, which can be linked a) between collaborators and b) for use on multiple applications from within the same institution. We support this proposal and, going forward, we will release suitable bridging files to enable such linkage to take place. It would be helpful for our administrative team if applicants stated as explicitly as possible the precise linkage required (in terms of the pre-existing genetic dataset and the identities of the collaborators). For the avoidance of doubt, the same approach to linking datasets between different applications still applies: http://www.ukbiobank.ac.uk/wp-content/uploads/2013/10/UK-Biobank-data-linkage.pdf
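
By way of illustration only, the following Python sketch shows how a bridging file might be used to relabel participant identifiers from one application so that they match a centrally stored genetic dataset held under another. The file layout assumed here (two whitespace-separated columns, one pair of identifiers per line) and the function names load_bridge and relabel are assumptions made for this example, not a specification of the files UK Biobank releases; the identifiers shown are invented.

```python
import io


def load_bridge(handle):
    """Read a bridging file into a dict: ID in application A -> ID in application B.

    ASSUMPTION: each non-empty line holds exactly two whitespace-separated
    identifiers. The real file layout may differ.
    """
    bridge = {}
    for line_no, line in enumerate(handle, start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, got {len(parts)}")
        id_a, id_b = parts
        if id_a in bridge:
            raise ValueError(f"line {line_no}: duplicate identifier {id_a}")
        bridge[id_a] = id_b
    return bridge


def relabel(sample_ids, bridge):
    """Translate identifiers via the bridge, keeping the input order.

    Returns (mapped, missing): the translated identifiers, and the input
    identifiers that have no entry in the bridging file.
    """
    mapped, missing = [], []
    for sample_id in sample_ids:
        if sample_id in bridge:
            mapped.append(bridge[sample_id])
        else:
            missing.append(sample_id)
    return mapped, missing


# Invented identifiers, for demonstration only.
bridge_text = "1000011 2000457\n1000024 2000102\n1000038 2000993\n"
bridge = load_bridge(io.StringIO(bridge_text))
mapped, missing = relabel(["1000024", "1000011", "1000099"], bridge)
print(mapped)   # identifiers as they appear in the second application
print(missing)  # identifiers with no entry in the bridging file
```

Reporting unmatched identifiers separately, rather than silently dropping them, makes it easier to notice participants who are absent from one of the two datasets.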