GRAF-pop, the population tool included in GRAF, computes subject ancestries from genotypes and normalizes ancestry predictions across large datasets collected on different genotyping platforms. This makes it possible to generate population frequency data from more than a million dbGaP samples.

Who can use this?

GRAF is a tool for researchers; it is not designed to assess an individual’s ancestry or to find relatives.

You can run this tool on your own large datasets and get results within minutes or hours, even when the genotype missing rate is very high, on the order of 99%. The tool can also check genotype datasets obtained with different chips or platforms and plot them in the same figure for comparison.
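To make the term concrete, the genotype missing rate of a sample is the fraction of its genotyped variants that have no call. The following is a minimal sketch of that arithmetic only, not GRAF's own code; the function name `missing_rate`, the encoding (0/1/2 for the alternate-allele count, `None` for a missing call), and the sample data are assumptions made for illustration.

```python
from typing import Optional, Sequence


def missing_rate(genotypes: Sequence[Optional[int]]) -> float:
    """Return the fraction of genotype calls that are missing (None)."""
    if not genotypes:
        raise ValueError("no genotype calls supplied")
    missing = sum(1 for g in genotypes if g is None)
    return missing / len(genotypes)


# Hypothetical sample: 0/1/2 = alternate-allele count, None = no call.
sample = [0, None, 2, None, None, 1, None, None, None, None]
print(f"missing rate: {missing_rate(sample):.0%}")
```

A sample with a missing rate near 99% would have a usable call at only about one variant in a hundred.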

NCBI’s database of Genotypes and Phenotypes (dbGaP) uses GRAF-pop for quality control and to compute population frequency data from dbGaP studies comprising more than a million samples. This frequency data will cover a larger set of populations and many more novel variants than has been available historically. Clinicians and researchers can use it in rare variant identification, variant interpretation, assay design, and many other applications.
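As a rough illustration of what population frequency data means, the sketch below computes the alternate-allele frequency of a single variant within each population group. It is not the dbGaP pipeline: it assumes a biallelic variant on a diploid autosome, genotypes encoded as alternate-allele counts (0, 1, or 2, with `None` for a missing call), and hypothetical population labels `POP_A` and `POP_B`.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def allele_frequency_by_population(
    calls: Iterable[Tuple[str, Optional[int]]],
) -> Dict[str, float]:
    """Alternate-allele frequency per population for one biallelic variant."""
    alt_alleles = defaultdict(int)    # alternate alleles observed
    total_alleles = defaultdict(int)  # alleles with a usable call
    for population, genotype in calls:
        if genotype is None:
            continue  # missing calls contribute nothing to either count
        alt_alleles[population] += genotype
        total_alleles[population] += 2  # diploid: two alleles per subject
    return {pop: alt_alleles[pop] / total_alleles[pop] for pop in total_alleles}


# Hypothetical (population label, genotype) pairs for one variant.
calls = [("POP_A", 0), ("POP_A", 1), ("POP_A", None), ("POP_B", 2), ("POP_B", 1)]
print(allele_frequency_by_population(calls))
```

Skipping missing calls rather than counting them as reference alleles keeps the estimate from being biased toward zero when many genotypes are absent.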

What is the underlying technology?

GRAF is a downloadable C++ application, compiled for GNU/Linux.

The tool writes its results to .txt files and can plot them for export as .png files. It is also built into a CGI application that lets dbGaP submitters and users examine the results in dynamic web pages.