Recent microarray and resequencing studies show that genomic deletions occur throughout the human genome. These deletions can occur in a homozygous state in apparently healthy individuals, indicating that part of the genome can be lost completely without apparent phenotypic consequences. The aim of the current study is to create a first map of validated homozygous deletions in healthy individuals. This map will give insight into the functional and dispensable parts of our genome and can ultimately result in the definition of the minimal human genome.

Genome-wide copy number analysis was performed on 600 DNA samples from healthy Dutch volunteers using the Affymetrix SNP 6.0 array. After quality control, we identified over 2,000 homozygous deletions of more than 10 kb in size, distributed over 75 distinct regions. Approximately 65% of these homozygous deletions are recurrent, and one third of the recurrent deletions occur frequently (in 5% of the samples). After validation by PCR, we define 3.7 Mb of genome sequence that is subject to homozygous deletions in these subjects. The regions contain 39 protein-coding genes and 175 non-coding RNA (ncRNA) loci. Constrained ncRNAs and other functional sequences are depleted in homozygous deletions (e.g., a 24% depletion in phastCons sequence, p = 0.075), showing that a complete loss of functional sequence is selected against. In addition, genes encompassed by homozygous deletions are significantly smaller (p = 0.007) and contain fewer introns (p = 0.007), which are characteristics of environmental (e.g., olfactory and immunity) genes.

Our data support the notion that deletion alleles preferentially segregate in the human population only when they do not encompass functional elements, or when the elements they encompass are less essential for viability. In conclusion, our data show that at least 0.1% of our genome is dispensable without apparent deleterious effect, thus providing a first indication of the size of the minimal human genome.
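
The 0.1% figure can be checked with simple arithmetic from the 3.7 Mb of validated homozygously deleted sequence. The sketch below is illustrative only: the genome length of roughly 3,100 Mb is an assumption that is not stated in the abstract, and the function name is hypothetical.

```python
# Assumption: a haploid human genome length of roughly 3,100 Mb.
# This value is NOT given in the abstract; adjust it to the reference build used.
ASSUMED_GENOME_MB = 3100.0

# Validated sequence subject to homozygous deletion, as reported in the abstract.
DELETED_MB = 3.7


def dispensable_fraction(deleted_mb: float, genome_mb: float) -> float:
    """Return deleted sequence as a fraction of total genome length."""
    if genome_mb <= 0:
        raise ValueError("genome_mb must be positive")
    return deleted_mb / genome_mb


fraction = dispensable_fraction(DELETED_MB, ASSUMED_GENOME_MB)
print(f"Dispensable fraction: {fraction:.2%}")
```

Under this assumed genome length, the result is slightly above 0.1%, consistent with the "at least 0.1%" lower bound stated in the conclusion.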