Haplotype sorting display

When this display mode is enabled and genotypes are phased or homozygous, each genotype is split into two independent haplotypes. These local haplotypes are clustered by similarity around a central variant. Haplotypes are reordered for display using the clustering tree, which is drawn in the left label area. Local haplotype blocks can often be identified using this display.

Haplotype sorting order: using the middle variant in the viewing window as the anchor.

To anchor the sorting to a particular variant, click on the variant in the genome browser, and then click on the 'Use this variant' button on the next page.

Description

This track shows approximately 73 million single nucleotide variants (SNVs) and
5 million short insertions/deletions (indels) produced by the
International Genome Sample Resource (IGSR) from sequence data generated by the
1000 Genomes Project in its Phase 3 sequencing of 2,504 genomes from
26 populations worldwide.

Variants were called on the autosomes (chromosomes 1 through 22) and on the
pseudoautosomal regions (PARs) of chromosome X.
Therefore, this track has no annotations on alternate haplotype sequences, fix patches,
chromosome Y, or the non-PAR portion (the majority) of chromosome X.

The variant genotypes have been phased
(i.e., the two alleles of each diploid genotype have been assigned to two
haplotypes, one inherited from each parent).
Phasing makes it possible to treat the two haplotypes of each sample independently
and to cluster them by local similarity for display.
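
The splitting of phased genotypes into haplotypes can be illustrated with a minimal Python sketch. It assumes genotypes arrive as VCF-style GT strings ("0|1" for phased, "0/1" for unphased); the function names and the "_1"/"_2" haplotype suffixes are hypothetical and are not taken from the browser's own code. Unphased heterozygous calls are treated as unusable because their alleles cannot be assigned to a haplotype.

```python
def split_phased_genotype(gt):
    """Split a VCF-style GT string into (allele_hap1, allele_hap2).

    Returns None when the alleles cannot be assigned to haplotypes:
    unphased heterozygotes, missing calls, or non-diploid genotypes.
    """
    if "|" in gt:
        parts = gt.split("|")
    elif "/" in gt:
        parts = gt.split("/")
        # Unphased calls are usable only when homozygous.
        if len(parts) == 2 and parts[0] != parts[1]:
            return None
    else:
        return None
    if len(parts) != 2 or "." in parts:
        return None
    return int(parts[0]), int(parts[1])


def haplotypes_from_genotypes(samples):
    """Turn {sample: [GT, ...]} into {haplotype_name: [allele, ...]}.

    Each sample contributes two haplotypes, named with hypothetical
    "_1" and "_2" suffixes. Unusable genotypes become None.
    """
    haplotypes = {}
    for name, genotypes in samples.items():
        hap1, hap2 = [], []
        for gt in genotypes:
            pair = split_phased_genotype(gt)
            hap1.append(None if pair is None else pair[0])
            hap2.append(None if pair is None else pair[1])
        haplotypes[name + "_1"] = hap1
        haplotypes[name + "_2"] = hap2
    return haplotypes
```

For example, a sample with genotypes "0|1" and "1|1" at two variants yields one haplotype carrying alleles 0, 1 and another carrying 1, 1.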

Display Conventions

In "dense" mode, a vertical line is drawn at the position of each
variant.
In "pack" mode, since these variants have been phased, the
display shows a clustering of haplotypes in the viewed range, sorted
by similarity of alleles weighted by proximity to a central variant.
The clustering view can highlight local patterns of linkage.
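
To make "similarity of alleles weighted by proximity to a central variant" concrete, here is a rough sketch in Python. The weighting function (inverse of index distance from the center) and the average-linkage merging are assumptions chosen for illustration; this page does not specify the exact weighting or clustering algorithm the browser uses. Haplotypes are assumed to be equal-length lists of allele indices.

```python
def proximity_weights(n_variants, center):
    """Hypothetical weights: 1.0 at the central variant, decaying with
    distance in variant index (not base-pair distance)."""
    return [1.0 / (1 + abs(j - center)) for j in range(n_variants)]


def weighted_distance(hap_a, hap_b, weights):
    """Sum of weights at positions where the two haplotypes differ."""
    return sum(w for a, b, w in zip(hap_a, hap_b, weights) if a != b)


def cluster_order(haplotypes, center):
    """Return haplotype indices in the leaf order of a simple
    average-linkage agglomerative clustering."""
    if not haplotypes:
        return []
    weights = proximity_weights(len(haplotypes[0]), center)
    # Each cluster is a list of haplotype indices in display order.
    clusters = [[i] for i in range(len(haplotypes))]

    def linkage(c1, c2):
        total = sum(
            weighted_distance(haplotypes[i], haplotypes[j], weights)
            for i in c1
            for j in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return clusters[0]
```

This naive version recomputes all pairwise linkages on every merge, so it is only practical for small examples. The cost of clustering many haplotypes over many variants is one reason a real implementation might limit the range of variants considered.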

In the clustering display, each sample's phased diploid genotype is split
into two independent haplotypes.
Each haplotype is placed in a horizontal row of pixels; when the number of
haplotypes exceeds the number of vertical pixels for the track, multiple
haplotypes fall in the same pixel row and pixels are averaged across haplotypes.
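
The averaging of several haplotypes into one pixel row can be sketched as follows. This is an illustrative assumption about how such averaging might work, not the browser's rendering code: haplotypes are given in display order as lists of 0 (reference) and nonzero (non-reference) alleles, and each pixel row receives the fraction of non-reference alleles among the haplotypes assigned to it, which could then be mapped to a shade between white and black.

```python
def average_into_pixel_rows(haplotypes, n_rows):
    """Collapse haplotypes (in display order) into at most n_rows rows.

    Returns a list of rows; each row holds, per variant, the fraction of
    haplotypes in that row carrying a non-reference allele (0.0 to 1.0).
    """
    n_haps = len(haplotypes)
    if n_haps == 0 or n_rows <= 0:
        return []
    # Never use more rows than there are haplotypes.
    n_rows = min(n_rows, n_haps)
    n_variants = len(haplotypes[0])
    sums = [[0.0] * n_variants for _ in range(n_rows)]
    counts = [0] * n_rows
    for i, hap in enumerate(haplotypes):
        # Consecutive haplotypes map to the same or the next row.
        row = i * n_rows // n_haps
        counts[row] += 1
        for j, allele in enumerate(hap):
            if allele:
                sums[row][j] += 1.0
    return [[s / counts[r] for s in sums[r]] for r in range(n_rows)]
```

With four haplotypes and two pixel rows, the first two haplotypes share the top row and the last two share the bottom row.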

Each variant is drawn as a vertical bar, with white (invisible) representing the reference allele
and black representing the non-reference allele(s).
Tick marks are drawn at the top and bottom of each variant's vertical bar
to make the bar more visible when most alleles are reference alleles.
The vertical bar for the central variant used in clustering is outlined in purple.
To avoid long compute times, the range of variants used in clustering
may be limited; variants used in clustering have purple tick marks at the
top and bottom.

The clustering tree is displayed to the left of the main image.
It does not represent relatedness of individuals; it simply shows the arrangement
of local haplotypes by similarity. When a rightmost branch is purple, it means
that all haplotypes in that branch are identical, at least within the range of
variants used in clustering.
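
The notion of haplotypes being "identical, at least within the range of variants used in clustering" can be illustrated with a short sketch. The function name and the half-open `start`/`end` index range are hypothetical; the point is only that identity is judged on the clustered slice, so haplotypes grouped together may still differ at variants outside it.

```python
from collections import defaultdict


def identical_groups(haplotypes, start, end):
    """Group haplotype indices whose alleles match exactly over the
    variant index range [start, end). Only groups of two or more are
    returned, in order of first appearance."""
    groups = defaultdict(list)
    for i, hap in enumerate(haplotypes):
        groups[tuple(hap[start:end])].append(i)
    return [members for members in groups.values() if len(members) > 1]
```

For instance, haplotypes `[0, 1, 0]` and `[0, 1, 1]` fall in the same group when only the first two variants are used for clustering, even though they differ at the third.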

Methods

The genomes of 2,504 individuals were sequenced using both whole-genome sequencing
(mean depth = 7.4x) and targeted exome sequencing (mean depth = 65.7x).
Sequence reads were aligned to the reference genome using alt-aware BWA-MEM
(Zheng-Bradley et al.).
Variant discovery and quality control were performed as described in
Lowy-Gallego et al.
See also: