
GLPhase

This is a CUDA-enabled fork of SNPTools impute.cpp. The code should
scale linearly with sample size up to a small multiple of the number
of CUDA cores (shaders) on the GPU being used.

GLPhase also has an option for incorporating pre-existing haplotypes
into the phasing and imputation process. Release 1.4.13 was used with
this option to impute genotypes for the first release of the
Haplotype Reference Consortium.

Installation

Dependencies

Compilation

# Clone this repository recursively
git clone --recursive https://github.com/winni2k/GLPhase.git
cd GLPhase
# to compile all code (with all optimizations turned on)
make
# run the glphase executable to get a description of the
# glphase command line arguments
bin/glphase
# run regression tests (turns off optimizations)
make test
# run regression tests + longer integration tests
make disttest
# compile without CUDA support
# first clean the work dir
make clean
make NCUDA=1
# compile without CUDA or OMP support (on MacOSX for example)
make NCUDA=1 NOMP=1
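
The build variants above can be chosen by a small wrapper. The following is a minimal sketch, not part of the repository: the function name `glphase_make_args` and its yes/no arguments are hypothetical, and only the `NCUDA=1` and `NOMP=1` make variables come from the commands above. The combination "CUDA without OMP" is not shown above, so the sketch rejects it rather than guessing.

```shell
#!/bin/sh
# Hypothetical helper (not part of GLPhase): print the make variables
# for one of the three builds listed above.
# Usage: glphase_make_args <cuda: yes|no> <omp: yes|no>
glphase_make_args() {
    cuda=$1
    omp=$2
    if [ "$cuda" = yes ] && [ "$omp" = yes ]; then
        # Default build: plain `make`, no extra variables
        printf '\n'
    elif [ "$cuda" = no ] && [ "$omp" = yes ]; then
        # Compile without CUDA support
        printf 'NCUDA=1\n'
    elif [ "$cuda" = no ] && [ "$omp" = no ]; then
        # Compile without CUDA or OMP support
        printf 'NCUDA=1 NOMP=1\n'
    else
        echo "glphase_make_args: combination not covered here" >&2
        return 1
    fi
}

# Example (run from the GLPhase checkout):
#   make clean
#   make $(glphase_make_args no no)
```

The `make clean` step stays in the example because the instructions above clean the work dir before switching build variants.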

Converting a VCF to SNPTools .bin format

A Perl script at scripts/vcf2STbin.pl can be used to convert a VCF
with PL format fields to a SNPTools-conformant .bin file. For
example, this command converts a gzipped input VCF at
input.vcf.gz into a SNPTools .bin file at input.bin:

scripts/vcf2STbin.pl input.vcf.gz
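
To convert several VCFs in one go, the same command can be wrapped in a loop. This is a sketch under stated assumptions: the helper names `bin_name_for` and `convert_all` are hypothetical, the file names are placeholders, and the output path is assumed to follow the example above (input.vcf.gz becomes input.bin).

```shell
#!/bin/sh
# Hypothetical helpers (not part of GLPhase).

# Expected .bin path for a gzipped VCF, assuming the naming shown in
# the example above: input.vcf.gz -> input.bin
bin_name_for() {
    printf '%s\n' "${1%.vcf.gz}.bin"
}

# Convert every VCF given as an argument; stop at the first failure.
convert_all() {
    for vcf in "$@"; do
        # Skip unmatched globs and missing files
        [ -e "$vcf" ] || continue
        scripts/vcf2STbin.pl "$vcf" || return 1
        # Warn if the assumed output file did not appear
        out=$(bin_name_for "$vcf")
        [ -e "$out" ] || echo "warning: expected $out not found" >&2
    done
}

# Example (run from the GLPhase checkout; file names are placeholders):
#   convert_all chr20.vcf.gz chr21.vcf.gz
```

The existence check after each conversion only warns, since the output naming is an assumption taken from the single example above.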

Running GLPhase (v1.4.13)

As a drop-in replacement for SNPTools/impute.cpp

GLPhase can be run as a CUDA-enabled drop-in replacement for
SNPTools/impute.cpp. Assuming a SNPTools-style .bin file with
genotype likelihoods exists at input.bin:

bin/glphase input.bin

Using pre-existing haplotypes

GLPhase can use pre-existing haplotypes to restrict the set of
possible haplotypes from which the Metropolis-Hastings (MH) sampler
may choose surrogate parent haplotypes. This approach is described in: