Testvariants to VCF

The Perl script testvariants2VCF-v2.pl converts the output of the cgatools “testvariants” command into a multi-sample VCF file. The script requires a cgatools installation and access to a reference genome in .crr format (build36.crr or build37.crr). For additional information, see the CGA™ Tools documentation. Note that this is version 2 of the tool (hence the “-v2” in the name). This version corrects a bug in the original script; see the README contained in the download archive for details.

Variants that share the same location (chr, begin, end) are merged into one locus, and their flags (0, 1, N) are converted into genotype calls. Samples that are positive for more than two alleles within the same locus are flagged and their genotype calls are set to unknown (./.). For a non-SNP locus, the VCF format requires that an extra reference base immediately upstream of the variant locus be included in the REF and ALT columns.
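
The merging and padding rules above can be illustrated with a minimal Python sketch. This is not the script's actual code (the script itself is written in Perl), and it rests on several assumptions: each testvariants row is represented as a dictionary with hypothetical keys (chr, begin, end, alt, flags), each flag string holds one character per allele (0, 1 or N), coordinates are 0-based and half-open, and any allele slot not filled by a variant is reported as unknown when an N is present and as reference otherwise. The function names merge_by_locus, call_genotype and pad_alleles are invented for illustration.

```python
from collections import defaultdict


def merge_by_locus(rows):
    """Group variant rows that share the same (chr, begin, end) location."""
    loci = defaultdict(list)
    for row in rows:
        loci[(row["chr"], row["begin"], row["end"])].append(row)
    return loci


def call_genotype(flags_per_alt, ploidy=2):
    """Turn one sample's flags at a locus into a VCF genotype string.

    flags_per_alt holds one flag string per ALT allele, in ALT order.
    Assumption: each '1' character means one copy of that allele.
    """
    present = []
    for alt_index, flags in enumerate(flags_per_alt, start=1):
        present.extend([alt_index] * flags.count("1"))
    if len(present) > ploidy:
        # Positive for more than two alleles: genotype set to unknown.
        return "./."
    # Simplification: remaining slots are unknown if any flag has an N,
    # otherwise they are taken to be the reference allele (0).
    has_unknown = any("N" in flags for flags in flags_per_alt)
    filler = None if has_unknown else 0
    alleles = present + [filler] * (ploidy - len(present))
    alleles.sort(key=lambda a: (a is None, a or 0))
    return "/".join("." if a is None else str(a) for a in alleles)


def pad_alleles(begin, ref, alts, prev_base):
    """Return (POS, REF, ALT list) for one locus.

    Assumption: begin is 0-based, so a SNP has 1-based POS begin + 1.
    prev_base is the reference base immediately upstream of the locus.
    """
    is_snp = len(ref) == 1 and all(len(alt) == 1 for alt in alts)
    if is_snp:
        return begin + 1, ref, list(alts)
    # Non-SNP locus: prepend the upstream base to REF and every ALT,
    # and move POS back by one so that it points at that base.
    return begin, prev_base + ref, [prev_base + alt for alt in alts]


rows = [
    {"chr": "chr1", "begin": 100, "end": 101, "alt": "G",
     "flags": {"S1": "01", "S2": "11"}},
    {"chr": "chr1", "begin": 100, "end": 101, "alt": "T",
     "flags": {"S1": "01", "S2": "01"}},
]
for locus, variants in merge_by_locus(rows).items():
    for sample in ("S1", "S2"):
        flags = [v["flags"][sample] for v in variants]
        print(locus, sample, call_genotype(flags))
```

In this example sample S1 carries one copy of each ALT allele and is called 1/2, while S2 is positive for three alleles at the same locus and is set to ./. as described above. For a two-base deletion of "AT" preceded by "C", pad_alleles(100, "AT", [""], "C") gives REF "CAT" and ALT "C".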