Find and Sum IBS Blocks

The Perl script Find_IBS_Blocks.pl takes a pair of genomes and looks for regions where the two genomes are likely to be identical by state (IBS).

The basic concept is that regions that are IBS must share genotypes. For example, if we are looking for regions that are IBS for one haplotype and we do not have phasing data, we accept every combination of genotypes as consistent with IBS except opposite homozygotes: 00 in one genome and 11 in the other. If two genomes do not share at least one allele at a position, they are unlikely to be IBS there. So if we divide the genome into blocks of, e.g., 1 Mb and look at all the variants within each block, a block that is IBS should contain very few variants with the 00/11 genotype combination, and their paucity indicates likely IBS.
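The per-block count described above can be sketched as follows. This is a minimal Python illustration, not the code in Find_IBS_Blocks.pl: the function names, the input layout (tuples of chromosome, position and two genotype strings), the 1-based positions and the "00"/"01"/"11" genotype strings are all assumptions made for the example.

```python
from collections import Counter

BLOCK_SIZE = 1_000_000  # 1 Mb blocks, as in the text above


def is_inconsistent(gt_a, gt_b):
    """Return True when two unphased genotypes share no allele (00 vs 11)."""
    return {gt_a, gt_b} == {"00", "11"}


def count_inconsistent_by_block(variants, block_size=BLOCK_SIZE):
    """Count variants and inconsistent variants in each block.

    variants: iterable of (chrom, pos, gt_a, gt_b) tuples (hypothetical
    layout), with pos assumed to be 1-based.
    Returns a dict mapping (chrom, block_index) to
    (n_variants, n_inconsistent).
    """
    totals = Counter()
    inconsistent = Counter()
    for chrom, pos, gt_a, gt_b in variants:
        block = (chrom, (pos - 1) // block_size)
        totals[block] += 1
        if is_inconsistent(gt_a, gt_b):
            inconsistent[block] += 1
    return {block: (totals[block], inconsistent[block]) for block in totals}
```

A block with many variants and an inconsistent count at or near zero is a candidate IBS region for that pair. How many inconsistent genotypes to tolerate is left to the caller.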

The second Perl script, Sum_IBS_Blocks.pl, takes the output of the first script and, for each 1 Mb block, counts the number of genome pairs with no or few inconsistent genotypes. Higher values indicate regions that are more likely to be IBS. Note that if more than two genomes are involved, you can run this pair of scripts on all pairs of genomes in your set and sum across all pairs.
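The summing step can be sketched in the same way. Again this is a hedged Python illustration and not the code in Sum_IBS_Blocks.pl: the function name, the nested-dict input and the `max_inconsistent` threshold (standing in for "no or few") are assumptions made for the example.

```python
from collections import Counter


def sum_ibs_blocks(per_pair_counts, max_inconsistent=0):
    """Count, for each block, the genome pairs that look IBS there.

    per_pair_counts: dict mapping a pair label to a dict of
    block -> number of inconsistent genotypes (hypothetical layout).
    max_inconsistent: hypothetical threshold for "no or few".
    Returns a Counter mapping block -> number of supporting pairs.
    """
    support = Counter()
    for blocks in per_pair_counts.values():
        for block, n_inconsistent in blocks.items():
            if n_inconsistent <= max_inconsistent:
                support[block] += 1
    return support
```

Blocks that no pair supports are simply absent from the result, and looking them up in the returned Counter gives 0.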