fineSTRUCTURE can use this coancestry matrix to classify individuals into clusters, 52 in this case (compared to 38 using PCA and MClust). You can check the cluster assignments in a spreadsheet.

Note that I have named the clusters. That's just a shorthand so we don't have to refer to them by cluster number. Instead I used the population with the largest number of individuals in a cluster to label that cluster.
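The naming rule above (label each cluster by its most common population) can be sketched in a few lines of Python. This is only an illustration, not the script actually used for this post: the function name `label_clusters`, the input layout, and the alphabetical tie-breaking are all assumptions.

```python
from collections import Counter


def label_clusters(assignments):
    """Label each cluster with its most common population.

    `assignments` is an iterable of (cluster_id, population) pairs,
    one pair per individual (a hypothetical input layout).
    """
    counts = {}
    for cluster_id, population in assignments:
        counts.setdefault(cluster_id, Counter())[population] += 1

    labels = {}
    for cluster_id, counter in counts.items():
        # Highest count wins; ties are broken alphabetically (an assumption,
        # the post does not say how ties were handled).
        population, _ = min(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        labels[cluster_id] = population
    return labels


# Toy data, not real cluster assignments.
example = [
    (1, "Sindhi"), (1, "Sindhi"), (1, "Pathan"),
    (2, "Vysya"), (2, "Kanjar"), (2, "Kanjar"),
]
labels = label_clusters(example)
```

A label produced this way only says which population is the largest in the cluster; it does not mean every member belongs to that population.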

Here's the cluster-level coancestry heat map.
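One simple way to get from an individual-level coancestry matrix to a cluster-level one is to average the entries within each pair of clusters. The sketch below assumes that approach; fineSTRUCTURE's own aggregation may differ (for example, summing rather than averaging), and the function name `cluster_coancestry` and the toy numbers are made up for illustration.

```python
def cluster_coancestry(matrix, clusters):
    """Average an individual-level coancestry matrix by cluster pair.

    matrix[i][j] is the coancestry between recipient i and donor j;
    clusters[i] is the cluster label of individual i.
    Returns a dict mapping (recipient_cluster, donor_cluster) to the mean.
    """
    sums = {}
    counts = {}
    n = len(clusters)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # skip the diagonal: an individual is not its own donor
            key = (clusters[i], clusters[j])
            sums[key] = sums.get(key, 0.0) + matrix[i][j]
            counts[key] = counts.get(key, 0) + 1
    # Cluster pairs with no off-diagonal entries (e.g. a singleton cluster
    # paired with itself) are simply absent from the result.
    return {key: sums[key] / counts[key] for key in sums}


# Toy 3-individual matrix, not real data.
toy_matrix = [
    [0, 4, 1],
    [6, 0, 3],
    [2, 2, 0],
]
toy_clusters = ["A", "A", "B"]
result = cluster_coancestry(toy_matrix, toy_clusters)
```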

And the pairwise coincidence:

And finally PCA plots for the first 10 dimensions from fineSTRUCTURE.

UPDATE (Feb 9, 2012): New PCA plots with better markers for the clusters.

That doesn't seem too bad. How many SNPs and threads did you use, and which software did you use for phasing? I phased a dataset of similar size with SHAPEIT using 3 threads and that took days, so I am wondering whether I should try something else.

I am curious what sort of processing "oomph" you have for compute-intensive stuff like this. Performance is always relative to things like the number of processors, processor speed, cache, memory and so on, no?

Can I ask for more specifics on your hardware? For instance:
1. Quad-processor or above?
2. How much RAM?
3. What OS, Windows 7 or some variant of Linux, like Ubuntu or something?
4. How is storage organized, RAID array, SATA or whatever?

Well, 23andMe tells me that 2 of my relatives are Indian - both 5th cousins, 4-10 range - one called Bennett, one called Thakrar - who now live in NZ and Kenya respectively.

So it may be that I actually have some recent South Asian ancestry?! When I search for my name Conroy in an online database of the British Army stationed in India, I see that there were 38 enlisted men and 2 officers called Conroy in Bengal alone. Of course Bengal is not Pakistan, but the Connaught Rangers - an all-Irish battalion of the British Army - battled the Pathans and others in the region. I'm wondering if one of them took a "War Bride" back to Ireland??

Pcontroy, if you're referring to Dr. McDonald's analysis as far as your South Asian admixture is concerned, don't take it too seriously. McDonald uses what may be deemed "mixed" samples. For instance, you scored around 3.1% and 3% with the Pakistani Sindhi and Pakistani Pashtun respectively. It is very likely that these small percentages are popping up due to shared ancient ancestry with those groups, as opposed to any real South Asian admixture. The Pathan and the Sindhi both have appreciable levels of North(-east) European admixture. It seems unlikely to me that you'd have any real, non-trivial and recent South Asian ancestry. We could say the same for Gedrosia - it is found at non-trace levels in most West Eurasian populations and is probably just a signature of generic West Eurasian ancestry as opposed to anything real.

Thanks!
I guess the coancestry plot doesn't say much. The fineSTRUCTURE PCA plots are easier to read. It looks like the plots are symmetric with respect to the transpose. Is there a way to figure out which populations are donors and which are acceptors? For example, in the PCA plot, the Vysya group (on the vertical axis on the left) has a blue line corresponding to Kanjar, Singapore 3 and Dharkar, and the same appears transposed (if you look at Vysya on the horizontal axis on top). So does this mean that the genes flowed both ways?

ChromoPainter can be run two ways. One is to define specific populations as donors and compute the results for everyone based on those donors.

The other is an all-against-all mode. Here you assume that, for each individual, all the other samples are donors. This is what I did in this analysis. So you cannot find out the direction of gene flow, but you can make inferences about clustering, haplotype similarity, etc.