FTDNA says there are over 60,000 37-marker records in their database, and a large portion of them are likely to be in R1b1b2. I wonder if anyone has thought of doing a meta-analysis of them to determine how the surnames are related? I don't know exactly how this would be done, but the first step might be to reduce the 60K records to relatively homogeneous lineages within each surname. For example, you would have Smith1, Smith2, etc. One could exclude all lineages with fewer than a certain number of members and then calculate the modal haplotypes for the remainder. The analysis could then be done on these modal haplotypes. I am sort of expecting that this might yield a chronological ordering of the surnames according to their distance from our patriarch who waited out the last ice age in southern France. One would have to play a bit with the parameters for homogeneity and minimum lineage size, of course. Has anything like this been done already?
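To make the modal haplotype step concrete, here is a minimal sketch in Python. It assumes each haplotype is just a list of repeat counts in a fixed marker order, and that the records have already been sorted into lineages. The names `modal_haplotype`, `lineage_modals` and `min_members`, and the 5-marker values, are made up for illustration and are not anything FTDNA provides.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Return the most common repeat count at each marker position."""
    modal = []
    for values in zip(*haplotypes):
        # most_common(1) gives the value seen most often at this marker;
        # ties go to the value encountered first.
        modal.append(Counter(values).most_common(1)[0][0])
    return modal


def lineage_modals(lineages, min_members=3):
    """Compute a modal haplotype for each lineage with enough members."""
    return {
        name: modal_haplotype(members)
        for name, members in lineages.items()
        if len(members) >= min_members
    }


# Made-up 5-marker haplotypes, for illustration only.
lineages = {
    "Smith1": [
        [13, 24, 14, 11, 11],
        [13, 24, 14, 10, 11],
        [13, 25, 14, 11, 11],
    ],
    "Smith2": [
        [14, 23, 15, 11, 12],
        [14, 23, 15, 11, 12],
    ],
}

modals = lineage_modals(lineages, min_members=3)
print(modals)
```

With `min_members=3`, Smith2 is dropped because it has only two members, which is the kind of minimum lineage size parameter one would have to play with.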

That would be a great idea, but perhaps it would be easier and faster if a group of members here split the work up, so one person would not get stuck toiling for hours on the project. Better yet, how about ten, twenty, thirty, or however many people we can get splitting the work up into small portions? We would get it done in no time.

I for one would like to see my surname (Murphy), the most common surname in Ireland, broken down into the various haplotypes and other data sets, but the task would be more than I could do alone.

I was hoping that the FTDNA statisticians might be able to do this mechanically, without having to look at the individual records. Indeed, it would be a huge project if one had to do it by hand. There are statistical techniques, such as principal components, that could extract a number of subgroups for each surname depending on the correlations between the individual haplotypes.
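As a rough illustration of splitting one surname into subgroups mechanically, here is a small Python sketch. Note that it does not use principal components. It swaps in a much simpler threshold grouping (single linkage): a haplotype joins a group if it is within `max_distance` steps of any member already in it. The distance is a plain step count across markers, and the function names, the threshold and the 4-marker values are all assumptions made up for the example.

```python
def genetic_distance(a, b):
    """Sum of absolute repeat-count differences across markers."""
    return sum(abs(x - y) for x, y in zip(a, b))


def group_lineages(haplotypes, max_distance=2):
    """Group haplotypes that are within max_distance of any group member."""
    groups = []
    for hap in haplotypes:
        near, far = [], []
        for group in groups:
            close = any(genetic_distance(hap, m) <= max_distance for m in group)
            (near if close else far).append(group)
        # Merge the new haplotype with every group it is close to.
        merged = [hap] + [m for group in near for m in group]
        groups = far + [merged]
    return groups


# Made-up 4-marker haplotypes for a single surname.
murphy = [
    [13, 24, 14, 11],
    [13, 24, 14, 10],
    [14, 23, 15, 12],
    [14, 23, 15, 12],
    [13, 25, 14, 11],
]

groups = group_lineages(murphy, max_distance=2)
for i, group in enumerate(groups, start=1):
    print(f"Murphy{i}: {len(group)} members")
```

The `max_distance` threshold plays the role of the homogeneity parameter: a tighter value gives more, smaller lineages, and a looser one lumps them together.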

I was thinking a bit more about what it might tell us. What we would like to know is the closeness of other surnames to our own. We get a glimpse of this now when we look at the Y-DNA Matches page, but this can be misleading because individuals vary quite a bit from their lineage haplotype. In what I am proposing, we would be comparing the distance between the modal haplotypes of each lineage. FTDNA might develop a tool, like their FTDNATIP, that anyone could use. It would be useful to be able to supply a parameter to determine the minimum number of individuals in each lineage that is selected for analysis.

So, for the Murphy name there are probably multiple lineages that would come out of this, as they probably are already in your Results page. These would be compared to the modal haplotypes of other surnames and ranked accordingly.
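Here is a minimal Python sketch of that comparison and ranking, assuming the modal haplotypes have already been worked out for each lineage. The distance is a simple step count summed over markers, which is an assumption for illustration and not necessarily how FTDNA counts genetic distance. The lineage names and 5-marker values are made up.

```python
def genetic_distance(a, b):
    """Sum of absolute repeat-count differences across markers."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_distance(target, modals):
    """Return (lineage, distance) pairs sorted from closest to farthest."""
    pairs = [(name, genetic_distance(target, hap)) for name, hap in modals.items()]
    return sorted(pairs, key=lambda pair: pair[1])


# Made-up 5-marker modal haplotypes, one per surname lineage.
modals = {
    "Murphy1": [13, 24, 14, 11, 11],
    "Murphy2": [14, 23, 15, 11, 12],
    "Barton1": [13, 24, 14, 10, 11],
    "Smith1": [13, 25, 15, 11, 12],
}

target = modals["Murphy1"]
others = {name: hap for name, hap in modals.items() if name != "Murphy1"}
ranking = rank_by_distance(target, others)
for name, distance in ranking:
    print(name, distance)
```

In this toy data the Murphy1 modal is closer to Barton1 than to Murphy2, which is the sort of thing the comparison is meant to bring out: lineages of the same surname need not be each other's nearest neighbours.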

The idea that our common R1b1b2 ancestor waited out the last Ice Age in southern France or in Iberia is pretty much obsolete.

R1 (M173) itself, the ancestor of R1b1b2 (M269), is only about 18,500 years old.

R1b1b2 (M269) isn't old enough to have been in existence during the last Ice Age. It most likely arrived in Europe from western or southwestern Asia sometime well after the last Ice Age, probably during the Neolithic Period.

I was only joking about our R1b1b2 ancestor waiting out the ice age :-). In any case, my further thinking (see reply to Mr. Murphy) is more about computing genetic distances between modal haplotypes of surname lineages. This has the benefit of filtering out a lot of noise. If we can draw on the 60k plus records in the database, this could be quite interesting.

Craig and I were discussing this concept offline and I created a quick cut at what I could imagine. It follows somewhat along the lines of Murph's comments. Take a look at this page to see if you can imagine this going anywhere.

I would imagine there might be a separate compilation for each Deep Clade. Not sure about the best way to present it - but if 60,000+ results represented "only" 6,000 Genetic Lineages - that would still be an absurd number of results to compile into one table.

However - if there are "only" 100 or 200 Genetic Families in my Deep Clade (R1b1b2a1b7c), I can imagine compiling them into one table and seeing how they compare to each other now. And, following up, we could go even deeper into the Clades as time passes and we have even more relevant SNPs.

My hope is to someday track the Deep Clade divisions right into our family Lineage and then see a SNP that correlates to our main branch at CDY. (We have over 70 men in our large Barton Lineage I)

Murph - take a look at what I've done with the Barton Project and see if that is what you are wishing for Murphy.

I'm new to DNA analysis but would like to connect with County Wexford (Ireland) MURPHYs. My Genetree DNA results are:
Email: dasmurphy@hotmail.com
Lab standard: NIST
Haplogroup: R-M207
Sub Haplogroup: R1b1b2-M269
Confidence: high

I suspect I belong to this same Murphy line. See "Haplotype C" in the Murphy Project. Mark Jost was kind enough to compare the numbers on this group. I hope Mark doesn't mind me quoting our conversation.

Mark: "Your group has a 37-marker IntraClade Coalescence (n-1) Age of 915 ybp (the point, counted in generations, at which each of you shares the same common ancestry).

The Modal Age, the point at which this common ancestor first appears based on the ancestral haplotype, is about 1,197.5 ybp. This could be in the same or just the prior generation. The main point is that you all share a common ancestor at around 1,000 years before present, considering the average birth years of your four, i.e., a 30-year average length per generation. 67 markers might produce a slightly younger age.