World Families Forums - The outlier's fallacy

ARCHIVAL COPY

After having discovered some new fallacies, that of the “moderator”, that of the “off topic”, and that of the “fallacy” itself (surely I have earned a place in the Olbrechts-Tyteca treatise), I want to speak of another one, the fallacy of the “outlier”. There is yet another fallacy always kept ready for me, that of “English fluency”, but in this case is the problem my English or your minds?

On eng.molgen someone replied to my posting about the Afghan DNA and, above all, about my calculation on an outlier R1a1a, which I also posted here in at least two threads. Someone said that he would ask for a new run (evidently he suspected a mistake).

The most serious has been Soulblighter (if everyone used his true name I would know whom to thank).

Thread: Pashtun vs Indic R1a1a

Posted: Sun Apr 22, 2012 5:53 am by soulblighter
I was reading the article "Afghanistan from a Y-chromosome perspective". They suggest that the Y-DNA R1a1a-M198 (and G2c) in Pashtuns are from the Khazars. Has there been a diversity study of Pashtun vs Indic R1a1a (STR and SNP)?

Posted: Sun Apr 22, 2012 7:41 am by Gioiello
Probably the paper, which isn’t free, says that Ashkenazic R1a1a and G2c come from the Khazars, who converted to Judaism; but the Khazars had many origins, and probably these haplogroups didn’t come from the Turkic component but from a Caucasian or Iranian one. This statement is important above all for understanding the origin of the Ashkenazic gene pool rather than the distinction between Pashtuns and Indians.

Posted: Sun Apr 22, 2012 4:11 pm by soulblighter
If the R1a1 in Pashtun samples is from Khazars, then it should be distinct from the Indian samples. If this is not the case, the conclusion in the paper is obviously wrong. I did not think that R1a1 was commonplace in Iran; unless all Pashtun R1a1 is a founder effect, there should be SNP and STR distinctions from the Indic samples. The focus of the paper, as I read it, does not seem to be Ashkenazim-centric but Afghan-centric.

by soulblighter
I finally sat down and read the paper in detail, and at the end of it I am thoroughly confused. The supplementary figures are free to look at here. Some highlights:

1) They argue that R1a1a-M198 originated in south central Asia (India-Pakistan) rather than Afghanistan or Russia, based on high haplotype diversity (supplementary figure 4a), older time estimates and high haplotype variance (suppl. figure 5).

2) Based on seven loci (DYS389I-13, DYS389II-17, DYS390-25, DYS391-11, DYS392-11, DYS393-13 and DYS19-16), they observe that the Russians and Balkans reported by Klyosov have similar loci and age estimates as Pashtuns.

3) Haplogroup L-M20 originated in Pakistan rather than India, based on higher haplotype variance.

4) And this is what stumped me at the end:
"We envision a plausible scenario in which the converted Khazars could have been absorbed by the early Pathans and that R1a1a-M198 drifted to high frequency in Afghanistan, with the Khazars being the common nexus between Ashkenazi Jews and Pathans. In addition, the Jewish traditions (particularly circumcision, a talith prayer shawl, shabbat, praying in the direction of Jerusalem during the Day of Atonement or Yom Kippur and the Magen David symbol in their houses, among several others) [2] observed among Pathans from Afghanistan and the presence of haplogroup G2c-M377, a lineage commonly observed among the Ashkenazi Jewish population (~7%), [42] are congruent with the above-stated hypothesis."

My purpose here isn’t to treat the origin of the Ashkenazim, about which I have expressed my thinking many times, but the “outlier’s fallacy”.

1. Is the R1a1a haplotype I considered, and from which I calculated the MRCA, evidence that its place of origin is the place of origin of that haplogroup? Not necessarily, because the survival of an outlier is a matter of chance, due to geographic and historical reasons. Is the fact that India and the Middle East lack the most ancient R1a-M420, present in Europe, the demonstration that this haplogroup was born in Europe and not in India or Central-South Asia? I should say not, because the survival of some samples of the most ancient subclades of a haplogroup may also be due to geographic and historical reasons. This is the “outlier’s fallacy”.

2. Then whoever calculated the variance of the Asian R1a, saying that it is older than the European one, may have committed the “outlier’s fallacy” if, disregarding my golden principles of mutations around the modal etc., he hasn’t considered that there may be many outliers which make the variance look older, and that more haplotypes may have survived there, which shouldn’t be a criterion for determining the origin, because they too may be due to geographic and historical reasons.

3. Then which criterion should we use to determine the place of origin of a haplogroup? Of course aDNA could be the queen of proofs, but I don’t know whether we’ll ever be able to obtain it. I’d say that the pathway of a haplogroup should be worth more, and it is what I used. On average, it is less probable that the presence of the whole pathway of a haplogroup in one place is due to chance.

I agree with the basic concept of an "outlier's fallacy". I would characterize such a haplotype as one with a known SNP descent, but whose markers are highly different from the modal for the set of similar SNP descendants. This is of personal interest to me because I have such a haplotype.

There are at least two ways such a haplotype might occur:
1. An unusual set of mutations at slow-mutating sites. Very slow mutations can occur in any haplotype; as I said on another thread, an entry in the Clan Gregor Ian Cam has had a mutation at 426 within the last 700 years. So one such mutation, by itself, is not a measure of age. But what if one has 4 to 6, as I do? What is the probability of that occurring? For independent events it is the product of the mutation rates (or, equivalently, of the per-marker mutation probabilities). With rates like 10^-4 or smaller, the combined probability quickly becomes vanishingly small.
2. The haplotype is a survivor of a bottleneck and is a very old haplotype. I believe this is more probable than the first explanation, which requires many low-probability mutations.
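The product rule in point 1 can be sketched numerically. This is only a minimal illustration: it assumes the markers mutate independently, uses a hypothetical common rate of 10^-4 per generation (the order of magnitude quoted above), and picks a hypothetical depth of 80 generations. None of these figures are measured values, and the function name is made up for the example.

```python
def prob_all_mutated(rates, generations=1):
    """Probability that every listed marker mutates at least once
    in `generations` father-to-son transmissions (independence assumed)."""
    p = 1.0
    for mu in rates:
        p *= 1.0 - (1.0 - mu) ** generations
    return p

# Hypothetical rate of 1e-4 per marker per generation (an assumption).
slow = [1e-4] * 4

one_gen = prob_all_mutated(slow, generations=1)   # plain product of the rates
deep = prob_all_mutated(slow, generations=80)     # roughly 2000 years at 25 y/gen

print(f"4 named slow markers, one generation: {one_gen:.1e}")
print(f"4 named slow markers, 80 generations: {deep:.1e}")
```

Note that this is the chance for four named markers. If any four out of a larger panel of slow markers would count as "unusual", the chance is higher, so the product alone probably overstates how improbable such a haplotype is.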

The fundamental issue is how we distinguish one from two. I don't think it can be done with genetic data only; it has to be supported by archaeology and other sciences. We know that disasters have occurred. Many are considered real, the Krakatoa eruption of 1883 for example, but others are considered to be between fact and fiction, e.g. the extent of damage due to the Biblical flood.

I am afraid that at the present time most people on these boards would consider "outliers" in the same context as "aliens from another planet": intriguing but of no real merit.

Finally, I am z5hg3 on the Ysearch site. If anyone can explain how my haplotype can be only some 2000 years old or less, I would ask them to explain it to me.

Maybe we should form an "outliers" forum? I just hope more than two people sign up.

I agree with the basic concept of an "outlier's fallacy". I would characterize such a haplotype as one with a known SNP descent, but whose markers are highly different from the modal for the set of similar SNP descendants. This is of personal interest to me because I have such a haplotype....

Why do you think you are an outlier? Quoted below is a prior post where I've shown you the list of people closest related to McGregor / Z5HG3 from the L21 file.

If I consider all 149 of the 67-STR Z253+ haplotypes I have and look at the modal, I get you at a GD of 19. The average GD to the Z253 modal is 11.1 and the maximum is 26.

Admittedly, the Z253 modal is biased by the large number of L226 folks found from prior years. The average GD to the Z253xL226 modal is 13.9 and the maximum is 24 on a count of 62. Your GD to the Z253xL226 modal is 16. This is not substantially above the average GD to the modal.
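For readers who have not computed a GD to a modal themselves, here is a minimal sketch of the idea. It is not the spreadsheet behind the figures above: the haplotypes are made up, only five markers are used, and every difference is counted stepwise, whereas multi-copy markers are often handled differently in practice.

```python
from collections import Counter

def modal_haplotype(haplotypes):
    """Most common repeat value at each marker position."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*haplotypes)]

def genetic_distance(a, b):
    """Sum of absolute repeat differences (stepwise counting)."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Made-up 5-marker haplotypes, for illustration only.
sample = [
    [13, 24, 14, 11, 12],
    [13, 24, 14, 10, 12],
    [13, 25, 14, 11, 12],
    [12, 24, 15, 11, 13],
]

modal = modal_haplotype(sample)
distances = [genetic_distance(h, modal) for h in sample]

print("modal:", modal)
print("GDs:", distances)
print("average:", sum(distances) / len(distances), "max:", max(distances))
```

The last haplotype is furthest from the modal here, but whether that makes it an "outlier" depends on how its distance compares with the average and maximum of the whole set, which is the comparison made above.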

I don't think you are an outlier in L226. You even have a small cluster, 253-1223, that you fit within so you are not all alone. What group are you saying you are an outlier for?

I looked at your age estimate for Z253, Mike, and I know it is consistent with your other work, but as I've noted before, I have a great deal of difficulty reconciling the haplotype I have, z5hg3 (Ysearch), with your age estimates of this subclade. I know I don't fit the mold, but I have tested positive for this SNP.

.... The others have no direct relationship to me that I am aware of. I have done a TMRCA between the Clan Gregor moderator and myself, and between the Clan Gregor chieftain and myself. I used two approaches: for the moderator I compared 59 STRs and used Chandler's rates and the set of 110 rates from 2011. I got c. 200 AD with Chandler's rates and c. 550 BC with the 110-rate data set. For the chieftain, I followed Busby's recommendation and only looked at the 13 slowest FTDNA markers, of which we differ at two: 388 and 495. I got a TMRCA of c. 13k BC!!
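How strongly a TMRCA depends on the chosen rate set can be shown with the simplest pairwise estimator: observed steps divided by the expected steps per generation on both lineages. This is a sketch under stated assumptions, not the calculation used in the post above. The rates are invented placeholders (not Chandler's rates and not the 2011 set), and 25 years per generation is an assumption.

```python
def tmrca_generations(steps_observed, rates):
    """Rough pairwise TMRCA in generations: observed steps divided by
    the expected steps per generation across both lineages."""
    return steps_observed / (2.0 * sum(rates))

YEARS_PER_GEN = 25  # assumption

# Hypothetical case: 2 differences over 13 slow markers.
slow_rates = [0.0005] * 13            # placeholder rate, not a published set
gens = tmrca_generations(2, slow_rates)
print(round(gens), "generations,", round(gens * YEARS_PER_GEN), "years")

# Same 2 differences, rates assumed 4 times lower: the estimate is 4 times older.
slower_rates = [r / 4 for r in slow_rates]
gens_slower = tmrca_generations(2, slower_rates)
print(round(gens_slower), "generations,", round(gens_slower * YEARS_PER_GEN), "years")
```

With only two observed differences the statistical uncertainty of such an estimate is very large, so a single pairwise figure on a handful of slow markers should be read as an order of magnitude at best.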

Is this your concern - your high TMRCA's with the Clan Gregor Moderator and Chieftain?

...My purpose here isn’t to treat the origin of the Ashkenazim, about which I have expressed my thinking many times, but the “outlier’s fallacy”.

1. Is the R1a1a haplotype I considered, and from which I calculated the MRCA, evidence that its place of origin is the place of origin of that haplogroup? Not necessarily, because the survival of an outlier is a matter of chance, due to geographic and historical reasons. Is the fact that India and the Middle East lack the most ancient R1a-M420, present in Europe, the demonstration that this haplogroup was born in Europe and not in India or Central-South Asia? I should say not, because the survival of some samples of the most ancient subclades of a haplogroup may also be due to geographic and historical reasons. This is the “outlier’s fallacy”.

I'm not that familiar with R1a1a haplotypes. Is anyone else here able to converse on this? Perhaps you should make this point on Rootsweb.

2. Then whoever calculated the variance of the Asian R1a, saying that it is older than the European one, may have committed the “outlier’s fallacy” if, disregarding my golden principles of mutations around the modal etc., he hasn’t considered that there may be many outliers which make the variance look older, and that more haplotypes may have survived there, which shouldn’t be a criterion for determining the origin, because they too may be due to geographic and historical reasons.

Are you talking about Anatole Klyosov? He is on the Rootsweb forum. I'm sure he will answer you if you think his analysis is wrong.

3. Then which criterion should we use to determine the place of origin of a haplogroup? Of course aDNA could be the queen of proofs, but I don’t know whether we’ll ever be able to obtain it. I’d say that the pathway of a haplogroup should be worth more, and it is what I used. On average, it is less probable that the presence of the whole pathway of a haplogroup in one place is due to chance.

I don't think we can use just one criterion and should probably have several criteria. Likewise, looking at a limited set of STRs and a few individual haplotypes could lead to misleading results.

I think you are saying that aDNA (ancient DNA) will provide proof, but I think we should consider that we are not necessarily looking for the trail of the SNP in and of itself. I think what we are trying to look for, at least I am, is how did the people alive today and in the historic timeframe get to where they are? In that case, the actual birth of an SNP is not really the most important issue. For example, it is possible that while R-M269 may have originated in Western Europe, the R-M269 descendants that eventually dominated Western Europe actually came from Southeast Europe or Southwest Asia. I'm not saying that is the case, I'm just trying to illustrate the concept of descendants of a most recent common ancestor versus the actual birth and placement of an SNP in ancient times.

Mikewww writes: “Perhaps you should make this point on Rootsweb […] Are you talking about Anatole Klyosov? He is on the Rootsweb forum. I'm sure he will answer you if you think his analysis is wrong”.

Above all I thank you for your response. Re. Rootsweb, I was banned from there at the end of 2007 by an individual named Bullock, who I think is now retired, but I don’t usually retrace my steps. The episode of the nickname “Claire”, given to me by Vernade on Dna-forums, was a mistake and I am happy that all (I say “all”) is over. As for Anatole Klyosov, I have corresponded with him many times and our positions are clear. I of course respect all his work in many fields, but I think differently.

About “What do you mean by the "pathway of a haplogroup [isn’t] by chance"?”: I mean exactly that it could happen that a Jewish G2c or R1a1a etc. descends from a converted Khazar, or that an Italian R-L23/L150+ (as I am) descends from an Eastern slave of the Roman Empire (I of course don’t think so, but it cannot be excluded), but that it is very unlikely that

R1b1* YCAII=18-22 and 18-23
Mangino, intermediate between R1b1* and R-M269*
R-M269* YCAII=17-23
R-M269* DYS462=12
R-L23/L150-, beside many L150 like mine and many different haplotypes
R-L51+, 4% of all Italian males
R-P312* (2 out of 3 found all over the world by Rich Rocca amongst the 1000 Genomes Project)
R-U152* and subclades, with the highest percentage all over the world in Central-North Italy
etc. etc.

have come to Italy from elsewhere.

I think you are saying that the data you cite above means that R1b1 and R1b1a2 must have originated in Italy because of these unusual STR findings and the frequency and findings of L23* and L51* folks. Is that correct?

Is the point that since these are outliers the outliers are actually the critical pieces of data that show us what happened?

First, I am L226-, Z253+. Over 110 DYS loci, I have 27 differences from the modal. I have a real problem with the concept of GD. It treats all mutations as equivalent, whereas we have a 100:1+ range of mutation rates. (Aside: I once worked for a guy who gave "attaboys" and "aw craps" in response to good work or the opposite. On his scale 10 attaboys = 1 aw crap.) My point is that a mutation at 388, 395 or many of the tetra-motif DYS loci is worth 10 (or even more) at CDYa,b and other such fast mutators.
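The "attaboy" scale can be written down as a rate-weighted distance. The sketch below is one possible weighting, not an established method: each step is weighted by how slow its marker is relative to the fastest marker in the panel. The four markers, their rates and the repeat values are all hypothetical.

```python
def weighted_distance(a, b, rates):
    """Stepwise distance where each step is weighted by how slow its
    marker is relative to the fastest marker in the panel."""
    fastest = max(rates)
    return sum(abs(x - y) * (fastest / mu) for x, y, mu in zip(a, b, rates))

# Hypothetical 4-marker panel: two fast markers, two markers 10 times slower.
rates = [0.02, 0.02, 0.002, 0.002]    # assumed values, for illustration
modal = [36, 38, 12, 13]
fast_offs = [37, 39, 12, 13]          # two steps, both on fast markers
slow_offs = [36, 38, 13, 12]          # two steps, both on slow markers

print("fast-marker differences:", weighted_distance(modal, fast_offs, rates))
print("slow-marker differences:", weighted_distance(modal, slow_offs, rates))
```

Plain GD would score both haplotypes as 2. Under this weighting the one that differs on slow markers scores ten times higher, which is the distinction being argued for. Whether such a weighting gives better age estimates is a separate question the sketch does not answer.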

The other implicit idea is that my haplotype (I know it's modern) may have had many hidden mutations which mask its true age. I say this because the period of half of the apparent mutations I have ranges from about 30,000 to 60,000 years.

I have often used a slot machine as a model for the appearance of mutations, where the house programs the big payoff as a rare event. It's not unusual to get a rare mutation, but I assert it is unusual to have a string of them. There have to be a lot of pulls on the bar (read: meioses) before the rare ones start to accumulate.
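The slot-machine picture can be made quantitative with a Poisson approximation: how likely is a string of at least k rare mutations after a given number of pulls? The numbers below are assumptions chosen for illustration (20 slow markers, each at the 10^-4 order of magnitude mentioned earlier in the thread), not measured rates.

```python
import math

def prob_at_least(k, n_markers, mu, meioses):
    """P(at least k mutation events) among n_markers slow markers over
    `meioses` transmissions, using a Poisson approximation."""
    lam = n_markers * mu * meioses          # expected number of events
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - below

# Assumptions: 20 slow markers, rate 1e-4 per marker per meiosis.
for gens in (80, 400, 2000):
    print(gens, "meioses:", round(prob_at_least(4, 20, 1e-4, gens), 4))
```

Under these assumed numbers a string of four rare mutations is very unlikely within 80 meioses and becomes likely only after a couple of thousand, which is the intuition behind the argument. The conclusion is sensitive to how many slow markers are in the panel, since more markers means more chances per pull.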

I have explained my relationship to Turner. The others aren't under Z253 like I am in the FTDNA R-L21 group. I will play around with their haplotypes to see what I get.

My relationship to the Ian Cam of Clan Gregor is very distant, I believe. If you accept the multi-migration out of Iberia concept, my ancestor may have been in the first group? Re: the moderator, we may be from common stock c. 500 BC?

and when I compare my relative Giorgio Tognarelli (DK6NG) over 25 markers, many Mangums appear as the closest. My relative is, like me, R-L23+/L150+. What does this mean?

That an R-L21/predicted Z253 gets values similar to the haplotype of his ggggfather. This could mean many things:
1) that mutations go and come around a modal;
2) that these values, considered far from the “modal” (but it is the modal of the majority, and nothing guarantees it is the modal of the ancestor who had the SNP L21), could be the closest to the “modal” and all the others the furthest.

I'll check Mangum out. Mike mentioned him as being part of a cluster. You may ask how it could be possible that 393 = 12 might be the modal of R-L21. I have thought quite a bit about that and here is a summary of what might have occurred. Some of the older datasets that VV has assembled show 12 as the 393 modal, but it is usually called an "eastern", not European, haplotype. The point is that at one point in time 12 was the modal for some R1b haplotypes. However, these haplotypes are considered to be much earlier in the R1b tree than R-L21.

If we postulate that a bottleneck occurred when the modal was 12, and the population was decimated and the remaining people literally "headed for the hills", such as the Alps, the highlands of Scotland, wherever it was safe (from the flood, I assume), we may explain why 13 became prevalent when the population began to grow. Apparently, the remaining 12s stayed in the highlands while the 13s expanded back into the more fertile lowlands. The 13s then showed remarkable growth of their haplotype versus the 12s, due to the availability of food and nutrients. The question of why the 12s didn't grow after things got better still bothers me and needs some explanation. This is the best scenario I have come up with.

It is possible that what you say has happened, but to be sure we should reconstruct each value step by step. When I speak of mutations around the modal I mean that your DYS393 could be 12 for R-L23, 13 for R-L21, 12 for your haplotype and so on, and it is also possible that your haplotype has maintained the original 12. I think that many markers have had more than two or three mutations around the modal, for instance DYS391, which rotates around 10 and 11, except when some value goes off at a tangent. DYS393 isn’t as slow-mutating as DYS392. My value isn’t the 13 that is usual also for R-L23; mine is 12. I am sure that this mutation has happened in these last centuries, not more than five centuries ago, because a relative of mine, Giancarlo Tognoni (we have a common ancestor documented by paper trails in the 15th century), has 13.

I think you know this but Z253+ is downstream of L21, so you are more closely related to everyone within Z253+ than to anyone who is Z253- but is L21+. The same goes for everyone who is L21-. You are not closely related to them, so it is a waste of time to do comparisons with them.

The Irish III (L226+) and Irish IV guys are also Z253+.

Within the Z253 haplogroup, your haplotype does NOT appear exceptionally unusual. Having 393=12 is not that rare, generally speaking. Within R-L21 the frequencies I count are below, so there are 78 other people in L21 that are 393=12.
393=15 - 3
393=14 - 152
393=13 - 3239
393=12 - 79
393=11 - 1

However, I think it is of benefit to look at your whole haplotype so that one STR doesn't throw you off. If you don't like GDs I'd suggest that you also consider two other factors in context with GDs. Please consider your haplogroup, so there is no need to cross-check TMRCAs with someone who is Z253-. The other thing is to look for a common STR signature. I've identified an STR signature that I labeled 253-1223 that you fit into. I showed in a prior post the six people in it. Your GD at 67 to the modal for the 253-1223 variety is only 4.

However, I think there is a fair chance that three of the six people in your variety don't really fit. I recommend you get the other five guys to test for Z253 to see how it really shakes out.

Quote from: Ironroad

My point is that a mutation at 388, 395 and many of the tetra motif dys loci are worth 10 CDYa,b and other such fast mutators

I know this is not necessarily intuitive, but a mutation to any STR, no matter how slow, could have happened just yesterday (with our generation.) This is why it is valuable to look at many STRs and look at those many STRs across as many people as you can within your haplogroup.

There are people who win big on a slot machine pull. It probably won't happen to me, but it does happen. It is not uncommon at all for L21 people to have off-modal values at slow markers.

I know you are interested in DYS393. It is the 24th slowest marker of FTDNA's first 67 according to the published rates that hobbyists are using (that link you cited). I just checked the 4697 L21-confirmed 67-marker haplotypes that I have. 1100 of them are modal (matching L21) for all 24 of the slowest markers. This means about 3/4ths of L21 has at least one off-modal value on the slowest 24, and there are many with two or three.
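The counts in the paragraph above can be turned into shares directly. The last step in this sketch goes beyond what the post says: it assumes, purely for illustration, that the 24 slowest markers behave independently and alike, and asks what per-marker off-modal chance would produce the observed share.

```python
fully_modal = 1100      # haplotypes modal on all 24 slowest markers (from the post)
total = 4697            # L21-confirmed 67-marker haplotypes (from the post)
n_slow = 24

share_modal = fully_modal / total
share_off = 1 - share_modal

# Assumption: the 24 markers are independent and equally likely to be modal.
per_marker_modal = share_modal ** (1 / n_slow)

print(f"modal on all 24:        {share_modal:.3f}")
print(f"at least one off-modal: {share_off:.3f}")
print(f"implied off-modal chance per slow marker: {1 - per_marker_modal:.3f}")
```

The second figure is the "about 3/4ths" quoted above. The third is only as good as the independence assumption, which ignores that many haplotypes share off-modal values by descent.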

For what it's worth, McGregor, I looked at Mikew's L21Ext sheet and, selecting only Z253-tested haplotypes (n=32), what I see is that your closest 111-marker Z253 kits are at a GD of 29 up to 50 (TMRCA range of 1784 to 3075 years). Most of the markers on which you differ are in the 68-111 panel.

No one close to you. Outlier you are until others are found and are Z253 positive. Keep hunting. MJost

35981 is the GDtarg showing those from GD29 up to GD39. Kit/Name/HG/Varity/GD@111/Std1-25/439/459/DYS464/Std26-37/YCAII/CDY/Std38-67/425/413/Std68-111/TMRCA

Thanks for the research, MJost. I appreciate it. Most times when I do a TMRCA, I cannot use 111 and look at the full set of DYS loci.

On a separate thread I just reiterated to Mike my feeling about our present ability to estimate TMRCAs. I think the issue is hidden mutations: as time increases, so does the probability that two mutations occur at one locus. CDYa,b in the Clan Gregor Ian Cam is interesting because there are obviously multiple mutations at these DYS loci within the time frame of interest, and they can be inferred from the data (not necessarily hidden). Further, as I explained to Mike, in general these, at many DYS loci, are really hidden mutations that the Var/ASD model can't handle or compensate for.
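The effect of hidden mutations can be seen in a small simulation. This is a toy model, not the Var/ASD method: a single marker follows a symmetric stepwise walk (up or down one repeat per mutation), the rate of 0.01 is a made-up value for a fast marker, and the seed is fixed only so the run is repeatable.

```python
import random

def simulate_marker(generations, mu, rng):
    """Symmetric stepwise model: each generation the repeat count moves
    +1 or -1 with probability mu. Returns (actual events, net change)."""
    events, net = 0, 0
    for _ in range(generations):
        if rng.random() < mu:
            events += 1
            net += rng.choice((-1, 1))
    return events, net

rng = random.Random(0)                 # fixed seed for repeatability
runs = [simulate_marker(1000, 0.01, rng) for _ in range(200)]

mean_events = sum(e for e, _ in runs) / len(runs)
mean_visible = sum(abs(n) for _, n in runs) / len(runs)

print(f"actual mutation events per lineage: {mean_events:.1f}")
print(f"visible net steps from the start:   {mean_visible:.1f}")
```

The visible difference can never exceed the number of events and on average falls well below it once back mutations become common, so an age computed from visible steps alone comes out too young. How far real markers depart from this symmetric toy model is left open here.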

I have confidence in analyses of less than 1K years, such as the Ian Cam. I have also taken the modals of several of the clans which converge about 1100 AD to 1300 AD and extended them back in time another 1K years. Beyond this, I think all our estimates are too small. At the present time, I have no good way to get around this problem. JMHO.

Looks like you are not closely related to the Ian Cam because you are not a descendant of the founder of Clan Gregor. I quote Wikipedia: It is a common misconception that every person who bears a clan's name is a lineal descendant of the chiefs. Many clansmen although not related to the chief took the chief's surname as their own to show solidarity, or for basic protection, or for much needed sustenance.

I see no McGregors in Irish Type 4/Cont, which seems to be the equivalent of Z253+, L226-.

I am what is known as a partaker. My ancestors were adopted into the clan and assumed the name. It is very clear that I am not an Ian Cam descendant and neither is the clan moderator.

There is some "discussion" re: whether the name of the clan, "Gregor", not MacGregor or McGregor, is older than the Ian Cam. There are many Gregorys and McGregorys in the clan, and the name Gregory/Gregor may be older than the clan. The grandson of King Alpin was Giric and some feel that may be the source of the name. Others ascribe it to the many Popes who were named Gregory. I don't really know the real source of the name; the Ian Cam are adamant that it was first used when their line started. FWIW

The name will surely be older than the clan as a personal name. It is one of those cases where a surname evolved from a personal name. That couldn't happen if the personal name didn't exist earlier. Pope Gregory I (c. 540 – 604) was revered in England for his mission to the Anglo-Saxons. He was canonised, so no doubt some boys in England and Scotland were named after him because they were born on his feast-day.

There could have been a number of people who had the name McGregor because their father was called Gregor/Gregory. So not every McGregor will necessarily be related to each other. That could be another explanation for your surname, unless you know for sure that your family was adopted into the clan.

Either way it seems that you have reached a conclusion that you are not closely related to the Ian Cam, so comparisons with that lineage are not informative. It is really helpful that you now have an SNP fitting you into a subclade of L21 and a very interesting one too. I took a look at the surnames in the Irish IV/Continental group. It is very varied.

Re: the name, I think your points are well made, but when it comes to clan issues we are dealing with a lot of pride and every attempt is made to ascertain clan descendancy.

My little personal joke is that my ancestors welcomed the Scottis ashore, after they (my ancestors) had cleaned the Romans' clock at Mons Graupius.

To date I am only looking at measured Z253+ to ascertain relationships. But that brings Iberia in very quickly and reopens a lot of questions that many had assumed were closed.

Aside: I notice your emphasis on copper mining as important to defining early trade routes. Are you aware of the copper that was taken from Isle Royale, in the middle of Lake Superior, by prehistoric miners? This copper is called "native copper", because its only impurity was silver. It has been found all over North America. Who the miners were is not definitively known.

Ah! I see a couple of names from Spain in your group in the Z253 project. The offspring of Peninsular War soldiers maybe? Or could it go back to Bell Beaker? Or the Atlantic Bronze Age?

It is an interesting subclade overall. "Z253 was initially discovered in two anonymous participants of the 1000 Genomes Project, namely samples HG01136 (Colombian) and NA19717 (Mexican-American). Z253 has since been found in FTDNA samples with ancestry from England, France, Ireland, Norway, Scotland, Spain and Switzerland."

It seems clear that the tendency to mutate of DYS393 in R-L21 is the double forwards than backwards, i.e. every mutation from 13 to 12 there are 2 mutations from 13 to 14. Then we should conclude that there have been more back mutations from 12 to 13 than from 14 to 13, but, if we consider that there are 3 at 15 and 1 at 11, we should consider (if we may extrapolate, and it isn’t said) that the mutations forwards are 3 times than backwards.

We have these percentages:

393=15 – 3 = 0,086
393=14 – 152 = 4,375
393=13 – 3239 = 93,235
393=12 – 79 = 2,274
393=11 – 1 = 0,028

These are the percentages of the values tested by SMGF for this marker:

Alleles      Samples  Frequency  Percentage
8                  2    0,00006      0,006%
9                  2    0,00006      0,006%
10                13    0,00036      0,036%
11               100    0,00278      0,278%
12              3393    0,09440      9,440%
13             25512    0,70981     70,981%
14              5196    0,14457     14,457%
15              1627    0,04527      4,527%
16                94    0,00262      0,262%
17                 3    0,00008      0,008%
Grand Total    35942          1        100%

Of course the data aren’t homogeneous, because the people tested by SMGF belong to many haplogroups, but we can see a tendency, that the value of DYS393 was 13 also at the origin of our ancestor, and 13 is now at 70,981%.

Then probably on average, and made all the due caveats, the age of R-L21 is about 4,289 times less than that of BT. If we presuppose that BT is 70,000 years old, then R-L21 is 16,320.

Of course I have said that we need many caveats, but it seems that R-L21 cannot be so young like many are thinking.
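As a rough check on the SMGF figures quoted above, the frequencies can be recomputed from the raw sample counts. The sketch below is Python; the variable names are illustrative, and the counts are simply copied from the post, not re-verified against SMGF.

```python
# DYS393 allele counts as quoted in the post (not re-verified against SMGF).
smgf_counts = {
    8: 2, 9: 2, 10: 13, 11: 100, 12: 3393,
    13: 25512, 14: 5196, 15: 1627, 16: 94, 17: 3,
}

total = sum(smgf_counts.values())

# Percentage of samples carrying each allele value.
percentages = {allele: 100 * n / total for allele, n in smgf_counts.items()}

# The modal value is the allele with the largest count.
modal_allele = max(smgf_counts, key=smgf_counts.get)
off_modal_pct = 100 - percentages[modal_allele]

for allele, pct in sorted(percentages.items()):
    print(f"DYS393={allele}: {pct:.3f}%")
print(f"total={total}, modal={modal_allele}, off-modal={off_modal_pct:.3f}%")
```

The counts do sum to the quoted grand total, and the modal share matches the 70,981% in the table.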

I'm not trying to be vicious, but you are taking us steps backwards in arguing by exception (using only one STR and only one haplogroup) and by tremendously oversimplifying probability theory and population genetics statistical models.

It seems clear that the tendency to mutate of DYS393 in R-L21 is the double forwards than backwards, i.e. every mutation from 13 to 12 there are 2 mutations from 13 to 14.

Are you arguing that different haplogroups have different "expected" mutation rates? The ratio of presently FTDNA tested R-L21 393=14 to 393=12 is 1.9, or close to double. I don't think this is a useful number as we have no reason to think the L21 SNP mutation had any effect on DYS393 STR diversity. If so, why limit your sample size when you have many more haplogroups than just L21? The numbers you pulled from SMGF show a 1.5 ratio... which is about 25% less.
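For anyone who wants to reproduce the two ratios, here is a minimal Python sketch using only the counts quoted in this thread. The function name `up_down_ratio` is made up for illustration. On the unrounded counts the gap works out nearer 20% than 25%, so the quoted figure should be read as approximate.

```python
# DYS393 counts quoted in the thread: the R-L21 sample and the SMGF sample.
l21_counts = {11: 1, 12: 79, 13: 3239, 14: 152, 15: 3}
smgf_counts = {12: 3393, 13: 25512, 14: 5196}


def up_down_ratio(counts, modal=13):
    """Ratio of samples one step above the modal value to one step below."""
    return counts[modal + 1] / counts[modal - 1]


l21_ratio = up_down_ratio(l21_counts)    # 393=14 versus 393=12 within R-L21
smgf_ratio = up_down_ratio(smgf_counts)  # same comparison across all haplogroups

# How much lower the all-haplogroup ratio is, relative to the R-L21 ratio.
relative_drop = (l21_ratio - smgf_ratio) / l21_ratio

print(f"R-L21 ratio: {l21_ratio:.2f}")
print(f"SMGF ratio:  {smgf_ratio:.2f}")
print(f"Relative difference: {relative_drop:.0%}")
```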

... Of course the data aren’t homogeneous, because the people tested by SMGF belong to many haplogroups, but we can see a tendency, that the value of DYS393 was 13 also at the origin of our ancestor, and 13 is now at 70,981%. Then probably on average, and made all the due caveats, the age of R-L21 is about 4,289 times less than that of BT. If we presuppose that BT is 70,000 years old, then R-L21 is 16,320. Of course I have said that we need many caveats, but it seems that R-L21 cannot be so young like many are thinking.

I think I'm following you. 93.236% of all R-L21 in my sample is 393=13. 70.981% of everybody in the SMGF sample is 393=13. That means the R-L21 people who are off-modal are 6.764% and the all haplogroups off-modal total is 29.019%. If you divide 29.019 by 6.764 you get the ratio of about 4.289 to 1 which you then apply to the estimated age of BT.
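The arithmetic in that paragraph can be spelled out in a few lines of Python. This is only a sketch of the calculation as described in the thread, not an endorsement of it as a dating method; the 70,000-year age for BT is the assumption stated in the original post, and the variable names are illustrative.

```python
# Share of samples at the modal value DYS393=13, as quoted in the thread.
l21_modal_pct = 93.236   # R-L21 sample
smgf_modal_pct = 70.981  # SMGF sample, all haplogroups

# Off-modal share is whatever is not at the modal value.
l21_off_modal = 100 - l21_modal_pct
smgf_off_modal = 100 - smgf_modal_pct

# Ratio of off-modal shares, which the post applies directly to the age of BT.
off_modal_ratio = smgf_off_modal / l21_off_modal

bt_age_years = 70_000  # assumption taken from the original post
implied_l21_age = bt_age_years / off_modal_ratio

print(f"off-modal: R-L21 {l21_off_modal:.3f}%, SMGF {smgf_off_modal:.3f}%")
print(f"ratio: {off_modal_ratio:.3f}")
print(f"implied R-L21 age: {implied_l21_age:.0f} years")
```

The result lands at roughly 4.29 and a little over 16,300 years, in line with the figures quoted; the small differences come down to rounding versus truncating.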

Is this some kind of short-hand math for population variance estimation? Statistical models bring value by conducting many "experiments" across many STRs and many haplogroups. The value is in analyzing the whole of the data and to refine accuracy and define confidence ranges according to a documented model.

I'm not saying your concerns about back-mutations or erratic STRs are unreasonable, just that you aren't logically demonstrating there is a problem with thoughtfully applying STR diversity to time estimates, even if they aren't precise.

You seem to really like to argue by exception. Inferring by exception on top of assumptions is not necessarily conducive to useful conclusions.