...I would like to emphasize one other aspect of the Goldstein derivation, in which he states that each DYS locus can be used to infer the TMRCA but in practice several are used and averaged. Note: I do not believe this calculation can be made using Ken's approach, since he uses averages of mutation rates? ...

If you are critiquing Ken Nordtvedt's TMRCA methodology you should probably read his web site documentation and understand his spreadsheet. http://knordtvedt.home.bresnan.net/ You can also get direct answers from him on the Rootsweb Hg I forum. He'll answer, particularly if you have a critique.

I've seen where Anatole Klyosov uses an average rate across a set of markers. Nordtvedt aggregates STRs into a summary TMRCA but he does call them individual experiments and he does use the individual STR mutation rates in his spreadsheet formulas. He has a column for each STR. Anyway, I don't think this is averaging the rates together in the sense that you mean, but I'm not sure what you mean. I think when you get down to the specifics you have to talk about the details of the formulas.

I left my statement re: Ken's approach as a question, since I haven't looked over his work in quite a while. If he uses individual DYS locus rates, then his approach should be amenable to the same SD calculation. My major point was that if the rates of the loci are similar, then the estimates are closer and the SD is smaller.
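To make the SD point concrete, here is a minimal sketch of a per-locus calculation. It assumes the simple stepwise picture in which each locus gives its own estimate as variance divided by that locus's mutation rate. The function name and every number are made up for illustration; this is not Goldstein's or Ken's actual spreadsheet.

```python
from statistics import mean, stdev

def per_locus_tmrca(variances, rates):
    """One TMRCA estimate (in generations) per locus: variance / mutation rate."""
    return [v / mu for v, mu in zip(variances, rates)]

# Hypothetical per-locus variances and mutation rates (not real data).
variances = [0.18, 0.26, 0.33]
rates = [0.0020, 0.0025, 0.0030]

estimates = per_locus_tmrca(variances, rates)
print(estimates)         # one estimate per locus
print(mean(estimates))   # the averaged TMRCA, in generations
print(stdev(estimates))  # spread between the per-locus estimates
```

Any method that keeps a separate rate per STR can produce this kind of spread, which is all the SD calculation needs.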

Okay, so you are not critiquing Ken's methodology then, because you haven't read his work for quite a while. Since you are mentioning him by name and using hypotheticals like "if he uses", to be fair to him why don't you challenge him directly? If you feel uncomfortable, craft a set of very specific questions and I'll ask them on the Hg I Rootsweb forum so he will answer. That way the questions are somewhat anonymous from your perspective. The guy is good with math, so I doubt he hasn't spent a lot of time on the issues related to this.

I'm just cataloging this from the Busby thread since Busby did an analysis of the linear duration of STRs, which somewhat questions the concept, but then seems to rely on them (STRs) to make their case about various forms of R1b in Europe.

I also agree that STR evaluation is useful. I just think that using limited numbers like 10 or 15 is not enough. That's what I see when I do my own comparisons on hundreds of long haplotypes anyway. I also think Busby's application of STRs does not match their own linear duration standards. That is an attack, but perhaps I just don't understand. Can you explain?

You are right; they showed that there is a significant effect of microsatellite choice on age estimates, and they should have used that finding when calculating the TMRCA of the R-S127 haplogroup in figure 4a. However, in figure 2 they did not calculate TMRCA in generations but explored the bootstrapped variance, and in fact they do not seem to think that variance is affected by the choice of STRs, which is why they used 10 STRs in figure 2. In a nutshell, they showed that microsatellite choice can have an effect on age estimates, but still used a combined set of 10 STRs to explore variance. Perhaps they think one should choose the STRs for a TMRCA calculation based on similarity of mutation rates and the presumed time span for common ancestry, i.e. use the average mut/marker for the slowest or fastest STRs depending on the presumed TMRCA rather than the average mut/marker for the whole set, but use the combined set of STRs if you want to calculate variance.

This is where I get confused about Busby's theme. I don't really understand which methods they think are best, but at least I see they value STR diversity in their analyses, just using different techniques I guess.

This is always a contentious issue. I think STR diversity is useful. There are challenges and they must be considered in context.

In my opinion, people are fine with it until it disagrees with their theory, then they must shoot it down rather than adjust their theory. To me it is just another data point, and unfortunately we are in dire need of those.

Anyway, let's discuss this topic here so we don't have to argue the points over and over again in other topics, drowning them out.

As always seems to happen, we have strayed from your original question and can't see the forest for the trees (or something like that). My observations have been little discussed. My major point in answering you is that I do not believe most Y-STR DYS loci follow a drunkard's walk model, which is mathematically equivalent to using ASD/variance to describe the process. I know that Nordtvedt is using variance, but my reference for that derivation has been Goldstein et al. (who, by the way, heads up the human genome lab at Duke Univ.). I believe, based on analyzing the data set I referenced, that his model does not match the data. I'm not throwing rocks at anyone; he had no data! 1. No distribution of allele values around the modal for the set of DYS loci. 2. No knowledge of multisteps. When you include these factors I have to conclude that the model doesn't work.

Additionally, the data also suggest that if many of the DYS loci mutate away from their modal, then the most probable next mutation is back to the modal, because except for the 5% of multisteps, there aren't any entries with values beyond +/-1 of the modal.

So to bluntly answer your original question, I would say that diversity isn't meaningful, since it's masked by hidden mutations, which makes the time look shorter: we count fewer mutations than really occurred, and I don't think ASD/variance can handle that. (Note: the original statement re ASD/Var compensating for hidden mutations was based on the drunkard's walk model, where the distance from the modal increases with time and the squaring of the difference between the modal and the present value does compensate for back mutations.)
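A small simulation can show what is at stake between the two pictures. This is only a sketch: the "bounded" rule (any step that would leave modal +/-1 becomes a step back to the modal) is my own hypothetical way of encoding the claim, and the mutation rate is deliberately exaggerated so the run stays short.

```python
import random

def simulate_asd(generations, rate, lineages, bounded, seed=1):
    """Average squared distance (ASD) from the founding value 0 across
    independent lineages after single-step mutation."""
    rng = random.Random(seed)
    total = 0
    for _ in range(lineages):
        allele = 0  # offset from the modal value
        for _ in range(generations):
            if rng.random() < rate:
                step = rng.choice((-1, 1))
                # Hypothetical bounding rule: a step that would leave
                # modal +/-1 becomes a step back to the modal instead.
                if bounded and abs(allele + step) > 1:
                    step = -allele
                allele += step
        total += allele * allele
    return total / lineages

# Exaggerated, made-up rate so that the effect shows up in a short run.
for generations in (100, 400):
    free = simulate_asd(generations, 0.01, 1000, bounded=False)
    capped = simulate_asd(generations, 0.01, 1000, bounded=True)
    print(generations, round(free, 2), round(capped, 2))
```

In the unbounded walk the ASD keeps growing roughly in proportion to rate times generations, while in the bounded version it can never exceed 1. Which of the two real loci resemble is exactly the open question here.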

I would be very interested in seeing some data from existing R1b data sets re: STR locus distributions around the modal. I simply don't have the math tools to extract that from the datasets myself.

My understanding of the explanation is that their mathematical model does not care about hidden mutations or even multi-step mutations. The mutation rates were derived based on visible mutations so, as long as they have adequate data to build the mutation rates, the way the TMRCA method uses them is consistent. We should not think of the published mutation rate as literally the physical rate of change per the STR, but rather the observable rate of change.

What is required is that the STRs act somewhat consistently; in other words, the expected (predicted) rates up and down should be the same, the rates shouldn't change given the allele value, etc. This is where the concern about STRs reaching saturation and high allele values comes into play. If an STR doesn't show linear duration (of its rate) during the timeframe we care about, then it is not helpful. The goal of the math model is to include STRs that are linear or "on average" (in aggregate) linear.

... My major point in answering you is that I do not believe most Y-STR DYS loci follow a drunkard's walk model, which is mathematically equivalent to using ASD/variance to describe the process. I know that Nordtvedt is using variance, but my reference for that derivation has been Goldstein et al. (who, by the way, heads up the human genome lab at Duke Univ.). I believe, based on analyzing the data set I referenced, that his model does not match the data. I'm not throwing rocks at anyone; he had no data! 1. No distribution of allele values around the modal for the set of DYS loci. 2. No knowledge of multisteps. When you include these factors I have to conclude that the model doesn't work...

I haven't read Goldstein's report. Would you mind posting it again?

All I can say is that it is apparent when looking at R1b haplogroup haplotypes... real ones, lots of them and long ones ... that STR diversity generally increases with haplogroups that are bigger (older) branches on the Y DNA tree. In other words, it actually happens that STR variance is higher for haplogroups that the SNP-based Y DNA tree says are older. This is observable, not hypothetical. Please check reply #72 in this thread and around it. I've done this for pretty much all of R-L11. It works nicely.

Is STR variance precise? No, but folks like Nordtvedt take great pains to produce confidence ranges that you can use, and use advanced techniques like interclade comparisons to improve precision.

Academics and testing companies also use STR diversity and have been for a long time.

I know you are aware of Marko Heinila's TMRCA method. He said it is NOT ASD/variance based, so that might alleviate your fears. He calls it a "maximum likelihood" method, which I believe is especially well suited for back or multi-step mutations... but it matters little. Marko comes up with TMRCAs for the R1b haplogroups that are similar to what Nordtvedt's method gives.

Are all STRs good in terms of their linearity with time? No, surely not. The multi-copy ones aren't very linear at all. Some of the faster ones, or at least the high allele value ones may not be reliable either.

Is it possible that some samples of haplotypes are biased by a particular group? Sure, that is what the "resampling" thing is all about in the Busby and Myres work. However, this is primarily an intraclade problem. Nordtvedt's interclade approach can reduce or eliminate those biases significantly.

Maybe the mutation rates are all wrong, but I don't think anyone can effectively argue that most of FTDNA's STRs don't accumulate variance with time. It's also intuitive, if you consider that most of these STRs are single steppers per event and you overlay that on to the family structure (tree).

Thanks Mike.

What about using a Poisson distribution process to help gauge how many hidden mutations are accumulated over time? For example, Let's say the average observable genetic distance between any two L11+'s is 20. Poisson should show us how many should be the average at x point in time. Maybe 30 at 6000 years, 40 at 8000, or only a small increase.

I ran a simple Poisson distribution with Excel using an average mutation rate of .0023 and average generation time of 30 years/G over 49 markers. This is to see how many mutation events can be expected in x time between two haplotypes.

For 67 generations or 2000 years, I get 7 mutations with the probability mass function. At 10,000 years, 37 mutations with the same.

This hypothetically includes hidden mutations. Many L11 members are 20+ away from others in observable mutations, so approximately 37 on average when including back or multi-step mutations might not be far off. However, this is still a simple model for what we are trying to answer, and the SNP L11 is probably closer to 2,000 than 10,000 years old.
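For anyone who wants to redo the Excel numbers, here is a minimal sketch of the same Poisson arithmetic using the inputs quoted above (.0023 per marker per generation, 49 markers, 30 years per generation). The function names are my own, and the "most likely count" is simply the mode of the Poisson distribution.

```python
import math

def expected_mutations(rate, markers, years, years_per_generation=30):
    """Poisson mean: rate per marker per generation x markers x generations."""
    return rate * markers * years / years_per_generation

def poisson_pmf(k, lam):
    """Probability of exactly k mutation events when the mean is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def most_likely_count(lam):
    """Mode of a Poisson distribution (for a non-integer mean)."""
    return math.floor(lam)

for years in (2000, 10000):
    lam = expected_mutations(0.0023, 49, years)
    k = most_likely_count(lam)
    print(years, round(lam, 2), k, round(poisson_pmf(k, lam), 3))
```

With these inputs the most likely counts come out at 7 and 37, matching the figures in the post. Note that this counts events along a single line of descent; if the time is measured back to a common ancestor, two haplotypes are separated by two such lines, so the expected count between them would be roughly double.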

I didn't see the term multisteps discussed by John? I do note that when he refers to compensation for hidden mutations he is making reference to DYS loci that behave like a drunkard's walk model and are unbounded. His comment about the linearity of a DYS locus is appropriate, and I believe the number of mutations is undercounted because of the boundedness of many of the DYS loci.

I like his presentation of the Zhiv problem and how they found a constant fudge factor to compensate for some unknown factor in the mutational process. I happen to believe the unknown factor is real and is related to the hidden mutation issue.

You don't need the fudge factor if you can intelligently count mutations; when you can't, then maybe it is the best option when you're trying to infer large TMRCAs.

... What about using a Poisson distribution process to help gauge how many hidden mutations are accumulated over time? For example, Let's say the average observable genetic distance between any two L11+'s is 20. Poisson should show us how many should be the average at x point in time. Maybe 30 at 6000 years, 40 at 8000, or only a small increase.

I don't know the statistics well enough to comment on the advantages or disadvantages. I know the "Maximum Likelihood" method that Marko Heinila uses can be applied to a Poisson distribution, but I don't have any details on Marko's formulas. He might have them posted somewhere.

John Chandler would probably comment if you post this on Rootsweb GENEALOGY-DNA.

The Goldstein/Stumpf paper is from Science, Vol. 291, 2 March 2001.

I would expect diversity of a set of haplotypes to increase with time. As time elapses, more of the slower mutations occur, which have a very small probability of reoccurring. I think the medium-rate loci (mostly tetra motif) go in and out randomly as they mutate around the modal?

Marko traces/uses apparent mutations, as does Ken.

I am arguing that most tetra motif DYS loci don't accumulate variance with time. An increase in variance requires an unbounded model. I don't see that in the small amount of data I have looked at?

What tetra STR markers out of FTDNA's first 67 should be eliminated? Please provide the list. It should be easy to run a couple of comparisons. Maybe this will line up with Marko Heinila's linear duration analysis, in which case the "36 linear" markers that I use will be appropriate.

The set you probably should use depends on the time frame of interest. This was Busby's observation, but not his practice, if I read his paper correctly. It's a probability issue. For independent events, as mutations are, the probability of two mutations at a locus is equal to the P(1) mutation squared. I don't have a good rule for picking. I observe that, whatever their rates are, CDYa,b can have more than one mutation per entry in a relatively short time, hundreds of years. Maybe you can scale from their rate to estimate which DYS loci have a low probability of two mutations in 1K years and so on?

When I say bounded I mean that (excepting multisteps), the mutational process at a dys loci is bounded/confined to modal +/-1.
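The suggestion to scale from a marker's rate can be sketched as follows. This assumes mutations at a locus arrive as a Poisson process, which is a modelling assumption and not something established in this thread, and both rates below are hypothetical placeholders rather than published values for any marker.

```python
import math

def prob_two_or_more(rate, generations):
    """Chance of at least two mutation events at one locus, assuming a
    Poisson process with mean rate * generations."""
    lam = rate * generations
    return 1.0 - math.exp(-lam) * (1.0 + lam)

generations = 1000 / 30  # about 1K years at 30 years per generation

# Hypothetical rates per generation: a medium marker and a fast one.
for label, rate in (("medium", 0.002), ("fast", 0.03)):
    print(label, round(prob_two_or_more(rate, generations), 4))
```

A cutoff on this probability would be one way to pick markers for a given time frame, though where to put the cutoff is a judgment call.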

Our findings suggest that Y chromosome STRs of increased repeat unit size have a lower rate of evolution, which has significant relevance in population genetic and evolutionary studies.

Principal Findings: In order to study the evolutionary dynamics of STRs according to repeat unit size, we analysed variation at 24 Y chromosome repeat loci: 1 tri-, 14 tetra-, 7 penta-, and 2 hexanucleotide loci. According to our results, penta- and hexanucleotide repeats have approximately two times lower repeat variance and diversity than tri- and tetranucleotide repeats, indicating that their mutation rate is about half of that of tri- and tetranucleotide repeats.

I ask you for some detail so I can modify my variance calculations and look at STRs you think are appropriate. I'm volunteering to do this for you. I don't really think this effort is going to lead to anything, but I'm willing to test your argument with real data like I have on Marko's "linear markers" or Ken's idea of "more markers is better except multi-copy, etc."

What tetra STR markers out of FTDNA's first 67 should be eliminated? Please provide the list. It should be easy to run a couple of comparisons. Maybe this will line up with Marko Heinila's linear duration analysis in which case the "36 linear" markers that I use will be appropriate.

Below is your answer. My request to help you is simple but you are not helping me help you.

The set you probably should use depends on the time frame of interest. This was Busby's observation, but not his practice, if I read his paper correctly. It's a probability issue. For independent events, as mutations are, the probability of two mutations at a locus is equal to the P(1) mutation squared. I don't have a good rule for picking. I observe that, whatever their rates are, CDYa,b can have more than one mutation per entry in a relatively short time, hundreds of years. Maybe you can scale from their rate to estimate which DYS loci have a low probability of two mutations in 1K years and so on?

When I say bounded I mean that (excepting multisteps), the mutational process at a dys loci is bounded/confined to modal +/-1.

You don't have to agree with the results, but please provide specifics on your argument so it can be tested in some manner.

I think we've gone over this, but CDYa,b are multi-copy markers and no one that I know of uses them in TMRCA calculations. They are already excluded from the argument. I exclude DYS385, YCAII, DYS464, DYS459, DYS413, DYS395s1, DYS425 (possible null), DYS439 (possible null) in any of my STR variance calculations. I do include those on straight GD calculations using modified infinite allele techniques.

I have played with adding and subtracting STRs and comparing relative variance across haplogroups. I've done this more systematically with the linearity estimates Marko Heinila has provided. I can tell you, it doesn't make much difference as long as you get enough STRs (individual experiments) going. The benefits of the law of large numbers seem to apply.

I am not going to do extra research and gyrations unless you can be specific on what you want to test and do your own homework. Do you want to improve the processes? Or do you just not like the answers?
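Here is a toy illustration of the law-of-large-numbers point, treating each STR as an independent noisy experiment. The true value, the noise level and the Gaussian noise model are all invented for the sketch; real STRs are neither Gaussian nor equally noisy.

```python
import random
from statistics import mean, stdev

def spread_of_average(n_markers, trials=1000, true_value=100.0,
                      noise_sd=40.0, seed=7):
    """How much the averaged estimate wobbles from trial to trial when
    it is built from n_markers independent noisy per-marker estimates."""
    rng = random.Random(seed)
    averages = [
        mean(rng.gauss(true_value, noise_sd) for _ in range(n_markers))
        for _ in range(trials)
    ]
    return stdev(averages)

for n in (4, 9, 36):
    print(n, round(spread_of_average(n), 2))
```

The spread of the average shrinks roughly with the square root of the number of markers, which is why swapping a few STRs in or out matters less once the set is large.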

Our findings suggest that Y chromosome STRs of increased repeat unit size have a lower rate of evolution, which has significant relevance in population genetic and evolutionary studies. ...

Umm... this is making a little more sense to me in terms of the academic back and forth.

"Decreased Rate of Evolution in Y Chromosome STR Loci of Increased Size of the Repeat Unit" by Jarve also includes Zhivotovsky as an author. Zhivotovsky is the guy who gets his name hung on as the label for the famous (or infamous) evolutionary mutation rates. I should go try to find Nordtvedt's Rootsweb posts. He really just plain calls the Zhivotovsky evolutionary rates bad science. That's another side discussion, but it would make sense that given criticism, Zhivotovsky would need to go out and find some bad STRs to help support what some people call his times 3 fudge factor.

Nevertheless, some STRs probably do behave non-linearly outside of certain time ranges. Marko Heinila addressed this with a statistical analysis across tens of thousands of haplotypes. Don't ask me about his method. He's way beyond me. It seemed logical when he presented it on the "TMRCA report" thread (Aug 2011) on DNA forums. I don't remember any arguments against his methods.

Here were all the markers where "timeframe for each locus where saturation effects are relatively insignificant" were greater than 5000 years. I don't use the multi-copy markers, even if he included them.

In my "36 linear" marker set I'm not using the ones at the bottom, like DYS572. I'm only using STRs with timeframes greater than 7000 years (to cover the Neolithic time.) As I've said, I don't use any multi-copy markers.

I can't answer many of your queries. I think it is important first to agree, or disagree, on my premise that many of the dys loci (medium rate) are limited/bounded. I've provided a dataset that suggests they are, but I think we need more data.

A prior paper by Goldstein, referenced in Busby, gives a linearity equation. That's what Busby used. I don't know what range of values for each STR Marko used. If he didn't recognize the problem with multisteps, I would question his definition of linearity.

I'm not asking you to run any test cases yet, since I don't know how to specify what you are asking. If someone who is much cleverer with S/W than I am could create some distribution tables, then we can evaluate that data and determine the next step.

I know Ken's opinion of Zhiv. That said, a lot of folks who are knowledgeable are, as you know, supportive of his approach. What I'm trying to do is come up with an understanding of why he had to fudge the data sets referenced by Chandler. I don't think we are chasing ghosts here.

I appreciate all the attention you've paid to my comments. I am limited in what guidance I can provide.

Perhaps it would be good to know what methodology he used, because he gets a linearity that is three to four times greater than the previously observed linearity based on the Busby et al. (2011) study.

There are some STRs, such as DYS439, DYS635, DYS456, DYS389I, DYS389II, DYS458 and Y-GATA-H4, that I couldn’t find above. Others, such as DYS448, do not differ by much (i.e. Busby et al. 25381 ybp vs. 35579), and DYS393 gets 5648 ybp in Busby et al. vs. 9512 ybp (above). The exception would be DYS390, which gets 9211 ybp in Busby et al. vs. 7178 ybp (above). The main point here is that out of 7 STRs that overlap in both cases, 6 have their linearity inflated. What’s worse is that STRs such as DYS437, DYS19 and DYS393, which are being used as “most linear” because they have a linearity of more than 7000 ybp, actually show a linearity that is well below 7000 ybp.

Again, I don’t know how that person came up with those numbers. I know how Busby et al. came up with theirs, which was based on the observed range of alleles at each locus and the mutation rates measured in father-son pairs.
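For reference, the three pairs of figures quoted in this post can be put side by side with a few lines of code. Only the numbers given in the post are used; the other overlapping markers are not quoted here, so this says nothing about them.

```python
# (Busby et al. value, value from the list quoted earlier), both in years
# before present, copied from the three markers named in the post.
linearity_ybp = {
    "DYS448": (25381, 35579),
    "DYS393": (5648, 9512),
    "DYS390": (9211, 7178),
}

def ratio_to_busby(values):
    """Listed linearity divided by the Busby et al. figure, per marker."""
    return {marker: listed / busby for marker, (busby, listed) in values.items()}

ratios = ratio_to_busby(linearity_ybp)
for marker, ratio in ratios.items():
    print(f"{marker}: {ratio:.2f}x the Busby et al. figure")
```

For these three markers the ratios fall between roughly 0.8 and 1.7, so the larger multiples would have to come from the markers whose figures are not quoted in the post.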

.... I don't know what range of values for each STR Markko used. If he didn't recognize the problem with multisteps, I would question his definition of linearity. ...

You were the one who referred me to Marko Heinila. I was not familiar with him until you put me in contact with him. He definitely recognizes and tries to account for back-mutations and multi-step mutations. It is my understanding that is why he chose to use the "maximum likelihood" method.

His definition of linearity is very clear.

Quote from: Marko Heinila

timeframe for each locus where saturation effects are relatively insignificant

I am beginning to think you won't accept anything that does not fit your preset theories on Doggerland or on various clans. If that is the actual basis for your disagreement, that is fine - just say so.