"[...] Given the fact that the nil hypothesis is always false, the rate of Type-I errors is 0%, not 5%, and [...] only Type-II errors can be made, which run typically at about 50% [...] [T]ypically, the sample effect size necessary for significance is notably larger than the actual population effect size and [...] the average of the statistically significant effects is much larger than the actual effect size. The result is that people who do focus on effect sizes end up with a substantial positive bias in their effect size estimation. Furthermore, there is the irony that the "sophisticates" who use procedures to adjust their alpha error for multiple tests (using Bonferroni, Newman-Keuls, etc.) are adjusting for a nonexistent alpha error, thus reduce their power, and, if lucky enough to get a significant result, only end up grossly overestimating the population effect size!"

J. Cohen, "The Earth is round (p < .05)".

A short discussion is perhaps in order.

The text above refers to null hypothesis significance tests in psychology. The author explains that when one tests for a correlation between two variables (take them to be, for example, a country's number of Nobel prize recipients and its chocolate consumption), the nil hypothesis is that the correlation is exactly zero, while in fact a small, maybe minuscule, but nonzero correlation is bound to exist anyway. So the type-I error rate, the rate at which you mistakenly find a significant correlation with your test, is strictly zero, not 5% (for a test at the 95% confidence level).
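To put rough numbers on this, here is a minimal sketch of how often a two-sided test rejects "correlation = 0" when the true correlation is tiny but nonzero. It relies on the standard Fisher z approximation, and the function name, the value rho = 0.001 and the sample sizes are my own illustrative assumptions, not anything taken from Cohen's paper.

```python
from math import atanh, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sigma 1


def rejection_probability(rho, n, alpha=0.05):
    """Approximate probability that a two-sided test of rho = 0 rejects.

    Uses the Fisher z approximation: atanh(r) is roughly Normal with
    mean atanh(rho) and variance 1 / (n - 3). Hypothetical helper.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = atanh(rho) * sqrt(n - 3)  # true effect in units of the standard error
    upper_tail = 1 - STD_NORMAL.cdf(z_crit - shift)
    lower_tail = STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail


# Assumed tiny true correlation; every rejection here is a correct one.
for n in (100, 10_000, 1_000_000, 100_000_000):
    print(n, round(rejection_probability(0.001, n), 3))
```

With a true correlation of exactly zero the function returns alpha, as it should. For any nonzero value the rejection probability climbs toward one as the sample grows, and none of those rejections is a Type-I error.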

The other part of the text is highly amusing to me, because it applies well to particle physics. While in particle physics the null hypothesis is usually true, when it is in fact false a significant bias is usually present in the size of the cross section that is measured for a new state. That is nothing but the bias discussed above. Finally, the "Bonferroni correction" mentioned in the text is the correction for the "trials factor", aka the look-elsewhere effect, which accounts for the multiplicity of places where you looked for an effect. In particle physics this is strictly needed, but when studying tiny correlations, given that the nil hypothesis is strictly false, the type-I error rate is zero, so one does not need to correct at all! Fascinating stuff.
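The bias on the measured size of a real effect is easy to reproduce with a toy simulation. The sketch below is only an illustration under my own assumptions: Gaussian measurement errors, a true effect of 1.96 standard errors (which gives the roughly 50% power Cohen quotes), and a made-up function name. It is not a model of any real analysis.

```python
import random
from statistics import NormalDist, fmean


def significance_filter(true_effect, std_error=1.0, alpha=0.05,
                        n_experiments=200_000, seed=1):
    """Simulate repeated measurements of one real effect (hypothetical helper).

    Returns the fraction of significant results, the mean of all estimates,
    and the mean of the significant estimates only.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Each experiment measures the true effect with Gaussian noise.
    estimates = [rng.gauss(true_effect, std_error) for _ in range(n_experiments)]
    # Keep only the experiments that pass the two-sided significance cut.
    # (fmean below would fail if none passed; not an issue at ~50% power.)
    significant = [x for x in estimates if abs(x) / std_error > z_crit]
    power = len(significant) / n_experiments
    return power, fmean(estimates), fmean(significant)


# Assumed true effect of 1.96 standard errors, i.e. about 50% power.
power, mean_all, mean_significant = significance_filter(true_effect=1.96)
print(f"fraction significant: {power:.2f}")
print(f"mean of all estimates: {mean_all:.2f}")
print(f"mean of significant estimates: {mean_significant:.2f}")
```

Averaged over all experiments the estimate is unbiased, but the subset that clears the significance cut overestimates the true effect. This is the same mechanism that inflates the measured cross section of a new state found near the discovery threshold.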

Comments

You can say exactly the same for Bayesian priors. People don't read Bayes much these days, still less appreciate his Theorem, which gives the inverse relation between the strength of the hypothesis given the evidence, and the power of the evidence given the hypothesis. That's right, the only rational Bayesian prior is just the Type II error....

It would help a bit if we could wake up to the existence of Inverse Problems like this, but that means dethroning Newton the Hero, because he didn't solve the inverse problem in gravitation.

Cohen's statement shows a common fallacy. In reality, statistical significance, and hence incidence of Type I or Type II errors, has nothing to do with the truth or falsity of the null hypothesis (or of the research hypothesis). Rejecting the null at the 95% level (for instance) only means that, if the null were true and you repeated the experiment 100 times with 100 independent samples, you would expect to reject the null hypothesis approximately 5 times. Despite what beginning research students often think, no statistic in a test can be interpreted as "the probability that the null hypothesis is true." The truth/falsity of the hypothesis (to a certain number of significant digits) is not a matter of probability. At least in the macroscopic world, the hypothesis either is true or isn't.

I never quite know if making this distinction is important when teaching undergraduates (that 'it's the probability of a type I error' vs 'it's the proportion of times you expect to incorrectly reject the null hypothesis'), or if it just adds to the confusion.