Thursday, January 30, 2014

Basic Econometrics: Tiptoe through the Type II Tulips

An ancient bit of American folk wit asks, "If you call a tail a leg, how many legs does a cow (dog, horse, pig) have?"

The answer is four. Calling a tail a leg doesn't make it one. No amount of regression analysis virtuosity is going to "predict" or "estimate" that phantom fifth leg at a statistically significant level.

So what I'm going to say about Munnell and Wu's analysis and Type II errors may seem redundant. And if it were just Munnell and Wu's report that was at stake, it would be. But, as I pointed out in my earlier post, I'm looking at a minor genre of do-older-workers-take-jobs-from-the-young lump-of-labor boilerplate. This crap proliferates like the heads of Hydra: cut one off and it grows two more. So think of what follows as "cauterizing the stumps."

Type I versus Type II errors
In hypothesis testing, a Type I error is made when the null hypothesis is rejected although it is true. A Type II error is made when the null hypothesis is accepted but is actually false. This gets rather convoluted: because the null hypothesis posits "no effect", rejecting it implies that an effect has probably been found. A Type I error thus finds an apparent effect that doesn't actually exist; a Type II error finds no effect even though there is one.

"Statistical significance" refers exclusively to a low probability of committing a Type I error. It has no bearing either on the size of the effect or on the probability of making a Type II error.
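To make the distinction concrete, here is a minimal simulation sketch. It has nothing to do with Munnell and Wu's data or model: the test (a two-sided z-test of "mean = 0" with known standard deviation 1), the sample size of 30, and the true effect of 0.2 are all assumptions chosen purely for illustration. It counts how often the null is rejected when it is true (the Type I error rate) and how often it is rejected when it is false (the power, whose complement is the Type II error rate).

```python
import math
import random


def rejection_rate(true_mean, n, trials=4000, z_crit=1.96, seed=0):
    """Share of simulated samples in which H0: mean = 0 is rejected.

    Hypothetical setup: n draws from a normal distribution with the
    given true mean and a known standard deviation of 1.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, 1.0) for _ in range(n)) / n
        z = sample_mean * math.sqrt(n)  # z statistic with known sigma = 1
        if abs(z) > z_crit:             # two-sided test at roughly the 5% level
            rejections += 1
    return rejections / trials


# Null is true: every rejection is a Type I error.
type_i_rate = rejection_rate(true_mean=0.0, n=30)

# Null is false (assumed true effect of 0.2): every non-rejection is a Type II error.
power = rejection_rate(true_mean=0.2, n=30)
type_ii_rate = 1.0 - power

print(f"Type I error rate:  {type_i_rate:.3f}")
print(f"Power:              {power:.3f}")
print(f"Type II error rate: {type_ii_rate:.3f}")
```

Under these assumed numbers the Type I rate sits near the nominal 5 percent, while the test misses the real effect most of the time. The significance level controls only the first of those two figures.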

Munnell and Wu reported the following finding with regard to the "crowding out" effect:

If crowding out were occurring, an increase in older persons’ employment would increase youth unemployment. However, the coefficient is negative and statistically insignificant, that is the increase in the employment rate for older people has no impact on youth unemployment. The second column presents the results for youth employment. Again, no sign of crowd out is evident. Instead, a 1-percentage-point increase in the employment rate for older people is associated with a 0.07 percentage points increase in youth employment. This finding strongly contradicts the crowd-out hypothesis.

How "strongly" does this finding contradict the crowd-out hypothesis? That depends on the statistical power of the test, which Munnell and Wu didn't report on and perhaps didn't even consider. In The Cult of Statistical Significance, Stephen Ziliak and Deirdre McCloskey reported the findings from their surveys comparing how statistical significance testing was used in papers published in the American Economic Review in the 1980s and the 1990s. Only 4.4% of papers published in the 1980s considered the statistical power of the test. By the 1990s, practice had improved somewhat -- to 8%!

Baroudi and Orlikowski give a succinct summary of the relationship between statistical power, Type II error and the reporting of statistically non-significant results:

Statistical power becomes particularly crucial to the interpretation of results in those cases where the null hypothesis is false; that is, when the phenomenon being investigated does exist. If the test reveals non-significant results in these circumstances, the usual response is to accept the null hypothesis and conclude that the effect being examined does not exist. Such a conclusion, however, would not be appropriate if the phenomenon actually exists but was undetected because the statistical test was not powerful enough. In such a case, a conclusion of "no effect" would be misleading; we would be generating a spuriously negative result - committing a Type II error.
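Baroudi and Orlikowski's point can be put in numbers with a short sketch. Again, this is not a reconstruction of anyone's regression: it assumes a simple two-sided z-test with known variance, an effect measured in standard-deviation units, and helper names (`power_two_sided_z`, `smallest_n_for_power`) that are made up for this illustration. It computes the power of the test for a given effect and sample size, and the smallest sample at which a chosen power target is reached.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def power_two_sided_z(effect, n, alpha=0.05):
    """Probability of rejecting H0: effect = 0 when the true effect is `effect`.

    Assumes a two-sided z-test with known unit variance and n observations.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect * math.sqrt(n)  # mean of the z statistic under the alternative
    upper_tail = 1 - STD_NORMAL.cdf(z_crit - shift)
    lower_tail = STD_NORMAL.cdf(-z_crit - shift)
    return upper_tail + lower_tail


def smallest_n_for_power(effect, target=0.8, alpha=0.05, n_max=100_000):
    """Smallest n (up to n_max) whose power reaches `target`, else None."""
    for n in range(2, n_max + 1):
        if power_two_sided_z(effect, n, alpha) >= target:
            return n
    return None


# Hypothetical effect of 0.2 standard deviations at several sample sizes.
for n in (30, 100, 300):
    print(n, round(power_two_sided_z(0.2, n), 3))

print("n needed for 80% power:", smallest_n_for_power(0.2))
```

A non-significant result from a test with low power says very little: the same real effect would usually go undetected. Only when power is high does "we found nothing" begin to count as evidence that there is nothing to find.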

Was Munnell and Wu's statistical test powerful enough to conclude, as they did, that their statistically insignificant finding "strongly contradicts the crowd-out hypothesis"? They don't say. But a cursory glance at a scatter-plot of the 55-64 year old employment rate and the 20-24 year old unemployment rate strongly suggests it wouldn't matter anyway. There's nothing to see there. As Thomas Pynchon wrote in Gravity's Rainbow, "If they can get you asking the wrong questions, they don't have to worry about answers."

The crowding-out hypothesis is a red herring. Plain and simple. This pseudo-econometric farce is about fabricating a rationale for raising the pension-eligibility age.