2013 October 10

xkcd: Null Hypothesis

I have a number of my favorite xkcd comics on the bulletin board outside my office, including this one:

(Click on the image to go to the website to get the mouseover—I don’t have it on my bulletin board, but it is worth the effort of clicking.)

Today I had a student ask me to explain the joke, or more precisely, to explain what “the null hypothesis” was. I did so, of course, explaining that a p-value, which measures how likely a result is “by chance,” needs a formal definition of “chance”: the null model or null hypothesis. I even explained that all a statistical test does is allow you to reject (or not reject) the null hypothesis, and that it tells you nothing directly about the hypothesis you are actually interested in.

Given the casual nature of the question, I did not go into detail about how important it is to choose or construct good null models: ones that contain all the explanations other than the hypothesis you hope to test. I normally spend a full lecture on that in my bioinformatics course, as well as one of the weekly homework assignments, having the students program different null models for an open reading frame that result in very different p-values for a protein-coding gene detector.
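As a rough illustration of how much the choice of null model matters, here is a minimal sketch (not the actual course assignment). It assumes the simplest possible nulls: independent, identically distributed bases, with only the GC fraction differing between models. The "p-value" is just the chance that a fixed reading frame runs a given number of codons without hitting a stop. The function names and the GC values are hypothetical, chosen only for this example.

```python
# Hypothetical sketch: i.i.d.-base null models for "this reading frame
# stayed open purely by chance". Not the course's actual models.

STOP_CODONS = ("TAA", "TAG", "TGA")  # stop codons of the standard genetic code


def stop_codon_prob(gc):
    """Probability that a random codon is a stop codon, assuming independent
    bases with P(G) = P(C) = gc/2 and P(A) = P(T) = (1 - gc)/2."""
    at = (1.0 - gc) / 2.0
    cg = gc / 2.0
    base_prob = {"A": at, "T": at, "C": cg, "G": cg}
    return sum(
        base_prob[c[0]] * base_prob[c[1]] * base_prob[c[2]] for c in STOP_CODONS
    )


def open_frame_p_value(n_codons, gc=0.5):
    """Probability, under the null model above, that n_codons consecutive
    codons in one fixed reading frame contain no stop codon."""
    return (1.0 - stop_codon_prob(gc)) ** n_codons


# Same observation (a 100-codon open frame), two different null models.
for gc in (0.5, 0.7):  # assumed GC fractions, for illustration only
    print(f"GC = {gc:.1f}: P(no stop in 100 codons) = {open_frame_p_value(100, gc):.4g}")
```

Under these assumptions the same 100-codon open frame is more than ten times as likely "by chance" in the GC-rich null as in the uniform one, because stop codons are AT-rich. A more realistic null (for example, one that models codon or dinucleotide composition) could shift the p-value again.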

Normally, I enjoy this sort of conversation with students—I like students who are curious and who are unafraid to ask questions to clear up things that confuse them. Today I was a little disturbed by the question, as the student had been in my office to get a signature on an approval form for a senior thesis in bioengineering. How had the student gotten that far in our program without having learned what a null hypothesis is? Where is the hole in our curriculum that allows that, and how can we fix it in the curriculum redesign this year?

I realize that no curriculum design can completely cure the cram-and-forget disease that infects many college students, but I did not get the impression that this was a student who had known it once but forgotten. Rather, I had the impression that the concept was a new one, though the name might have appeared before.

On looking over the bioengineering curriculum, I see that students can take a probability course without any statistics course; perhaps that is what happened here. Unfortunately, biomolecular experimentalists have to be very familiar with null hypotheses and statistical tests, so I think we have to patch the curriculum to make sure that all the students get statistics.