Hypothesis Testing (Part 1)

Hypothesis testing is a very important part of Statistics, and its objectives and methodology are often misunderstood. First of all, let me tell you what hypothesis testing is (and then I'll tell you what it is not):

Hypothesis testing is a statistical technique that aims to evaluate a statement about a certain population parameter.

For instance, say you're studying the height of the students at your local community college. In particular, you are interested in saying something about the mean height \(\mu\) (measured in feet) of the population of students. Based on your previous research, or even your gut feelings, you may be convinced that the mean is \(\mu = 5.9\). In order to evaluate your claim, you can use hypothesis testing. (Be aware that hypothesis testing is not the only way you can assess a claim about a population parameter.)

Now, I'll tell you what hypothesis testing is not about:

- A way of estimating a parameter. (For estimating parameters there is a whole branch of Inferential Statistics called estimation.)

- A way of saying something categorically about a population parameter. (Not the case. In hypothesis testing there is always the possibility of errors. Sorry, no crystal balls here.)

Null and Alternative Hypothesis

There is a systematic way to approach hypothesis testing. The philosophy is very simple:

(1) You make a claim about a population parameter.

(2) Data are collected from the population in the form of a random sample, in such a way that the data collected are "representative" of the whole population.

(3) You analyze the results of the sample (you get the sample mean, sample standard deviation, etc.) and compile a neat table (not necessary but useful).

(4) Finally, the million dollar question: do the results from the sample seem to support what you're claiming about the parameter? If the results are completely out of line with your claim, that indicates that you may have to review your claim, or maybe even reject it. On the other hand, if the results of your sample are in tune with your claim, you may simply say: "It seems that my claim is correct, but I can't really be sure that it's true."

That's it. Those are the main principles. The rest are just accessories. Of course, all this requires a mathematical framework. In fact, we need to establish when you can say that your claim is "not in tune with the results of the sample".
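To make the four steps a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the population is simulated (5,000 made-up heights, not real data), and the helper name summarize_sample is hypothetical. It only shows the mechanics of drawing a random sample and summarizing it; it does not yet decide anything about the claim.

```python
import random
import statistics

def summarize_sample(population, n, seed=0):
    """Draw a simple random sample of size n and summarize it (steps 2 and 3)."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)  # sampling without replacement
    return {
        "n": n,
        "mean": statistics.mean(sample),
        "stdev": statistics.stdev(sample),  # sample standard deviation
    }

# ASSUMPTION: a simulated population of 5,000 heights in feet (not real data).
population_rng = random.Random(42)
population = [population_rng.gauss(5.6, 0.3) for _ in range(5000)]

claimed_mean = 5.6                               # step 1: the claim
summary = summarize_sample(population, n=100)    # steps 2 and 3
gap = summary["mean"] - claimed_mean             # step 4: how far is the sample from the claim?

# A small table of the results.
print(f"{'sample size':<22}{summary['n']:>8}")
print(f"{'sample mean':<22}{summary['mean']:>8.3f}")
print(f"{'sample std deviation':<22}{summary['stdev']:>8.3f}")
print(f"{'mean minus claim':<22}{gap:>8.3f}")
```

The last line of the table is the raw material for step (4): whether that gap is "small" or "large" is exactly what the mathematical framework has to settle.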

Example: Say that you claim that the population mean height of students at your college is \(\mu = 5.6\). Diligently, you obtain a random sample of 100 students, and you find that the sample mean is \(\overline{X} = 6.3\) (were they all basketball players, huh?). What do you think: do the sample data support your claim?

Well, it seems not. In fact, we know that the sample mean \(\overline{X}\) is a good estimate of the real population mean \(\mu\), especially if the sample size is large, like in this case. So, it would be reasonable to expect the true value of \(\mu\) to be around 6.3 (not exactly, but around). Considering all this, a claim that states that \(\mu = 5.6\) doesn't seem to be supported by the evidence.
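One rough way to put a number on "around" is to measure the gap between the sample mean and the claimed mean in units of the standard error \(s/\sqrt{n}\). The sketch below does that for the example. Note that the text does not give a sample standard deviation, so the value 0.5 feet used here is purely an assumption for illustration, and the function name standardized_gap is hypothetical.

```python
import math

def standardized_gap(sample_mean, claimed_mean, sample_sd, n):
    """Gap between sample mean and claimed mean, in standard-error units."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - claimed_mean) / standard_error

# From the example: claimed mean 5.6, sample mean 6.3, sample size 100.
# ASSUMPTION: sample standard deviation of 0.5 feet (not given in the text).
z = standardized_gap(sample_mean=6.3, claimed_mean=5.6, sample_sd=0.5, n=100)
print(round(z, 1))
```

Under that assumed standard deviation, the standard error is 0.05 feet, so a gap of 0.7 feet puts the sample mean about 14 standard errors away from the claim. That is why 6.3 does not look like something "around" 5.6.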