When group means are compared in a pairwise fashion,
the results may turn out to be at odds with "common
sense."

In your math and geometry classes, you learned that
if A is larger than B, and B is larger than C, then it follows that A
is larger than C. You also learned that if A is equal to B, and B is equal
to C, then A must be equal to C. While the first of these statements holds
true for the results that pop out of any set of pairwise comparisons,
the second statement may not.

Consider Table 3 in Excerpt 13.20. (It appears at the
top of page 366.) This table presents the means for four groups that were
involved in a post hoc investigation. Using Tukey's HSD procedure, each
of these four means was compared against each of the other three means.
Note carefully the outcome of the pairwise comparisons involving the first
three groups.

As you can see (based on the subscripts attached to
the means and the note beneath the table), the Hi-Hi group was not significantly
different from the Hi-Lo group. Also, the Hi-Lo group was not significantly
different from the Lo-Hi group. Despite those findings, the Hi-Hi group
was significantly different from the Lo-Hi group. This
set of findings, in the minds of many people who are learning about statistics,
seems at odds with common sense. Perhaps you too might say, "If the
1st group is not different from the 2nd group, and if the 2nd group is
not different from the 3rd group, how can the 1st group be different from
the 3rd group?"

The solution to this apparent paradox requires that
you do two things. First, remember that pairwise comparisons are focused
on population means, not sample means. Second, remember that it's wrong
to think that the null hypothesis is true simply because H0 has not been
rejected.
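
The second point can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the two population means, the standard deviation, the sample size, and the number of trials are all hypothetical values chosen for the demonstration (none of them come from Excerpt 13.20), and a simple two-sample z-test with a known standard deviation stands in for a real post hoc procedure. The two populations are built so that their means truly differ, yet the test fails to reject the null hypothesis in a large share of the samples.

```python
import random
from statistics import NormalDist, mean

random.seed(0)  # fixed seed so the run is repeatable

# HYPOTHETICAL populations (not from the excerpt): the means really differ by 2.
MU_A, MU_B = 52.0, 50.0
SIGMA = 10.0      # assumed known, common population standard deviation
N = 20            # assumed sample size per group
ALPHA = 0.05      # two-tailed significance level
TRIALS = 2000     # number of simulated studies

# Two-tailed critical z value and the standard error of a difference in means.
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)
std_error = SIGMA * (2 / N) ** 0.5


def fails_to_reject():
    """Draw one sample per population; return True if H0 is NOT rejected."""
    a = [random.gauss(MU_A, SIGMA) for _ in range(N)]
    b = [random.gauss(MU_B, SIGMA) for _ in range(N)]
    z = (mean(a) - mean(b)) / std_error
    return abs(z) <= z_crit


non_rejection_rate = sum(fails_to_reject() for _ in range(TRIALS)) / TRIALS
print(f"H0 was not rejected in {non_rejection_rate:.0%} of the simulated studies,")
print("even though the two population means are different by construction.")
```

In every one of these simulated studies the null hypothesis is false, so each failure to reject reflects limited sample evidence, not equal population means.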

In Excerpt 13.20, the fact that 83.98 and 79.20 were
found not to be significantly different should not cause us to think that
they (or their corresponding population means) are equal. All we can legitimately
say is that the sample evidence does not permit us to conclude that
μHi-Hi and μHi-Lo
are different. The same holds true for the result obtained when 79.20
was compared with 78.22. There, too, the sample evidence does not permit
us to conclude that μHi-Lo
and μLo-Hi are different.

Even though the differences between the 1st and 2nd groups
and between the 2nd and 3rd groups were "too close to call,"
the difference between the 1st and 3rd groups was sufficiently large to
suggest that μHi-Hi and μLo-Hi
differed.
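
The arithmetic behind this pattern can be sketched in a few lines of code. The three sample means are the ones quoted above. The critical difference of 5.0 is purely hypothetical, chosen only to reproduce the pattern of results described in the text; the actual value used by Tukey's HSD procedure in Excerpt 13.20 is not reported here. The function name is likewise just an illustrative label.

```python
from itertools import combinations

# Sample means quoted in the text for the first three groups.
means = {"Hi-Hi": 83.98, "Hi-Lo": 79.20, "Lo-Hi": 78.22}

# HYPOTHETICAL critical difference between two means. The real HSD value
# from the excerpt is not given in the text; 5.0 is for illustration only.
CRITICAL_DIFFERENCE = 5.0


def pairwise_decisions(group_means, critical_difference):
    """Map each pair of groups to True if their means differ by more than
    the critical difference (i.e., the pair is declared significant)."""
    decisions = {}
    for a, b in combinations(group_means, 2):
        diff = abs(group_means[a] - group_means[b])
        decisions[(a, b)] = diff > critical_difference
    return decisions


decisions = pairwise_decisions(means, CRITICAL_DIFFERENCE)
for (a, b), significant in decisions.items():
    diff = abs(means[a] - means[b])
    verdict = "significant" if significant else "not significant"
    print(f"{a} vs. {b}: difference = {diff:.2f} -> {verdict}")
```

The two adjacent differences (4.78 and 0.98) each fall short of the assumed cutoff, but the difference between the 1st and 3rd means (5.76) exceeds it. "Not significantly different" is therefore not a transitive relationship in the way that "equal to" is.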