Reading Tests That Don’t Actually Measure Reading

To make them easier to find, several items originally included in the now-defunct ‘This Week in Language Teaching’ series will be reposted over the next few weeks as separate entries.

Hua and Keenan (2017, paywall), following up on early work by Keenan and her colleagues, examined five popular reading comprehension tests to see how much of the variance in scores could be explained by a reader’s listening comprehension, and how much by his or her ability to read individual words in isolation (word recognition).

Hua and Keenan tested a large group of students (N = 834) ages 8 to 18. They used everyone’s current favorite for statistical analysis, quantile regression, to see how readers of different abilities fared. I report here the results for the “average” student:

There’s a clear difference between tests #1 and #2 on the one hand and tests #3-5 on the other: the PIAT and Woodcock-Johnson Passage Comprehension (WJPC) are mostly word recognition tests rather than measures of comprehension.

Why the difference? In the PIAT and WJPC, readers read single sentences and/or complete a “cloze” (fill in the missing word). There is little context to aid the reader, nor is there a very high level of complex comprehension required. Tests #3, 4, and 5, however, involve reading longer passages and answering questions or retelling the events of the story.

In other words, Tests 3-5 measure more of what most people (teachers, the general public) consider “reading comprehension” – understanding what you read. Word recognition, on the other hand, is often used as a proxy for decoding skills (the ability to convert letters into sounds), and, not surprisingly, is also much more easily influenced by phonics instruction.

Why is this important for us to know? At least three reasons:

The authors of the National Reading Panel’s Report (among others) have claimed that phonics instruction improves “reading comprehension” for kindergartners and first-graders. But, as Elaine Garan rightly pointed out, the NRP conclusions relied on studies that used tests like the WJPC, so the evidence actually indicates only that phonics helps word recognition. You can improve word recognition without improving reading comprehension, and indeed both experimental and observational longitudinal studies of the effects of early phonics instruction show no improvements to actual reading comprehension at all in later grades (Sonnenschein et al., 2010; Torgesen et al., 1999; Torgesen et al., 2011).

Schools often rely on the WJPC and similar tests in “diagnosing” dyslexia. Children with perfectly normal reading comprehension may thus be wrongly labeled as dyslexic.

Researchers examining the supposed genetic and neurological basis of dyslexia and reading disabilities also use tests like #1 and #2, and thus may (again) be getting results that have nothing to do with reading comprehension. In fact, Keenan and her colleagues showed precisely that (Keenan et al., 2006): depending on which reading test you used, you could identify a completely different set of genes as being related to reading.

The researchers speculate that their results for the QRI-Questions measure are probably due to ceiling effects among good readers – those readers all clustered at the top, so there was little variance among them. This doesn’t alter the larger point we’re making here, which is that listening comprehension is much more important than word recognition in predicting (real) reading comprehension scores.