Krashen and Mason (2017) have an important new study summarizing the results of three separate meta-analyses on sustained silent reading (SSR) and extensive reading (ER) for second language acquirers. The news is good: all three analyses found very strong effects on reading comprehension and vocabulary for SSR and ER programs. Get the study for free here (PDF) while supplies last.

Krashen and Mason provide a handy table with the effect sizes from the three meta-analyses (studies that summarize the results of several experiments). Using effect sizes allows us to compare the results of different studies more easily. They can be computed in different ways, but are most often reported as the number of standard deviations that separate two groups at the end of the experiment, or, for studies with no control group, the difference between the pretest and post-test scores.
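The most common version of this statistic is Cohen's d, the difference in group means divided by a pooled standard deviation. A minimal sketch, with made-up score lists purely for illustration:

```python
# Sketch: computing an effect size (Cohen's d) from the post-test scores
# of a treatment group and a control group. The score lists are invented
# for illustration only.
from statistics import mean, stdev

def cohens_d(treatment, control):
    """Difference in means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    pooled_sd = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (mean(treatment) - mean(control)) / pooled_sd

# Treatment outscores control by half a pooled standard deviation:
print(cohens_d([2, 4, 6], [1, 3, 5]))  # 0.5
```

For studies with no control group, the same formula is applied to the pretest and post-test scores of a single group.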

What is a “Large” Effect?

To put things in more concrete terms, an effect size of .10 would move a student who began an experiment at the 50th percentile up to the 54th percentile, an increase of 4 percentile points. Most teachers would (rightly) not be too impressed by this. An effect size of .50, however, would move that same student up to the 69th percentile, a 19 percentile point gain.
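The percentile figures above come from the standard normal distribution: shift a student's z-score by the effect size and convert back to a percentile. A quick sketch (assuming normally distributed scores, as the rule of thumb does):

```python
# Sketch: converting an effect size (in standard deviations) into a
# percentile gain for a student starting at the 50th percentile,
# assuming normally distributed scores.
from statistics import NormalDist

def percentile_after(effect_size: float, start_percentile: float = 50.0) -> float:
    """Percentile reached after a shift of `effect_size` standard deviations."""
    z = NormalDist().inv_cdf(start_percentile / 100)   # starting z-score
    return NormalDist().cdf(z + effect_size) * 100     # shifted, as a percentile

print(round(percentile_after(0.10)))  # "small" effect:  54th percentile
print(round(percentile_after(0.50)))  # "medium" effect: 69th percentile
print(round(percentile_after(0.80)))  # "large" effect:  79th percentile
```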

Effect sizes are generally considered small when they are .20 or less, medium when they’re around .50, and large if they are .80 or more. Although this is a rather rough guide, when you examine a lot of meta-analyses across the behavioral sciences (psychology, education, etc.), the distribution of effect sizes is very close to this rule of thumb (Lipsey & Wilson, 1993).

Some researchers (e.g. Bloom, Hill, Black, and Lipsey, 2008) argue, however, that you can only really compare “apples to apples,” so that effect sizes for reading studies should only be compared to other reading studies when determining whether an effect size is “large” or not.

Comparing SSR/ER to Other “Reading Interventions”

If we follow Bloom et al.’s suggestion, we find that silent reading compares very favorably with other reading “interventions.”

Swanson et al. included both first- and second-language studies (37 publications in total), and limited their review to studies published between 2000 and 2015. The meta-analyses summarized by Krashen and Mason covered a range of students, from elementary school to college, and only second-language acquirers (mostly of English); a fair chunk of those studies were in K-12 settings. Krashen's (2007) meta-analysis included only studies of "older readers and young adults." About 40% of the comparisons used by Jeon and Day (2016) and 35% of Nakanishi's (2015) studies included children and adolescents (there was some overlap in studies across the three meta-analyses).

Swanson and her colleagues did not give detailed descriptions of what the teachers did in the studies they examined. Judging from the titles of the studies, it isn't clear whether they included any SSR/ER comparisons (I'd guess "no").

Somewhat strangely, Swanson et al. reported only the total effect size for each study/comparison, adding together scores from the comprehension, vocabulary, and fluency tests. They reported their results by intervention type (“reading comprehension” and “reading comprehension plus vocabulary”).

And the Winner Is . . . Silent Reading

In Table 1, I report the results of the three meta-analyses discussed by Krashen and Mason, as well as the one by Swanson et al. Listed are the effect sizes for reading comprehension, vocabulary, and all the "reading" measures combined (which may include things like "fluency" or reading rate). I've included both Swanson et al.'s standardized measures and "unstandardized" ones (tests created by the researchers for that particular study). The impact of an experiment is usually greater on tests created by the researchers than on tests normed on a larger population (such as the Gates-MacGinitie or state achievement tests).

Note: Vocabulary and reading comprehension are for all age groups. ^=Includes only 3 comparisons

Overall, the effect sizes found for the SSR/ER studies are much larger than for those analyzed by Swanson et al. Effect sizes for reading comprehension in particular are up to three times larger for SSR/ER than for other treatments.

Jeon and Day reported separate effect sizes by age, and found them to be as good as or better than those for Swanson et al.'s interventions: 0.52 for children (6 comparisons) and 0.35 for adolescents (15 comparisons), counting control-group studies only. Krashen's meta-analysis included only high school and college students, and found large effects on reading comprehension. Nakanishi reported mixed results (.05 for junior high, .57 for high school), but he had only three comparisons each for his control-group studies, so his estimates are less reliable than the results reported by Krashen and by Jeon and Day.

Tie Goes to the Reader

As Krashen (2004) pointed out, a “tie” between SSR/ER and other interventions is actually a win for silent reading: SSR/ER is much easier for the teacher, and far more enjoyable for the student, than any of the alternatives on offer.

The results of the SSR/ER studies should at least give schools pause before putting money into some new "reading intervention" program. It seems that just giving kids books and time to read them is the best "Tier 1"* intervention of all.

*But wait – there are even more “tiers” to confuse you! A few years ago some vocabulary researchers decided to call technical vocabulary (words used primarily in one particular field, such as protoplasm in biology) “Tier 3” words. And what we used to call sub-technical vocabulary (words used in many different scientific or academic fields, such as generation and hypothesis) was renamed “Tier 2” vocabulary. General vocabulary then became “Tier 1” words.