Rising Scores on Intelligence Tests

The Flynn Effect

IQ tests do not remain fixed forever. Most of the major ones have been updated from time to time. The 1949 WISC, for example, was superseded by the WISC-R in 1974 and by the WISC-III in 1991. The revised versions are standardized on new samples and scored with respect to those samples alone, so the only way to compare the difficulty of two versions of a test is to conduct a separate study in which the same subjects take both versions. Many such studies have been carried out, and James Flynn, a political scientist at the University of Otago in New Zealand, summarized their results in 1984. In virtually every instance, the subjects achieved higher scores on the older version of a test. For example, in David Wechsler's own study of his revised adult test—the Wechsler Adult Intelligence Scale-Revised (WAIS-R)—a group of subjects who averaged 103.8 on the new WAIS-R had a mean of 111.3 on the older WAIS. This implies that the actual IQ-test performance of adults rose by 7.5 points between 1953 (when the old WAIS was standardized) and 1978 (when the WAIS-R was standardized), which is a rate of about 0.3 IQ points per year.
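The arithmetic behind that rate can be checked in a few lines. The following is a minimal sketch using only the figures quoted above; the variable names and the helper `annual_gain` are illustrative assumptions, not part of Flynn's or Wechsler's analyses.

```python
# Figures quoted in the text for Wechsler's WAIS / WAIS-R comparison study.
old_test_mean = 111.3   # mean of the subjects on the older WAIS
new_test_mean = 103.8   # mean of the same subjects on the WAIS-R
old_norm_year = 1953    # year the old WAIS was standardized
new_norm_year = 1978    # year the WAIS-R was standardized


def annual_gain(old_mean, new_mean, old_year, new_year):
    """Implied gain in IQ points per year between two standardizations.

    The same subjects score higher on the older test, so the difference
    (old mean minus new mean) estimates how far the reference population
    moved between the two standardization dates.
    """
    return (old_mean - new_mean) / (new_year - old_year)


gain = old_test_mean - new_test_mean
rate = annual_gain(old_test_mean, new_test_mean, old_norm_year, new_norm_year)
print(f"gain: {gain:.1f} points, rate: {rate:.2f} points per year")
# prints "gain: 7.5 points, rate: 0.30 points per year"
```

The calculation treats the gain as evenly spread over the 25 years, which is the same simplification the per-year figure in the text makes.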

These gains are not limited to the WAIS, to adults or to the United States. In an influential series of papers, Flynn showed that the increasing raw scores appear on every major test, in every age range and in every modern industrialized country. (The rise itself is now often called "the Flynn effect.") The increase has been continuous and roughly linear from the earliest days of testing to the present. On broad-spectrum tests such as the WISC and the WAIS, Americans have gained about 3 IQ points per decade, or 15 points over a 50-year period. It is interesting to compare this total with the much-discussed gap between the mean test scores of Caucasian and African Americans, which is also about 15 points (one standard deviation of the IQ distribution). Given that the IQ of the population as a whole has increased by a similar amount just since the 1940s, that gap does not seem so large.

The pattern of score increases for different types of tests is somewhat surprising. Because children attend school longer now and have become much more familiar with the testing of school-related material, one might expect the greatest gains to occur on such content-related tests as vocabulary, arithmetic or general information. Just the opposite is the case: "Crystallized" abilities such as these have experienced relatively small gains and even occasional declines over the years. The largest Flynn effects appear instead on highly g-loaded tests such as Raven's Progressive Matrices. This test is very popular in Europe; the Dutch data mentioned earlier came from a 40-item version of Raven's test. Using the 1952 mean to define a base of 100, Flynn has calculated average Dutch Raven IQs for subsequent years. The mean in 1982 was 121.10, a gain of 21 points in only 30 years, or about seven points per decade. Data from a dozen other countries show similar trends, which seem to be continuing into the 1990s. Whatever g may be, scores on tests that measure it best are going up at more than twice the rate of broad-spectrum tests like the WISC and WAIS (about seven points per decade versus about three), while the tests most closely linked to school content show the smallest gains of all.
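The comparison between the Dutch Raven gains and the broad-spectrum rate can be reproduced the same way. This is a rough sketch built only on the numbers quoted in the text; the function name `gain_per_decade` and the variable names are assumptions made for illustration, and the broad-spectrum figure is the approximate American rate of 3 points per decade cited earlier.

```python
def gain_per_decade(base_mean, later_mean, base_year, later_year):
    """Average gain in IQ points per decade between two measurement years."""
    return (later_mean - base_mean) / (later_year - base_year) * 10


# Dutch Raven data as quoted: the 1952 mean defines 100, the 1982 mean is 121.10.
raven_rate = gain_per_decade(100.0, 121.10, 1952, 1982)

# Approximate American rate on broad-spectrum tests (WISC, WAIS) from the text.
broad_spectrum_rate = 3.0

ratio = raven_rate / broad_spectrum_rate
print(f"Raven: {raven_rate:.1f} points per decade")
print(f"Ratio to broad-spectrum rate: {ratio:.1f}")
# prints "Raven: 7.0 points per decade"
# prints "Ratio to broad-spectrum rate: 2.3"
```

The two rates come from different countries and different tests, so the ratio is only a rough indication of how much faster scores on the g-loaded test have been rising.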