“The comparison of IQ scores of different nationalities is, at best, a hazardous enterprise, to be undertaken with caution and humility, and at worst, a nonsensical and mischievous waste of time” - N. J. Mackintosh.

From time to time writers on GNXP and elsewhere compare the IQ of populations in different countries, even those with widely different economic and cultural circumstances. As a notable example, Richard Lynn has produced a table setting out (alleged) comparative IQs for dozens of countries.

In comments on previous posts I have questioned these comparisons, but it may be useful to offer a more systematic critique for discussion. My main point is that differences in average IQ scores between developed and underdeveloped countries cannot be taken as reliably indicating any difference in innate capacity.

First, there are great technical difficulties in finding tests that are suitable for international comparisons. This should be obvious enough in the case of tests with a large verbal content, but is also true of non-verbal tests such as Raven’s Matrices. Such tests are highly artificial, and bear no resemblance to anything that the testees encounter in their daily life. In attempting to test illiterate or uneducated people (whether children or adults), it is often difficult to get them to see the point of the questions (Mackintosh, p.181-2, gives some amusing examples). Philip Vernon, who had huge experience of testing around the world, pointed out that sometimes ‘a sizeable proportion of the testees turn out to be ‘non-starters’... They are willing to try, and we have reason to believe that the test discriminates down to their level of ability, yet they just don’t get the hang of what they are supposed to do’ (Vernon, [1], p.101). Testees in this category either fail to answer the questions at all (in which case their zero scores will drag down the average), or they make random or near-random guesses. The latter is more dangerous, because in tests with multiple-choice format, and a scoring system that does not penalise wrong answers, even random guessing may produce an apparent IQ of 50 or 60. Jensen ([3], p.416) has pointed out that some African test results may be invalid for this reason. It is also well known that ‘unsophisticated’ testees improve their scores substantially with greater test-familiarity. Vernon ([2], p.24) concludes that ‘children (or adults) who are sophisticated or trained in tackling multiple-choice items, following instructions, and working at speed have on average about a 10-point advantage... over those who are unfamiliar with objective tests’.

However, even assuming that the technical difficulties can be overcome, my main point is that even a large difference (say, 20 points) in mean IQ between a developed country like the USA, and an underdeveloped country, cannot safely be taken as evidence of an innate, genetic difference.

Even the staunchest hereditarians do not deny that environment has some influence on IQ. Estimates of heritability (the proportion of total variance attributable to genetic factors) within modern western populations vary widely according to the methods used. In his recent survey Jensen ([3], p.446) concludes that reputable studies give estimates ‘that range mostly between .40 and .60 for children and adolescents, and between .60 and .80 for adults’. The evidence for higher heritability among adults is debatable (see Mackintosh, p.92-3), but for the sake of argument let us assume that after allowing for random error variance (correcting for attenuation) heritability is as high as .80, leaving environment to account for the remaining .20 of the variance. This still leaves room for a large influence of environment - larger than the figure of .20 may suggest to the unwary. If we conceptualise the quality of environment (QE) as a single continuous variable (which is admittedly somewhat artificial, but no more so than the IQ scale itself), the correlation between IQ and QE will be root-.20, or about .45. (See Jensen, [1], p.401, or [2], p.137.) This means that for every difference of one standard deviation in quality of environment, we will expect to find a difference of nearly 7 IQ points (assuming an IQ s.d. of 15 points), on purely environmental grounds. Thus, if we take a ‘very bad’ environment as 2 sigmas below the mean, and a ‘very good’ environment as 2 sigmas above the mean, then we will expect someone of average genetic quality, but a very bad environment, to have an IQ about 13 points below the population mean, and about 27 points below someone of equal genetic quality in a very good environment.
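The arithmetic in the paragraph above can be checked with a few lines of Python. This is only a sketch of the post's own simplified model (one continuous QE variable, environment explaining all non-genetic variance); the function name `expected_iq_shift` and its parameters are illustrative, not taken from Jensen or any other source.

```python
import math

def expected_iq_shift(heritability, qe_sigmas, iq_sd=15.0):
    """Expected IQ deviation from the mean due to environment alone.

    Assumes (as the post does) that environment accounts for the share
    of variance not attributed to genes, so corr(IQ, QE) = sqrt(1 - h2).
    """
    r_iq_qe = math.sqrt(1.0 - heritability)
    return r_iq_qe * qe_sigmas * iq_sd

per_sigma = expected_iq_shift(0.80, 1.0)    # one s.d. of QE
very_bad = expected_iq_shift(0.80, -2.0)    # QE 2 sigmas below the mean
very_good = expected_iq_shift(0.80, 2.0)    # QE 2 sigmas above the mean

print(round(per_sigma, 1))             # 6.7
print(round(very_bad, 1))              # -13.4
print(round(very_good - very_bad, 1))  # 26.8
```

Unrounded, the model gives 6.7 points per sigma of QE, 13.4 points below the mean for the very bad environment, and a 26.8-point gap between the two extremes.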

Of course, heritability estimates derived from one population cannot be automatically extended to different populations, or to a combination of more than one population. However, as Jensen has cogently argued, it is possible to draw plausible inferences from within-group to between-group heritability (e.g. Jensen [2] p.133-148). It seems reasonable to suppose that whatever elements enter into ‘quality of environment’, the QE in underdeveloped countries (e.g. in Africa, or among the illiterate rural populations of many countries in Asia and Latin America) is usually at least 2 sigmas below the mean of modern western countries. Certainly this would be the case if we take such indicators as income per head, life expectancy, infant mortality, nutrition, parasite infection, or parental education and literacy. As Jensen himself has said ([3], p.460), ‘on a scale of environmental quality with respect to mental development, these adverse environmental conditions [in third-world countries] probably fall more than 2 sigma below the average environment experienced by the majority of whites and very many blacks in America’. On this basis, we would expect average IQ in a third-world country to be at least 13 points below the western average, even if the genetic endowment is equal. And of course this is assuming a high heritability estimate of .80. If we take a more modest estimate of .60 (which is more appropriate for children or adolescents, who are the usual test subjects in these comparisons), then the correlation between IQ and QE would be root-.40, or approximately .63. This would give an expected IQ difference of about 19 points for 2 sigmas difference in QE, which is close to the difference actually observed between western and most third-world populations.
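To see how sensitive the expected gap is to the heritability assumed, the same calculation can be run for both figures used above. Again this is a rough sketch of the post's simplified model, not an established method; `environmental_gap` and the constants are hypothetical names chosen for illustration.

```python
import math

IQ_SD = 15.0          # conventional IQ standard deviation
QE_GAP_SIGMAS = 2.0   # assumed environmental gap, in s.d. units

def environmental_gap(heritability, qe_gap_sigmas=QE_GAP_SIGMAS, iq_sd=IQ_SD):
    """Expected IQ gap from environment alone, with corr(IQ, QE) = sqrt(1 - h2)."""
    r_iq_qe = math.sqrt(1.0 - heritability)
    return r_iq_qe * qe_gap_sigmas * iq_sd

gaps = {h: environmental_gap(h) for h in (0.80, 0.60)}
for h, gap in gaps.items():
    print(f"h2={h:.2f}: r={math.sqrt(1 - h):.2f}, expected gap={gap:.1f} IQ points")
```

With a heritability of .80 the expected gap is about 13.4 points; with .60 it rises to about 19.0 points.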

If this seems a very tortuous argument, there is a simpler and more direct way of appreciating the possible effect of environment. It is well-known that average IQ in most western countries has been increasing at a rate of at least 3 points per decade for at least 5 decades (the so-called Flynn Effect). The cumulative increase since the 1930s is probably around 20 IQ points. It is also generally agreed that the causes underlying the increase are mainly environmental (I’m aware that increased heterosis may also have contributed, but this can’t have been a major factor). It also seems reasonable to suppose that environmental conditions in the US or Europe in the 1930s were already at least as good as in many third-world countries today. It follows that a difference of about 20 points can be accounted for by environmental circumstances.
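The cumulative figure is simple multiplication, sketched below. The rate of 3 points per decade and the minimum of 5 decades come from the paragraph above; the figure of 7 decades for "since the 1930s" is my own assumption about the time span, and the function name is illustrative.

```python
def cumulative_gain(points_per_decade, decades):
    """Total IQ gain if the average rises at a steady rate per decade."""
    return points_per_decade * decades

minimum_gain = cumulative_gain(3, 5)  # the "at least 5 decades" floor
since_1930s = cumulative_gain(3, 7)   # assumes roughly 7 decades have elapsed

print(minimum_gain)  # 15
print(since_1930s)   # 21
```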

Of course, none of this implies that differences in IQ (if they are genuine indicators of relative intelligence, and not just the results of inappropriate tests), are unimportant. If people are stupid, this has implications for politics, economics, and society, regardless of why they are stupid. But if the causes are primarily environmental, this does at least raise the hope that economic growth and wider education will improve the situation.

I have little to argue with here. I am not aware of a third-world ethnic group that does not go up in IQ when assimilated into the West.

Testing between nations carries all of the same potential problems as testing different ethnicities within a single nation. This also means that even extremely low scores can be valid, reliable, and generalizable without necessarily making a statement about genetic limits (see the second comment; my view on the Flynn Effect has changed since then).

The value of data like Lynn's and other researchers' depends on the construct validity of IQ testing in a given population. Unlike IQ data on ethnic groups in the U.S., a lot of this third-world data, as you point out, just doesn't have high certainty about its construct validity.

I haven't read IQ and the Wealth of Nations yet, but I would be interested in seeing how the authors deal with points such as this.

It seems to me we have one really big control on this question of environment: large numbers of people have emigrated from Third World to First World countries. Their group IQs in the First World in successive generations ought to be a useful source of info to use to get at environmental influences.

Yes, this is a good point. As Jason mentioned, IQ scores of immigrants from 3rd-world countries tend to go up - see also my post on 'Dutch Treat'. But the scores don't always reach the average level of the 'host' population, so the argument is then about whether this reflects a genetic deficit or an environmental disadvantage.

Steve: I'm not a psychometrician, but looking at the handbook for the Wechsler Adult Intelligence Scale - one of the most widely used - it has 11 components: Information, Comprehension, Arithmetic, Similarities, Digit Span, Vocabulary, Digit Symbol, Picture Completion, Block Design, Picture Arrangement, and Object Assembly. But these don't necessarily identify fundamentally different types of ability, and psychometricians use factor analysis to find more basic sources of performance, such as speed of information processing, verbal fluency, spatial perception, or short-term memory. There are umpteen different schools of factor analysis - Mackintosh's book has a good summary.

I think David is onto something important with his point about how it can be hard to get 3rd Worlders to take IQ tests (or more generally, abstract thinking) seriously. Thomas Sowell made the same point in his 1994 book Race and Culture. Still, Sowell goes on to say that if some 17-year-olds don't understand the point of thinking abstractly by that age, they probably aren't going to learn.

In general, it's very hard to explain the one standard deviation gap between African-Americans and Africans on a purely genetic basis (since African-Americans are only 17-18% white, although enforced heterosis might help).

Still, the sheer consistency of the data in Lynn and Vanhanen's book is a huge stumbling block for those trying to dismiss it. I calculated an r of .93 for all the occurrences of multiple tests from the same country. These are typically taken a decade or more apart, with different samples, given by different investigators, and often with different IQ tests. Occasionally, the scores diverge significantly (Poland's two reported scores are 92 and 106), but typically they are remarkably consistent. They are also relatively consistent across national boundaries but within regions.

Also, differences in IQ scores among immigrants vs. stay-at-homes generally have more to do with how immigrants are selected -- e.g., Indian immigrants score much higher than Indians, but then there is an obvious brain drain going on from India to America. In contrast, Mexican-American IQs seem to be quite similar to Mexican national IQs. Most Mexican immigrants come from the working and peasant classes, while the well-educated stay home. So, I'm not aware of immigration providing strong evidence for a Flynn Effect in operation.

Steve: the 'sheer consistency' of L&V's data might just mean they have selected the data that fits! Lynn is notorious for picking and choosing - see the debate a few years ago in J. Biosocial Science over his views on sex differences in IQ. (I'm not saying he is worse than people like Kamin, but I don't know that he is better.)

This is a classic blog entry. I myself question exactly how accurately one may draw inferences about a poor nation's IQ -- there are so many environmental factors which may muddle the degree to which a measured IQ is "innate" to a nation. This is a topic worthy of more discussion, clearly.

the 'sheer consistency' of L&V's data might just mean they have selected the data that fits!

Hmm. I'm not sure that they exclude data. I accept (and have argued) that there is a consistency to this data which probably commands attention.*

I think rejecting it wholesale is a mistake. Instead, my confidence will grow as more research is done to establish the construct validity of some of this third world data (basically what Jensen did in 'Bias in Mental Testing' to show that minorities could reliably be tested in the US).

there are so many environmental factors which may muddle the degree to which a measured IQ is "innate" to a nation.

Brahmin, I'm not sure Lynn and Vanhanen were arguing this exactly (and innateness is inferred from lines of converging data, not just a score). I think that they said scores had promise of improving in many nations if greater concern was applied to pre-natal conditions.

All in all, I think what Dave is saying is important, but I think we will view this kind of work differently because we have different opinions on the probability of racial trait variations. Once you accept the premises I do, work like Lynn's seems much more vital, explanatory, and important, even despite its relative crudeness. I have trouble believing that Lynn, who at least takes Darwinism seriously, can be compared to Kamin, who is still, beyond all reason, arguing that the heritability of intelligence is 0%.

*not only of the kind Steve mentions, but in how many of the same populations score in first and third world conditions.

I confess I haven't read L&V's book - it is not published in the UK, no library that I use has a copy, and I am not going to spend about $70 getting the US edition from Amazon. So I can't say whether or not the data they use in the book is 'selective' - just that Lynn, rightly or wrongly, has been accused of selectivity in some of his earlier work.

Also I should make it clear that I don't deny that there are IQ differences between nations, and I don't deny that they could be important (see the last para of my post). It is also quite possible (and on a priori grounds quite likely) that there are some genetic differences in IQ between populations that have evolved in partial isolation from each other and under different selective conditions. There are genetic differences in everything else, so why not IQ?

All I say is that:

(a) it is technically difficult to compare IQ in different countries, so the figures may not be robust, and

(b) even on a strongly hereditarian view of IQ, such as Jensen's, the level of environmental differences between developed and 3rd-world countries would be expected to produce a substantial IQ deficit. I don't think anyone has yet challenged my argument on this point.