The GMAT is a graduate entrance test used by more than 5,900 business programs at more than 2,100 universities worldwide. While the test is given in English, it is designed to be only as English-dependent as necessary to predict successful completion of business programs taught in English. Further, the test is carefully scrutinized for item bias. Rudner (2012) explains:

Yes, the GMAT test is administered in English and is designed for programs that teach in English. But the required English skill level is much less than what students will need in the classroom. The exam requires just enough English to allow us to adequately and comprehensively assess Verbal reasoning, Quantitative reasoning and Integrated Reasoning skills….

We carefully review our questions using criteria defining good item construction. We also compute statistics to assess whether our questions are appropriate across culture groups. We constantly update guidelines for our item writers, including a master list of terms and phrases to avoid in order to assure cultural fairness. By using carefully defined and thorough item development and review processes, along with statistical analyses to flag questions with possible cultural bias, we have developed a test that minimizes the impact of culture and language. The GMAT exam is the best objective measure of the likelihood of success in management programs across the globe.

Despite the claimed lack of bias and apparent predictive validity of the test, there is substantial global variance in scores. Rudner (2012) attributes this variance largely to differences in native language spoken and to differences in self-selection.

We decided to explore to what extent global differences could be accounted for by differences in National IQ. To do this, we examined the relations between measures of national cognitive ability, English language proficiency, English language usage, and GMAT scores by reported citizenship. We also sought to determine to what extent GMAT scores could be used to index the National IQs of poorly investigated regions such as North Korea, Rwanda, and St. Kitts.

Method

Using the 2001 to 2012 GMAT Profile reports, and basing our estimates on test takers’ reported citizenship, we compiled n-weighted average GMAT scores for 192 countries and territories. For countries and territories with fewer than 50 participants over that 12-year period, we supplemented the data with 1998 to 2001 scores. Regions that still failed to reach 50 participants (e.g., Central Africa, N=12 over 15 years) were excluded from the analysis. Variables:

National Cognitive Ability: We used L&V’s (2012) Measured National IQs (LV2012), L&V’s (2012) Estimated National IQs (LV2012EST), and Altinok et al.’s (2013) National Achievement Estimates (Altinok2013ACH). L&V’s (2012) National IQs were computed based both on the results of IQ studies published between 1929 and 2011 and on the results of international achievement tests conducted mostly between 1990 and 2007; L&V’s (2012) Estimated National IQs include estimates inferred using the Measured National IQs and taking into account national ethnoracial composition and global location; in both cases, we used the reported Final IQs. Altinok et al.’s (2013) National Achievement Estimates were calculated based on international achievement surveys given from 1964 to 2011; we used the reported mean primary and secondary scores.

English Language Proficiency (EPIAverage): We used Education First’s 2009-2011 English Proficiency Index; the 2009 and 2011 scores were averaged. The index is reported to be based on English learner proficiency tests, all of which included grammar, vocabulary, reading, and listening sections.

English Usage: We used three indexes of English usage. Official language (EngOfficial): we coded countries and territories 1 if the CIA World Factbook reported English as an official language and 0 otherwise. Crystal (2003) rates of English usage: we computed rates of English usage from the data reported by Crystal (2003, pages 62-65); CrystalENGL1 is the percent of primary (L1) English speakers, CrystalENGL2 is the percent of secondary (L2) English speakers, and CrystalL1L2 is the percent of primary and secondary speakers combined. Wikipedia English usage rates: we used the English usage rates reported by Wikipedia, which are based on a number of sources cited on the page; we excluded the P.R.C. (China) rates as they were implausibly low.
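The variable coding described above can be sketched as follows. This is a minimal illustration: the country names and values are placeholders, not the actual CIA World Factbook or Crystal (2003) figures.

```python
# Illustrative sketch of coding the English-usage variables.
# All values below are placeholders, not the actual source figures.

official_english = {"Jamaica": True, "France": False}  # CIA-style listing
crystal_l1 = {"Jamaica": 10.0, "France": 0.0}   # % primary (L1) speakers
crystal_l2 = {"Jamaica": 85.0, "France": 30.0}  # % secondary (L2) speakers

rows = []
for country in official_english:
    rows.append({
        "country": country,
        "EngOfficial": int(official_english[country]),  # 0/1 dummy
        "CrystalENGL1": crystal_l1[country],
        "CrystalENGL2": crystal_l2[country],
        # combined primary + secondary usage rate
        "CrystalL1L2": crystal_l1[country] + crystal_l2[country],
    })
```

The 0/1 dummy coding is what allows the official-language variable to enter the same correlation matrix as the continuous usage rates.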

R-matrix: Across 117 countries and territories, L&V’s (2012) Estimated National IQs correlated 0.91 with Altinok et al.’s (2013) National Achievement Estimates. The correlations between these two estimates of national cognitive ability and GMAT scores were, respectively, 0.74 (nations=173) and 0.74 (nations=125). EPI scores significantly correlated with both indexes of national cognitive ability and with GMAT scores. For the 56 countries that had all four measures, scores loaded on a common factor that explained 67% of the variance. English-as-an-official-language status was not positively correlated with any of the four cognitive measures, and none of the measures of English usage significantly correlated with GMAT scores.
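The two kinds of statistics reported above — a pairwise Pearson correlation and the share of variance explained by the first (common) factor of the four measures — can be sketched as below. The data here are randomly generated placeholders built to share a latent factor; they are not the actual national scores, and the printed figures will not match the 0.74 and 67% reported in the text.

```python
import numpy as np

# Placeholder data: 56 "countries" with four measures that share a
# latent factor (loosely mimicking the structure described above).
rng = np.random.default_rng(42)
g = rng.normal(size=56)                          # latent common factor
X = np.column_stack([g + rng.normal(scale=0.7, size=56)
                     for _ in range(4)])         # 4 correlated measures

# Pairwise Pearson correlation between two of the measures
r = np.corrcoef(X[:, 0], X[:, 1])[0, 1]

# Variance explained by the first principal component of the
# 4x4 correlation matrix (a common proxy for the first factor)
R = np.corrcoef(X, rowvar=False)
eigvals = np.linalg.eigvalsh(R)                  # ascending order
var_explained = eigvals[-1] / eigvals.sum()

print(f"r = {r:.2f}, first component explains {var_explained:.0%}")
```

Using the largest eigenvalue of the correlation matrix is the principal-components shortcut; a proper factor analysis would give slightly different loadings but the same qualitative picture.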

Validity check of L&V’s (2012) estimated values: GMAT scores were correlated with L&V’s (2012) estimated values alone, excluding the measured values. The correlation was highly significant at 0.745 (nations=30) and did not differ significantly from the correlation based on the measured values alone. This supports L&V’s (2012) estimation method.

Caribbean GMAT scores: Caribbean GMAT scores are shown below. The average score was 444, compared with a US African American score of 431, a White American score of 539, a sub-Saharan African score of around 440, and a European score of around 550. (United States citizens are relatively unselected.)
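The n-weighted averaging used to pool scores (per country, and for regional averages like the Caribbean figure above) can be sketched as follows. The (mean, n) pairs are illustrative placeholders, not the actual GMAT Profile figures.

```python
# Minimal sketch of n-weighted score averaging.

def n_weighted_mean(groups):
    """groups: iterable of (mean_score, n_test_takers) pairs."""
    total_n = sum(n for _, n in groups)
    if total_n == 0:
        return None  # no test takers, no estimate
    return sum(score * n for score, n in groups) / total_n

# e.g., pooling several hypothetical Caribbean countries
caribbean = [(440.0, 120), (450.0, 95), (445.0, 110)]
print(round(n_weighted_mean(caribbean), 1))  # -> 444.6
```

Weighting by n keeps a territory with a handful of test takers from pulling a regional average as hard as a country with thousands.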

Conclusion

We have shown that measures of national cognitive ability predict both GMAT scores and English proficiency scores independently of rates of English usage. GMAT and English proficiency scores thus appear to index national g, much as National IQ and National ACH do.

These results have threefold relevance. Firstly, national cognitive ability explains much of the hitherto unexplained variance in GMAT and English proficiency scores. Neither the Graduate Management Admission Council nor Education First appears to have realized this. The Graduate Management Admission Council has explained global variance in GMAT scores in terms of selectivity and native language usage; Education First has attributed English proficiency differences to differences in the strengths of national English curricula.

Secondly, these results provide further support for the position that National IQs/ACHs are predictive at the inter-individual level, not just the inter-national level. The predictive validity of GMAT scores is well established. The GMAT validity studies routinely report:

Results from the studies support the use of GMAT scores for predicting success in graduate business programs. The studies included support for using GMAT scores across different demographic groups including:
• Gender
• Race/ethnicity of US students
• Citizenship
• Native language

(From: Validity Study of Non-MBA Programs)

The lack of predictive bias with regard to citizenship groups implies that national IQs are predictive at the level of the individual, not just that of the nation. These results are in line with those of Jones (2008), who showed that National IQs predicted immigrant wages just as well as IQ tests predicted US native wages.

Thirdly, the GMAT scores provide National IQ estimates for poorly studied nations and territories, such as those in the Caribbean. Further, the results more or less corroborate Richard Lynn’s estimated National IQ values.

References

Altinok et al. (2013). A New International Database on Education Quality: 1965-2010.

13 Comments

The issue of test bias is a serious one. Given Schmitt and Dorans’s (1990) review, you see that most detected DIFs tend to be concentrated on language bias. In their series of studies, minorities selecting English “as their best language” do not respond differently from the ethnic-majority group. The amount of DIF is large when the subject’s best language is not English, which could well be the case for some groups of immigrants. In that case, the test’s content is not the cause of bias.

So what’s the point? Simply this: when you perform a multiple regression controlling for English usage, or for whether English is the official language, you partly control for DIF effects on these correlations and reduce the risk of non-comparability when different countries are involved. Non-comparability is an argument Wicherts has made, if I remember correctly. It would be best if you published this, maybe here, focusing on the argument I made, because I believe it is a valid defense of national IQs. That would help many researchers by encouraging them to add more discussion of their robustness checks.

Note North Korea. It is relevant to debates concerning both the validity of a global hereditarian viewpoint and the extent of “reverse causality” (IQ may affect the wealth of nations, but might the wealth of nations affect IQ?). If we can now put added faith in L&V’s estimates, it seems that North Korea’s average IQ is largely explained by genetic factors, AND there is not much reverse causality. North Korea may have had starvation and living standards comparable to West Africa’s, but its average IQ is much closer to South Korea’s than it is to Ghana’s.

Correction: After looking up national IQ estimates from Intelligence: A Unifying Construct for the Social Sciences, it looks like L&V inferred North Korean IQ exclusively from the score of South Koreans without taking into account Chinese scores at all.

As far as I can tell, Korean numeracy rates were determined exclusively from the records of one county named “Dansung”, which I believe was in the historical Gyeongsang district, which in modern times is part of South Korea. If Hojuk data were available for multiple regions, it might be possible to make a comparison. Unfortunately I am not well-acquainted with the nature of the Korean historical record, nor am I familiar with the Korean language for that matter.

Jason Malloy said: I do have intelligence test data for North Korea, but the (fascinating) reference is older and requires a laborious translation from Japanese.

Isn’t Richard Lynn fluent in Japanese? I haven’t read it, but I know he wrote a book about Educational Achievement in Japan, and I’d imagine any systematic treatment of the subject would require competency in the language. Since IQ research is his bread and butter, I’d guess he might be interested in translating it.

I have some Japanese students who might be interested in doing the translation. I may be able to get some funds to help compensate them. If you can e-mail me the study, they can give me an estimate as to how much labor it would take to get the translation done.