What intelligence tests miss

Keith E. Stanovich and Richard F. West on why we need rationality quotient (RQ) tests as well as IQ tests

It is a profound historical irony of the behavioural sciences that the Nobel Prize was awarded for studies of cognitive characteristics (rational thinking skills) that are entirely missing from the best-known mental assessment device in the behavioural sciences – the intelligence test. Intelligence tests measure important things, but not these – they do not assess the extent of rational thought. This might not be such a serious omission if intelligence were an exceptionally strong predictor of rational thinking. However, research has found that it is a moderate predictor at best and that some rational thinking skills can be quite dissociated from intelligence.

In psychology and among the lay public alike, high scores on intelligence tests are considered a mark of good thinking. This is not unreasonable. It is now over 100 years since Spearman first reported a single general intelligence factor, known as g or ‘the positive manifold’ – the tendency for scores on different cognitive tests to correlate. Indeed, it is rare that a cognitive process or phenomenon is found to be independent of g (Carroll, 1993), so it is reasonable to assume that the construct of general intelligence encompasses most of cognition.

It is revealing that when critics of IQ tests try to argue that such tests fail to assess many essential domains of psychological functioning, they often point to non-cognitive domains, including: socio-emotional abilities, motivation, empathy, morality and interpersonal skills. By targeting these non-cognitive domains, these critics indirectly bolster the assumption that IQ tests exhaustively encompass the cognitive domain.

IQ ≠ RQ

Our research group has challenged IQ tests much more fundamentally than the average critic. Our argument is that intelligence, as conventionally measured, leaves out critical cognitive domains – domains of thinking itself.

We were led to this conclusion through our long-standing interest in the heuristics and biases research programme inaugurated by Kahneman and Tversky several decades ago (Kahneman & Tversky, 1972, 1973; Tversky & Kahneman, 1974). In 2002 Kahneman won the Nobel Prize in Economics (Tversky died in 1996). A press release from the Royal Swedish Academy of Sciences drew attention to the roots of the award-winning work in ‘the analysis of human judgment and decision-making by cognitive psychologists’. Kahneman was cited for discovering ‘how human judgment may take heuristic shortcuts that systematically depart from basic principles of probability. His work has inspired a new generation of researchers in economics and finance to enrich economic theory using insights from cognitive psychology into intrinsic human motivation.’

One reason that the Kahneman and Tversky work was so influential was that it addressed deep issues concerning human rationality. As the Nobel announcement noted, ‘Kahneman and Tversky discovered how judgment under uncertainty systematically departs from the kind of rationality postulated in traditional economic theory’. The thinking errors uncovered by Kahneman and Tversky are thus not trivial errors in a parlour game. Being rational means acting to achieve one’s own life goals using the best means possible. To violate the thinking rules examined by Kahneman and Tversky has the practical consequence that we end up less satisfied with our lives.

The work of Kahneman and Tversky, along with that of many other investigators, has shown how the basic architecture of human cognition makes all of us prone to these errors of judgement and decision making. But being prone to these errors does not mean that we always make them. Every person, on some occasions, overrides the tendency to make these reasoning errors and instead makes the rational response. More importantly, our research group has shown that there are systematic differences among individuals in the tendency to make errors of judgement and decision making.

The fact that there are systematic individual differences in the judgement and decision-making situations studied by Kahneman and Tversky means that there are variations in important attributes of human cognition related to rationality – how efficient we are in achieving our goals. It is curious that none of these critical attributes of human thinking are assessed on IQ tests (or their proxies such as academic ability tests) given that the type of ‘good thinking’ studied by Kahneman and Tversky was deemed worthy of a Nobel Prize. This anomaly is all the stranger given that most laypeople and scientists are prone to think that IQ tests are tests of ‘good thinking’.

What is rationality?

To think rationally means taking the appropriate action given one’s goals and beliefs (instrumental rationality), and holding beliefs that are commensurate with available evidence (epistemic rationality). Collectively, the many tasks of the heuristics and biases programme – and the even wider literature in decision science – comprise the operational definition of rationality in modern cognitive science (Stanovich, 2011). Psychologists have extensively studied aspects of instrumental rationality and irrationality and epistemic rationality and irrationality (see box for examples).

In short, we have an extensive and rich set of operationalisations for the concept of rationality in modern cognitive science. None of these operational measures are assessed on common IQ tests. Yet people (including scientists) often talk as if they were. For example, many conceptions of intelligence define it as involving adaptive decision making. Adaptive decision making is the quintessence of rationality, but the items used to assess intelligence on widely accepted tests bear no resemblance to measures of decision making.

However, there is an important caveat here. Although the tests fail to assess rational thinking directly, it could be argued that the processes that are tapped by IQ tests largely overlap with variation in rational thinking ability. Perhaps intelligence is highly associated with rationality even though tasks tapping the latter are not assessed directly on the tests. Here is where empirical research comes in, some of which has been generated by our own research group. We have found that many rational thinking tasks show surprising degrees of dissociation from cognitive ability in university samples. Myside bias, for example, is virtually independent of intelligence (Stanovich et al., 2013). Individuals with higher IQs in a university sample are no less likely to process information from an egocentric perspective than are individuals with relatively lower IQs. Many classic effects from the heuristics and biases literature – base-rate neglect, framing effects, conjunction effects, anchoring biases, and outcome bias – are also quite independent of intelligence if run in between-subjects designs (Stanovich & West, 2008). Correlations with intelligence have been found (e.g. Bruine de Bruin et al., 2007; Stanovich, 2009; Stanovich & West, 1998, 2000) to be roughly (in absolute magnitude) in the range of .20 to .35 for probabilistic reasoning tasks and scientific reasoning tasks measuring a variety of rational principles (covariation detection, hypothesis testing, confirmation bias, disjunctive reasoning, denominator neglect, and Bayesian reasoning). In fact, even after corrections for reliability and range restriction, this is a magnitude of correlation that allows for substantial discrepancies between intelligence and rationality. Intelligence is thus no inoculation against many of the sources of irrational thought.
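To make the point about corrected correlations concrete, here is a minimal sketch of Spearman's standard correction for attenuation, applied to the two ends of the reported range (.20 and .35). The reliability values are hypothetical placeholders chosen for illustration, not figures from the studies cited, and the sketch does not model range restriction.

```python
import math


def disattenuate(r_observed, reliability_x, reliability_y):
    """Spearman's correction for attenuation: estimate the correlation
    between two constructs from the correlation between imperfectly
    reliable measures of them."""
    return r_observed / math.sqrt(reliability_x * reliability_y)


# Hypothetical reliabilities (assumptions for illustration only).
IQ_RELIABILITY = 0.90
RATIONALITY_TASK_RELIABILITY = 0.70

for r in (0.20, 0.35):
    corrected = disattenuate(r, IQ_RELIABILITY, RATIONALITY_TASK_RELIABILITY)
    shared_variance = corrected ** 2  # proportion of variance in common
    print(f"observed r = {r:.2f}, corrected r = {corrected:.2f}, "
          f"shared variance = {shared_variance:.0%}")
```

Under these assumed reliabilities, even the upper end of the range stays below .45 after correction, which leaves less than a fifth of the variance shared between the two measures.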

Developing a rationality test

If we want to assess differences in rational thinking, we need to assess the components of rational thought directly, with an RQ (rationality quotient) test. Practically, in terms of the cognitive technology now in place, this is doable. There is nothing conceptually or theoretically preventing us from developing such a test. We know the types of thinking processes that would be assessed by such an instrument, and we have in hand prototypes of the kinds of tasks that would be used in the domains of both instrumental rationality and epistemic rationality. Thus, there are no major roadblocks preventing the development of an RQ test. Indeed, this is what our research lab is doing with the help of a three-year grant from the John Templeton Foundation. Specifically, we are attempting to construct the first prototype of an assessment instrument that will comprehensively measure individual differences in rational thought.

Rational thought can be partitioned into fluid and crystallised components by analogy to the fluid and crystallised forms of intelligence described by the Cattell/Horn/Carroll theory of intelligence (Carroll, 1993; see Table 1, page 82, PDF version). Fluid rationality encompasses the process part of rational thought – the thinking dispositions of the reflective mind that lead to rational thought and action. Crystallised rationality encompasses all of the knowledge structures that relate to rational thought.

Unlike the case of fluid intelligence, fluid rationality is likely to be multifarious – composed of a variety of different cognitive styles and dispositions. As a multifarious concept, fluid rationality cannot be assessed with a single type of item in the manner that the homogeneous Raven Progressive Matrices, for example, provides a good measure of fluid intelligence.

The concept of crystallised rationality has two subdivisions, as shown in Table 1. Knowledge structures that promote rational thought are termed crystallised facilitators. Knowledge structures that impede rational thought are termed crystallised inhibitors. Each of these subcategories of crystallised rationality is, like fluid rationality, multifarious. Without learning crystallised facilitators, people will lack declarative knowledge that is necessary in order to act rationally. However, not all crystallised knowledge is helpful – either to attaining our goals (instrumental rationality) or to having accurate beliefs (epistemic rationality). Hence the category of crystallised inhibitors (e.g. astrology) in the table.

Table 1 should not be mistaken for the kind of ‘good thinking styles’ lists that appear in textbooks on critical thinking. In terms of providing a basis for a system of rational thinking assessment, it goes considerably beyond such lists in a number of ways. First, many textbook attempts at these lists deal only with aspects of fluid rationality and give short shrift to the crystallised knowledge bases that are necessary supports for rational thought and action. In contrast, our framework for rationality assessment emphasises that crystallised knowledge underlies much rational responding (crystallised facilitators) and that crystallised knowledge can also be the direct cause of irrational behaviour (crystallised inhibitors).

More importantly, the conceptual components of the fluid characteristics and crystallised knowledge bases listed in Table 1 are each grounded in a task or paradigm from cognitive science. That is, they are not just potentially measurable, but in fact have been operationalised and measured at least once in the scientific literature – and in many cases (e.g. context effects in decision making; probabilistic reasoning) they have been studied extensively.

Most of the paradigms that will be used in our assessment device are therefore well known to most cognitive psychologists. For example, there are many paradigms that have been used to measure the resistance to ‘miserly information processing’, the first major dimension of fluid rationality in Table 1. In the Cognitive Reflection Test, designed by Shane Frederick (2005), for example, the most famous item reads: A bat and a ball cost £1.10 in total. The bat costs £1 more than the ball. How much does the ball cost? When they answer this problem, many people give the first response that comes to mind – 10 pence – without thinking further and realising that this cannot be right. If the ball cost 10 pence, the bat would have to cost £1.10 and the total cost would then be £1.20 rather than the required £1.10. People often do not think deeply enough to realise their error, and cognitive ability (as measured by IQ) is no guarantee against making the error. Frederick (2005) found that large numbers of highly select university students at MIT, Princeton and Harvard were cognitive misers – they responded that the cost was 10 pence, rather than the correct answer: 5 pence.
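The arithmetic behind the bat-and-ball item can be checked in a few lines. The sketch below is only an illustration: the function name `ball_price` is ours, and exact fractions are used so that pounds and pence do not pick up floating-point rounding.

```python
from fractions import Fraction


def ball_price(total, difference):
    """Solve ball + (ball + difference) = total for the ball's price."""
    return (total - difference) / 2


total = Fraction(110, 100)   # £1.10 for bat and ball together
difference = Fraction(1)     # the bat costs £1 more than the ball

ball = ball_price(total, difference)
bat = ball + difference
print(f"ball = £{float(ball):.2f}, bat = £{float(bat):.2f}")

# The intuitive answer of 10 pence fails the check against the total.
intuitive_ball = Fraction(10, 100)
intuitive_total = intuitive_ball + (intuitive_ball + difference)
print(f"intuitive answer implies a total of £{float(intuitive_total):.2f}")
```

Writing the problem as a single equation makes the constraint that the miserly response skips explicit: the two prices must both differ by £1 and sum to £1.10.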

The cognitive miser tendency represents a processing problem of the human brain – it is a problem of fluid rationality. The second broad reason that humans can be less than rational derives from a content problem – when the tools of rationality (probabilistic thinking, logic, scientific reasoning) represent declarative knowledge that is often incompletely learned, inaccurate or not acquired at all. To illustrate how assigning the correct probability values to events is a critical aspect of rational thought, consider the following problem in which both medical personnel and laypersons are often caught out:

Imagine that the XYZ virus causes a serious disease that occurs in 1 in every 1000 people. There is a test to diagnose the disease that always indicates correctly that a person who has the XYZ virus actually has it. However, the test has a false-positive rate of 5 per cent – the test wrongly indicates that the XYZ virus is present in 5 per cent of the cases where it is not. Now imagine that we choose a person randomly and administer the test, and that it yields a positive result (indicates that the person is XYZ-positive). What is the probability that the individual actually has the XYZ virus?

The point is not to get the precise answer so much as to see whether you are in the right ballpark. The answers of many people are not. The most common answer given is 95 per cent. Actually, the correct answer is approximately 2 per cent! Why is the answer 2 per cent? Of 1000 people, just one will actually be XYZ-positive. If the other 999 are tested, the test will indicate incorrectly that approximately 50 of them have the virus (0.05 multiplied by 999) because of the 5 per cent false-positive rate. Thus, of the 51 patients testing positive, only one (approximately 2 per cent) will actually be XYZ-positive. In short, the base rate is such that the vast majority of people do not have the virus. This fact, combined with a substantial false-positive rate, ensures that, in absolute numbers, the majority of positive tests will be of people who do not have the virus.
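The same reasoning can be written as a direct application of Bayes' rule. The following is a minimal sketch using the figures from the XYZ problem (base rate 1 in 1000, a test that never misses a true case, and a 5 per cent false-positive rate); the function and parameter names are ours, chosen for illustration.

```python
from fractions import Fraction


def posterior_given_positive(base_rate, sensitivity, false_positive_rate):
    """P(has virus | positive test) by Bayes' rule."""
    true_positives = base_rate * sensitivity
    false_positives = (1 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


p = posterior_given_positive(
    base_rate=Fraction(1, 1000),           # 1 in every 1000 people
    sensitivity=Fraction(1),               # the test always detects a true case
    false_positive_rate=Fraction(5, 100),  # 5 per cent false positives
)
print(f"P(virus | positive) = {p} = {float(p):.1%}")
```

The exact value is 20/1019, just under 2 per cent, which matches the count of roughly one true case among 51 positive results.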

Rational thinking errors due to such knowledge gaps can occur in a potentially large set of domains including probabilistic reasoning, causal reasoning, knowledge of risks, logic, practical numeracy, financial literacy, and scientific thinking (the importance of alternative hypotheses, etc.).

In other publications (e.g. Stanovich, 2011) we have provided numerous examples of tasks like those described above that measure each of the rational thinking concepts in Table 1. Our framework illustrates the basis for our position that there is no conceptual barrier to creating a test of rational thinking. However, this does not mean that it would be logistically easy. Quite the contrary, we have stressed that both fluid and crystallised rationality are likely to be more multifarious than their analogous intelligence constructs. Likewise, we are not claiming that there presently exist comprehensive assessment devices for each of these components. Indeed, refining and scaling up many of the small-scale laboratory demonstrations in the literature will be a main task of our research. Our present claim is only that, in every case, laboratory tasks that have appeared in the published literature give us, at a minimum, a hint at what comprehensive assessment of the particular component would look like.

The ability to measure individual differences in rational thinking could have profound social consequences. In a recently published book (Stanovich, 2011), we showed how shortcomings in each of the subcomponents of rational thought have been linked to real-life outcomes of practical importance, including: physicians choosing suboptimal medical treatments; people failing to accurately assess risks in their environment; the misuse of information in legal proceedings; millions of dollars spent on unneeded projects by government and private industry; parents failing to vaccinate their children; unnecessary surgery; billions of dollars wasted on quack medical remedies; and costly financial misjudgements (Baron, 2008; Stanovich, 2009).

Likewise, a flurry of recent books from researchers such as Dan Ariely, Richard Thaler and Cass Sunstein has outlined practical, real-life thinking domains where people obtain suboptimal outcomes because they make rational thinking errors. For example, suboptimal investment decisions have been linked to overconfidence, the tendency to over-explain chance events, and the tendency to allow emotions to cloud judgement – all components of our rational thinking test. It is critically important to realise that intelligence has been shown to be an insufficient inoculation against these thinking errors and their negative consequences.

In summary, we have coherent and well-operationalised concepts of rational action and belief formation (for example, the Nobel-winning work of Kahneman and much related research). We also have a coherent and well-operationalised concept of intelligence. No scientific purpose is served by fusing these concepts, because they are very different. To the contrary, scientific progress is made by differentiating concepts. We have a decades-long history of measuring the intelligence concept. It is high time we put equal energy, as a discipline, into the measurement of a mental quality that is just as important – rationality.