Intelligence Testing: Accurate or Extremely Biased?

In the early 1900s, psychologist Charles Spearman noticed that children who did well in one school subject were likely to do well in others, and those who did poorly in one subject were likely to do poorly across the board. He concluded that a single factor, g, underlies performance across tests (Spearman 1904). The g factor, sometimes called “general intelligence”, captures the portion of the variance in test performance between individuals that is shared across different tasks.
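Spearman's observation amounts to saying that scores in different subjects are positively correlated with one another. The short Python sketch below illustrates that idea only; the subject names and marks are invented for illustration and are not Spearman's data, and the helper names `pearson` and `correlation_table` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_table(table):
    """Correlation for every distinct pair of subjects in the table."""
    subjects = list(table)
    return {
        (a, b): pearson(table[a], table[b])
        for i, a in enumerate(subjects)
        for b in subjects[i + 1:]
    }


# Hypothetical marks for six pupils in three subjects (invented numbers).
scores = {
    "reading": [72, 55, 88, 40, 65, 93],
    "spelling": [68, 50, 91, 45, 60, 85],
    "arithmetic": [75, 48, 80, 52, 58, 90],
}

pairwise = correlation_table(scores)
for (a, b), r in pairwise.items():
    print(f"{a} vs {b}: r = {r:.2f}")
```

When every pairwise correlation comes out positive, as it does for these made-up marks, a single common factor is one way to summarize the pattern. Estimating g itself requires factor analysis, which this sketch does not attempt.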

Later on, psychologist Raymond Cattell determined that g has two components, called fluid intelligence (denoted Gf) and crystallized intelligence (denoted Gc). Fluid intelligence is defined as abstract reasoning or logic; it is an individual’s ability to solve a novel problem or puzzle. Crystallized intelligence is more knowledge-based, and is defined as the ability to use one’s learned skills, knowledge, and experience (Cattell 1987). It is important to note that while crystallized intelligence relies on knowledge, it is not a measure of knowledge but rather a measure of the ability to use one’s knowledge.

The first standardized intelligence test was created in 1905 by French psychologist Alfred Binet as a method to screen for mental retardation in French schoolboys. The test measured intelligence by comparing an individual’s score to the average score of children his own age (Binet 1905). The test was later revised by Lewis Terman of Stanford University and named the Stanford-Binet Intelligence Scales. The Stanford-Binet is now in its fifth edition and includes five sections: fluid reasoning, knowledge, quantitative reasoning, visual-spatial processing, and working memory.
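The idea of comparing one child's score to the scores of same-age peers can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Binet's 1905 scoring procedure: it uses the deviation-score convention common in many modern tests (same-age average mapped to 100, one standard deviation mapped to 15 points), and the function name `age_normed_score` and the peer scores are hypothetical.

```python
from statistics import mean, pstdev


def age_normed_score(raw_score, same_age_scores, center=100.0, spread=15.0):
    """Rescale a raw score relative to the scores of same-age peers.

    Assumption: a deviation-style scale in which the peer average maps to
    `center` and one peer standard deviation maps to `spread` points.
    """
    mu = mean(same_age_scores)
    sigma = pstdev(same_age_scores)
    if sigma == 0:
        raise ValueError("peer scores must vary to define a scale")
    z = (raw_score - mu) / sigma  # distance from the peer average, in SD units
    return center + spread * z


# Invented raw scores for a group of children of the same age.
peers = [20, 24, 28, 32, 36]

print(round(age_normed_score(28, peers)))  # a child scoring at the peer average
print(round(age_normed_score(36, peers)))  # a child scoring above the peer average
```

A score near 100 on this scale means "typical for this age group", which is why the same raw score can translate to different scaled scores for children of different ages.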

Since the Stanford-Binet, many other standardized intelligence scales have been developed. One of the most popular modern intelligence tests is Raven’s Progressive Matrices (RPM) (Raven 2003). The test presents a series of boxes, each containing shapes that change from box to box, along with one empty box. The test taker must recognize the pattern and choose, from a collection of options, the shape that belongs in the empty box. Unlike the Stanford-Binet, RPM is entirely visual; the test taker does not have to answer written questions, so the measured IQ does not depend on reading comprehension. This reduces the influence of variables such as native language, age, and possible reading disability.

A general example of the questions on the Raven’s Progressive Matrices test.

So what exactly are these IQ tests measuring? The Stanford-Binet measures g through tasks that measure both Gf and Gc. Because RPM is entirely non-verbal and puzzle-based, it almost exclusively measures Gf.

Which brings us to the next question: are these tests effectively measuring g?

Since their creation, modern Western intelligence tests have shown differences in average scores from group to group: whites score higher than blacks, and the rich score higher than the poor. In some tests, women and men score differently from task to task. Are these differences due to heritable differences in intelligence across race, gender, and socioeconomic status? Or are environment, schooling, and stigma to blame? Or are the tests themselves flawed?

While intelligence tests claim to be culture-fair, none of the tests created so far is one hundred percent unbiased. As Serpell (1979) found, when asked to reproduce figures using wire, pencil and paper, and clay, Zambian children performed better in the wire task, while English children performed better in the pencil and paper task. Each group did better in the medium to which it was more accustomed. Pencil and paper IQ tests may therefore be intrinsically biased towards Western culture.

Furthermore, while African-Americans have historically scored lower than white Americans on intelligence tests, this gap has been narrowing in recent years (Dickens and Flynn 2006). This could be the result of one of two things. The first possibility is that average intelligence is increasing in the black community at a higher rate than in the white community (measured intelligence has been steadily increasing across all groups, a trend known as the Flynn effect). However, it seems more likely that post-segregation, white and black cultures have been merging, and schools have been integrated, meaning that white and black children have a better chance of receiving the same education. If this is the case, either IQ tests measure knowledge more than their creators think they do, or the tests are extremely culturally biased and this bias is lessening due to the assimilation of white and black culture in America.

Not only are intelligence tests culturally biased, but they also seem to be biased in favor of neurotypical individuals. For example, while typically developing individuals generally perform similarly on RPM and the Wechsler Adult Intelligence Scale (WAIS), individuals with autism typically score higher on RPM than on WAIS (Bolte et al. 2009, Mottron 2004). This is because while RPM is a visual task, WAIS is almost entirely verbal. Individuals with autism seem to use visual strategies to solve tasks and therefore have difficulty on tasks that can only be solved verbally (Kunda and Goel 2010). While this pattern is typically seen as a cognitive deficit, it is important to note that autistic individuals outperform neurotypical individuals on some visual tasks.

Therefore, by only measuring one specific part of intelligence, some IQ tests portray autistic individuals as having a cognitive deficit. What if some disorders, such as autism, are not actually disorders, but simply a way of thinking that differs from what is considered “normal”?

For example, Dr. Temple Grandin, an autistic woman with a PhD in Animal Sciences, uses her incredible visual working memory to design cattle equipment that is much more humane and far less anxiety-inducing than previous models. Grandin says her autism allows her to see the world in pictures; her inner thoughts are entirely devoid of language, and she simply thinks in extremely detailed movies. She says her visual memory and sensitivity to details have made her good at designing things, because details that neurotypical people gloss over are extremely important to her and end up making a huge difference in the efficiency of the final product.

Temple Grandin utilized her incredible working memory to design humane cattle-holding equipment for the agriculture industry.

Autism may not be the only example of a disorder being mischaracterized. Studies have shown that children with ADHD have, on average, lower IQs than neurotypical children (Kuntsi 2003). However, in his TEDx talk, Stephen Tonti, a senior at Carnegie Mellon, discusses why he believes ADHD is not a disorder, but simply a difference in cognition. Tonti argues that viewing ADHD as a disorder implies that it needs to be fixed. He states that his ADHD makes him better at some tasks than neurotypical individuals, and that the world needs a diversity of cognition in order to run smoothly.

Therefore, while IQ tests are intended to measure intelligence, they often measure only one type of intelligence, and are therefore biased against certain groups of people. By trying to fit cognition into a box, IQ testing devalues cognitive diversity, and this may have real costs. By telling an individual that their intelligence is low when in fact it is simply different, we could not only be holding people back, but also depriving the world of a diverse group of thinkers who could solve problems from a different perspective.

Even if current IQ tests are not fair across all groups, the future of intelligence testing may be brighter; as discussed previously on the Neuroethics Blog, fMRI intelligence testing could eliminate biases in intelligence testing. By observing test takers’ thought processes in action, researchers would be able to see which brain pathways a subject recruits to solve a test, and whether he or she uses a visual or verbal approach to the question, thereby observing fluid and crystallized intelligence in action.

Comments

I read this quickly, so I may have missed it. When considering whether a test is biased, it's also important to consider whether it's measuring something that's meaningful and important. And then, of course, it's important to make sure that judgments we base on the test are meaningful and unbiased against immaterial matters. This includes cultural biases, but I'm mostly thinking about the neurotypical bias you discuss. Inability to perform verbal tasks is indeed a great handicap in most social, academic, and professional situations, and so is a meaningful difference.

As far as the question about the tests being culturally biased, they almost certainly are, but again, some of that is likely due to cultural differences in acquiring basic skills that are also being tested. This is a challenge for our educational system, and it raises questions about how we use these scores and how we fine-tune them, but just being "biased" is not an absolute dismissal of the scores or the idea of the scores.

I do get a little tired of people throwing this excuse up; scholastic effort has a much greater effect than any possible cultural bias. My interest is in recruiting and, bias or not, IQ tests have shown high correlation with performance across a wide range of job skills. If we are testing for performance and are to employ on merit, IQ testing still has relevance.

IQ/SAT-type tests are big determinants of who gets into college and who gets jobs. It's a system that flagrantly overlooks factors like education, skills, motivation, work history, conscientiousness, or special accolades. That makes them rather convenient rationalist tools that are surely abused against the various 'study cultures' (Jews, Nigerians, Indians, Cubans, Asians) who emphasize strengths in (developed) empirical intuition, so god forbid they take jobs they deserved and were employed for before the 'education doesn't matter' culture. In fact, evolution selected for the development of education for the work force, so it should be illegal to give IQ tests for recruiting. But corporations love IQ, because they are interested in the general adaptability of an employee, not some guy who is going to crack a few big problems at the theoretical level. Unfortunately, that results in a lot of people who can spin complex calculus equations in their head working sales jobs.

Emily, in your post, you briefly talk about how people with autism score higher on one type of IQ test than another because of the way they think and their ability to process and respond with visuals, as opposed to verbal responses. You also asked the question “What if some disorders, such as autism, are not actually disorders, but simply a way of thinking that differs from what is considered ‘normal’?” Your question made me think that maybe you are right: people with autism are highly intelligent, but the “normal” test given to them places them very low on the scale of intelligence, in some cases labeling them mentally retarded. Over the last few months, I have been researching IQ tests and how they may be biased against people with autism. As I was researching this, I stumbled upon an article titled “Hidden strengths” by Nicholette Zeliadt. In this article Zeliadt mentions a clinical neuropsychologist, Isabelle Soulières, and her studies to reveal autism's “hidden strengths”. During her studies, she individually tested a group of children and their memory skills. During these tests, Soulières showed each child a picture of a single clown, then one of two clowns, and asked each child to identify the clown from before. One autistic girl in particular didn’t point to either clown, as many of the others did, but instead went on to demonstrate that she had an excellent memory when she reached into a drawer for a toy she had played with two weeks prior. I found this story more eye-opening than all the others, because we hold everyone to the same standards for IQ tests, but not everyone is capable of thinking the same way and not everyone can communicate what they are thinking, although they may understand the question or situation at hand. These things make IQ tests unsuitable for those who aren't classified as “normal”.
I hope this small bit of information adds to your thoughts on autism and on how autistic people's different ways of thinking lead others to believe that they aren't “normal”.