A.I. robot scores higher than four-year-old on I.Q. test

The gap between human and artificial intelligence (A.I.) is narrowing. An A.I. system at the Massachusetts Institute of Technology, dubbed ConceptNet, recently took an I.Q. test for children and scored higher than the average four-year-old, and the system is expected to become smarter.

Machines are becoming better at performing various tasks, including playing chess, recognizing pictures and holding intelligent conversations. ConceptNet was put through five tests, from word reasoning to vocabulary. When the results were in, ConceptNet had achieved a score of 69, whereas the typical four-year-old scores an average of 50 on the same test.1

ConceptNet is an open source project headed by the MIT Common Sense Computing Initiative. The aim of the project is to collect data about what people know and machines don’t, such as how words relate to concepts, and to turn that data into a computer algorithm.1

Relating Concepts

The team tested ConceptNet 4. When they asked the A.I. system to explain what a fawn was, it understood that “a fawn is a deer” rather than “a deer is a fawn.” In other words, the system knew which way the relationship between the two words runs: every fawn is a deer, but not every deer is a fawn, just as a deer is an animal but an animal isn’t necessarily a deer.1

ConceptNet 4 also uses what is called a polarity flag to mark negative statements, such as “penguins can’t fly.”1
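The two ideas above, a relationship that holds in one direction only and a polarity flag for negative statements, can be sketched in a few lines of Python. This is an illustrative toy, not ConceptNet's actual data format or code: the Assertion class, the holds function and the three stored facts are hypothetical, and the relation names are only modeled on the kind of labels a common-sense knowledge base uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Assertion:
    """One directed statement: subject --relation--> obj (hypothetical layout)."""
    subject: str
    relation: str
    obj: str
    polarity: int = 1  # +1 means the statement is asserted, -1 means it is negated


# A tiny hand-written knowledge base, for illustration only.
ASSERTIONS = {
    Assertion("fawn", "IsA", "deer"),
    Assertion("deer", "IsA", "animal"),
    Assertion("penguin", "CapableOf", "fly", polarity=-1),
}


def holds(subject: str, relation: str, obj: str) -> Optional[bool]:
    """Return True if asserted, False if negated, None if nothing is known."""
    for a in ASSERTIONS:
        if (a.subject, a.relation, a.obj) == (subject, relation, obj):
            return a.polarity > 0
    return None


print(holds("fawn", "IsA", "deer"))          # asserted
print(holds("deer", "IsA", "fawn"))          # unknown: the relation is directed
print(holds("penguin", "CapableOf", "fly"))  # negated by the polarity flag
```

Because each statement is stored with a fixed subject and object, reversing the two words gives no match, and the polarity value lets the same structure record something that is known to be false rather than merely unknown.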

The team used the Wechsler Preschool and Primary Scale of Intelligence (WPPSI) on the system. The WPPSI-III IQ test consists of 14 subtests and assesses both Performance IQ and Verbal IQ (VIQ).1

In order to measure Performance IQ, children are asked to draw pictures, solve puzzles and perform various memory tasks. The Verbal IQ test is designed to test a child’s vocabulary and the scope of their reasoning abilities.1

ConceptNet had a wider vocabulary than the average four-year-old, scoring 20 against the children’s average of 13 on that subtest.1

The A.I. system did not do well on every test, however. The average four-year-old performed better at Word Reasoning and Comprehension, scoring seven on both tests, whereas ConceptNet scored only three.

“ConceptNet 4 did dramatically worse than average on comprehension – the ‘why’ questions,” said Robert Sloan, professor and head of computer science at the University of Illinois at Chicago (UIC) and lead author of the study. “If a child had scores that varied this much, it might be a symptom that something was wrong.”1

Gap between third-person and first-person data

A.I. systems are devoid of the subjective, qualitative experience of conscious creatures. It should therefore be unsurprising that ConceptNet fell short on questions that draw on subjective experience. For example, it’s easier for an A.I. system to compute that water freezes at 32 degrees Fahrenheit than it is to compute that ice is cold.1

Furthermore, many of ConceptNet’s incorrect answers did not resemble the incorrect answers that children give. For example, during the Word Reasoning test, the system and the children were supposed to identify a lion from the following clues: “This animal has a mane if it is male,” “this is an animal that lives in Africa,” and “this is a big yellowish-brown cat.”

The top five answers from ConceptNet included dog, farm, creature, home, and cat.1

The “creature” and “cat” answers “are in the vague neighborhood of lion,” said the researchers; “however, the other answers are clear violations of common sense.”

“Common sense should at least confine the answer to animals, and should also make the simple inference that, ‘if the clues say it is a cat, then types of cats are the only alternatives to be considered.’”1
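The kind of constraint the researchers describe can be illustrated with a short Python sketch that discards any candidate answer outside a required category. It is only a toy under stated assumptions: the IS_A table and both functions are hypothetical and hand-written for this example, and nothing here reflects how ConceptNet itself ranks answers.

```python
# Hypothetical "is a kind of" links, written by hand for illustration.
IS_A = {
    "lion": "cat",
    "tiger": "cat",
    "cat": "animal",
    "dog": "animal",
    "animal": "creature",
}


def is_kind_of(term, category):
    """Follow the links upward from term and report whether category is reached."""
    while term is not None:
        if term == category:
            return True
        term = IS_A.get(term)  # None once the chain runs out
    return False


def filter_by_category(candidates, category):
    """Keep only the candidates that fall under the given category."""
    return [c for c in candidates if is_kind_of(c, category)]


# The five answers reported for the lion question.
answers = ["dog", "farm", "creature", "home", "cat"]

print(filter_by_category(answers, "animal"))  # ['dog', 'cat']
print(filter_by_category(answers, "cat"))     # ['cat']
```

Restricting the answers to animals already removes “farm” and “home,” and applying the clue that the answer is a cat would leave only cats in the running.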