On June 7, 2014, a Turing-Test competition, organized by the University of Reading to mark the 60th anniversary of Alan Turing's death, was won by a Russian chatterbot named Eugene Goostman, which posed as a teenage boy from Ukraine and convinced one-third of the judges that it was human. The media was abuzz, claiming that a machine had finally passed the Turing Test.

The test was proposed by Turing in his 1950 paper, "Computing Machinery and Intelligence," in which he considered the question, "Can machines think?" In order to avoid the philosophical conundrum of having to define "think," Turing proposed an "Imitation Game," in which a machine, communicating with a human interrogator via a "teleprinter," attempts to convince the interrogator that it (the machine) is human. Turing predicted that by the year 2000 it would be possible to fool an average interrogator with probability of at least 30%. That prediction led to the claim that Goostman passed the Turing Test.

While one commentator argued that Goostman's win meant that "we need to start grappling with whether machines with artificial intelligence should be considered persons," others argued that Goostman did not really pass the Turing Test, or that another chatterbot—Cleverbot—already passed the Turing Test in 2011.

The real question, however, is whether the Turing Test is at all an important indicator of machine intelligence. The reality is the Imitation Game is philosophically the weakest part of Turing's 1950 paper. The main focus of his paper is whether machines can be intelligent. Turing answered in the affirmative and the bulk of the paper is a philosophical analysis in justification of that answer, an analysis that is as fresh and compelling today as it was in 1950. But the analysis suffers from one major weakness, which is the difficulty of defining intelligence. Turing decided to avoid philosophical controversy and define intelligence operationally—a machine is considered to be intelligent if it can act intelligently. But Turing's choice of a specific intelligence test—the Imitation Game—was arbitrary and lacked justification.

The essence of Turing's approach, which is to treat the claim of intelligence of a given machine as a theory and subject it to "Popperian falsification tests," seems quite sound, but this approach requires a serious discussion of what counts as an intelligence test. In a 2012 Communications article, "Moving Beyond the Turing Test," Robert M. French argued the Turing Test is not a good test of machine intelligence. As Gary Marcus pointed out in a New Yorker blog, successful chatterbots excel more in quirkiness than in intelligence. It is easy to imagine highly intelligent fictional beings, such as Star Trek's Mr. Spock, badly failing the Turing Test. In fact, it is doubtful whether Turing himself would have passed the test. In a 2003 paper in the Irish Journal of Psychological Medicine, Henry O'Connell and Michael Fitzgerald concluded that Turing had Asperger syndrome. While this "diagnosis" should not be taken too seriously, there is no doubt that Turing was highly eccentric, and, quite possibly, might have failed the Turing Test had he taken it—though his own intelligence is beyond doubt.

In my opinion, Turing's original question "Can machines think?" is not really a useful question. As argued by some philosophers, thinking is an essentially human activity, and it does not make sense to attribute it to machines, even if they act intelligently. Turing's question should have been "Can machines act intelligently?," which is really the question his paper answers affirmatively. That would have led Turing to ask what it means to act intelligently.

As with Popperian falsification tests, one should not expect a single intelligence test, but rather a set of intelligence tests, inquiring into different aspects of intelligence. Some intelligence tests have already been passed by machines, for example, chess playing, autonomous driving, and the like; some, such as face recognition, are about to be passed; and some, such as text understanding, are yet to be passed. Quoting French, "It is time for the Turing Test to take a bow and leave the stage." The way forward lies in identifying aspects of intelligent behavior and finding ways to mechanize them. The hard work of doing so may be less dramatic than the Imitation Game, but not less important. This is exactly what machine-intelligence research is all about!

Comments

Aly Farahat

August 23, 2014 12:16

" As argued by some philosophers, thinking is an essentially human activity, and it does not make sense to attribute it to machines, even if they act intelligently."
I believe that, by understanding the dynamics of our brains attributed to what we call "thinking", we shall eventually be able to "imitate" or "simulate" "thinking" by a machine.
Unless you characterize "thinking", by definition, as a very specific activity in which only humans can take part, the behavioral, brain-dynamic, and motor aspects surrounding the act of thinking are reproducible by a "carefully crafted" physical system that simulates the mathematics of the brain.

CACM Administrator

October 07, 2016 01:21

The following letter was published in the Letters to the Editor in the April 2015 CACM (http://cacm.acm.org/magazines/2015/4/184688).
--CACM Administrator

We wish to clarify an account of the 2014 Turing Test experiment we conducted at the Royal Society London, U.K., as outlined by Moshe Y. Vardi in his Editor's Letter "Would Turing Have Passed the Turing Test?" (Sept. 2014). Vardi was referring to a New Yorker blog by Gary Marcus, rather than to our experiment directly. But Marcus had no first-hand experience with our 2014 experiment, nor had he seen any of our Turing Test conversations.

Our experiment involved 30 human judges, 30 hidden humans, and five machines: Cleverbot, Elbot, Eugene Goostman, JFred, and Ultra Hal; for background and details see http://turingtestsin2014.blogspot.co.uk/2014/06/eugene-goostman-machine-convinced-3333.html. We used social media to recruit judges and a variety of hidden humans, including males, females, adults, teenagers, experts in computer science and robotics, and non-experts, including journalists, lecturers, students, and interested members of the public.

Prior to the tests, the judges were unaware of the nature of the pairs of hidden entities they would be interrogating; we told them only that they would simultaneously interrogate one human and one machine for five minutes and that the human could be a male or female, child or adult, native English speaker, or non-native English speaker. We asked the hidden humans to be themselves, that is, to be human.

The 30 judges, each given an anonymous experiment identity labeled J1 to J30, interrogated five pairs of hidden entities. Likewise, each human and machine was given a unique identity, E1 to E35. We ran 150 "simultaneous comparison" Turing Tests in which we instructed the judges that their task was to determine which was human and which was machine in the pair, a decision to be made based solely on the responses the hidden entities posted in reply to what a judge said.

Eugene Goostman was not correctly identified as the machine in the pair in 10 of its 30 tests; that is, 10 judges did not recognize it was a machine. Eugene Goostman's personality is that of a 13-year-old boy from Odessa, Ukraine, a character we do not consider contrary to Alan M. Turing's vision for building a machine to think. In 1950, Turing said, "Instead of trying to produce a programme to simulate the adult mind, why not rather try to produce one which simulates the child's?"

The figure here includes one simultaneous conversation from the experiment, showing one of Judge J19's tests after that judge simultaneously interacted with two hidden entities, in this case E20 and E24. In this test, E20's responses to the judge were relayed to a message box displayed on the left of the judge's screen; E24's answers were relayed on the right. Timings and text are exactly as they were in the test.

So, could you "pass the test" and be able to say which of the two entities E20 and E24 is the human and which the machine?

Huma Shah
London, U.K.
Kevin Warwick
Reading, U.K.

-----------------------------------------

AUTHOR'S RESPONSE:

The details of this 2014 Turing Test experiment only reinforce my judgment that the Turing Test says little about machine intelligence. The ability to generate a human-like dialogue is at best an extremely narrow slice of intelligence.