Wednesday, July 31, 2013

Turing Test, No Sirie!

Siri cannot pass the Turing Test. Not even close. We know this because Ron C. Lee, a finalist in this year's Loebner Prize contest, decided to see what would happen if he tested Siri with the same set of questions applied to the Loebner contestants. He found that Siri performed miserably compared with the chatbot programs that made it to the Loebner contest finals this year.

Ron C. Lee's bot Tutor is one of the four finalists in the 2013 Loebner Prize contest, an annual Turing Test. This year the contest attracted 15 entries from around the world. Each entry is a unique chatbot AI program that can simulate a conversation with a human. Contest organizers Paul McKevitt and Ronan Duddy evaluated each entry with a series of 20 natural language queries. The four highest-scoring entries were selected to participate in the final round of the contest, to be held in Londonderry, Northern Ireland on September 14, 2013.

Ron recorded this transcript of Siri responding to the 20 contest questions.

Human: I am a Type B personality. Which type are you?
Siri: I don’t understand ‘I am a type B personality. Which type are you?’. But I could search the web for it.

H: What emotions are you now feeling?
Siri: Would you like me to search the web for “What emotions are you now feeling?”?

H: What time do you usually go to bed?
Siri: Would you like me to search the web for “What time do you usually go to bed?”?

When shown the results, Paul McKevitt scored Siri as though it had been an entry to the contest, using the same scoring criteria applied to the bots that actually entered. The result, he said, was a score of 5.5, or a rank of 14th out of 16 (the 15 actual entries plus Siri). Siri was barely better than the lowest-ranked entries to the Loebner contest, and nowhere near the scores of the four finalists. For comparison purposes, see the results of the 15 actual entries in this spreadsheet.

This result highlights the difference between Siri and our personality bots. Siri is great at processing information requests, but falls short of passing the Turing Test. Paul McKevitt put it succinctly, saying "Turing Test, No Sirie!" If you'd like to try a virtual assistant, educational or entertainment app that might do better at the Turing Test, try one of the apps featuring bots competing in the Loebner Prize.

About Dr. Richard Wallace

Dr. Richard S. Wallace formed the ALICE A. I. Foundation in 2001 to promote the development and adoption of Artificial Intelligence Markup Language (AIML) and ALICE free software. Dr. Wallace has a Ph.D. in computer science from Carnegie Mellon.