schwit1 (797399) writes "Eugene Goostman, a computer program pretending to be a young Ukrainian boy, successfully duped enough humans to pass the iconic test. The Turing Test, which requires that computers are indistinguishable from humans, is considered a landmark in the development of artificial intelligence, but academics have warned that the technology could be used for cybercrime. Computing pioneer Alan Turing said that a computer could be understood to be thinking if it passed the test, which requires that a computer dupes 30 per cent of human interrogators in five-minute text conversations."

You may consider it verified... subjectively, by a panel of judges, under very narrowly defined circumstances.

In more seriousness, GP makes a very important point. Not only was this nothing like a real Turing test (a computer would have to fool the average person in more generalized and everyday circumstances for that to happen), the real point here is that we have learned since the days of Turing that even the full-blown Turing test doesn't really indicate much of anything.

People were fooled (really, really fooled) by Eliza way back in the day. It doesn't mean squat.

What has been conducted precisely matches Turing's proposed imitation game. I don't know what you mean by a "full-blown Turing test"; the imitation game is what it has always meant, including the 30% bar (because the human has three options - human, machine, don't know). Of course, it is nowadays not considered a final goal, but it is still a useful landmark even if we have a long way to go.

That's the trouble with AI: the expectations are perpetually shifting. A few years in the past, a hard task is considered impossible for computers to achieve, or at least many years away. Then it's passed and the verdict promptly shifts to "well, it wasn't that hard anyway and doesn't mean much", and a year from now we take the new capability of machines as a given.

What has been conducted precisely matches Turing's proposed imitation game.

NO, it DEFINITELY does NOT. For just one example, it tries to get around the "natural language" stipulation by pretending to be someone who doesn't fully know that language, and uses a simplified version instead.

That is a very clear attempt to subvert the rules.

I could go on, but it isn't necessary. It wasn't a real Turing test. We can leave aside the other nuances because the first criterion wasn't met.

What has been conducted precisely matches Turing's proposed imitation game.

While they may have matched the letter of it, they subverted the spirit of the test. This quote [independent.co.uk] from the programme maker in particular is highly suggestive that they lowered the standards:-

The computer programme claims to be a 13-year-old boy from Odessa in Ukraine.

"Our main idea was that he can claim that he knows anything, but his age also makes it perfectly reasonable that he doesn't know everything," said Vladimir Veselov, one of the creators of the programme. "We spent a lot of time developing a character with a believable personality."

To illustrate what I mean by lowered standards, imagine if I set up the same test, with 10 entries, and I tell the judges some of them are 2 year old babies playing on the keyboard. Armed with this information, some of the judges are likely to interpret even gibberish as typed by a human and it is not too farfetched to get more than 30% of them to agree.

This "result" is bollocks and a pure publicity stunt conveniently falling on the 60th anniversary of Turing's death.

I want to see the actual transcripts which do not appear to have been released so far, which in itself is highly suspicious.

Here was a sample of a hypothetical conversation from Turing's original article [loebner.net]:

Interrogator: In the first line of your sonnet which reads "Shall I compare thee to a summer's day," would not "a spring day" do as well or better?

Witness: It wouldn't scan.

Interrogator: How about "a winter's day," That would scan all right.

Witness: Yes, but nobody wants to be compared to a winter's day.

Interrogator: Would you say Mr. Pickwick reminded you of Christmas?

Witness: In a way.

Interrogator: Yet Christmas is a winter's day, and I do not think Mr. Pickwick would mind the comparison.

Witness: I don't think you're serious. By a winter's day one means a typical winter's day, rather than a special one like Christmas.

I think the problem is that the way Turing was picturing the test, the human interrogators would be as smart as Turing and his friends, people who actually know how to ask probing questions. When you look at the conversation above, you see that he had in mind a program that does things that are decades beyond what chatbots can do today. Everybody is dissing the Turing test, and if it has a problem, it's that Turing overestimated people, in assuming that they actually know how to have conversations of significance. I still think there is something deeply significant about the Turing test, but in the one that I'm picturing, the interrogators must all be broadly educated experts on natural language processing with specific training in how to expose chatbots. And there should be money on the line for the interrogators: $1000 bonus for each correct identification, $2000 penalty for incorrect identification, no penalty for "not sure". If the majority of such experts can be fooled by an AI under these circumstances, then I think we should all be impressed.
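To put numbers on that payoff scheme, here is a minimal sketch. It assumes a risk-neutral interrogator who only cares about expected dollars; the function names are made up for illustration. With a $1000 bonus and a $2000 penalty, committing to an answer only pays off when the interrogator is more than two-thirds sure, so anything less confident should come back as "not sure".

```python
def expected_payoff(p_correct, bonus=1000, penalty=2000):
    """Expected dollars from committing to an answer.

    p_correct is the interrogator's own probability of being right.
    Answering "not sure" always pays 0 under the proposed scheme.
    """
    return p_correct * bonus - (1 - p_correct) * penalty


def break_even(bonus=1000, penalty=2000):
    """Confidence at which committing is worth exactly as much as "not sure"."""
    return penalty / (bonus + penalty)


if __name__ == "__main__":
    for p in (0.5, 0.7, 0.9):
        print(f"confidence {p:.0%}: expected payoff {expected_payoff(p):+.0f}")
    print(f"break-even confidence: {break_even():.3f}")
```

So under these assumptions the scheme rewards judges for admitting doubt rather than guessing, which seems to be the intent.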

People were fooled (really, really fooled) by Eliza way back in the day. It doesn't mean squat.

No. They weren't. I speak as somebody who's had a go with Eliza and you could spot that it was a computer program in a couple of minutes if you wanted to. It's more likely that people were suspending their disbelief than really fooled.

I was a BBS operator in the early 1990s. I had a game, which I titled "in case you really need for chat". It was an Eliza program that I had somewhat tuned to speak as I would (and translated to my local language). Plus, the user got to see the pretended typing in real time, even with some typos and corrections.

Looking at the log files was *really* worth a laugh. But it made me feel wrong — Some users left in disgust, after "I" had insulted them.

And yes, they were not really aware I was playing a Turing test on them, so I don't know if this would have validity. But, by 1994 standards, I do believe it was quite an achievement (or perhaps, my users were mostly silly teens just like myself, and not worthy deciders for what constituted intelligent behaviour).

Bernie Cosell has this story about an exec being horribly tricked by his early Eliza bot:

"I got a little glimmer of fame because Danny Bobrow wrote up 'A Turing Test Passed'..... One of the execs at BBN came into the PDP-1 computer room and thought that Danny Bobrow was dialed into that and thought he was talking to Danny. For us folk that had played with ELIZA, we all recognized the responses and we didn't know how humanlike they were. But for somebody who wasn't real familiar with ELIZA, it seemed perfectly reasonable. It was obnoxious but he actually thought it was Danny Bobrow. 'But tell me more about--' 'Earlier, you said you wanted to go to the client's place.' Things like that almost made sense in context, until eventually he typed something and he forgot to hit the go button, so the program didn't respond. And he thought that Danny had disconnected. So he called Danny up at home and yelled at him. And Danny has absolutely no idea what was going on......."

Last I heard, there were heavy restrictions on what types of questions could be asked. Second, from what I've seen, they are little more than cleverly created scripts, and as such, despite them fooling a few people, are in no way indicative of machine intelligence.

Not these days. Natural language parsers have reached the point where they can find motives such as revenge; they can even distinguish a heroic victory from a pyrrhic victory. They can do this without words such as "revenge" and "victory" appearing anywhere in the text. Turns out the most difficult text for an NLP system to "understand" is the text found in children's stories; it seems that (for some reason) kids' stories have more complicated back references than either journalism or adult stories.

As to TFA: Anyone pooh-poohing this result either does not understand it or has not bothered to look at the advances in AI over the last decade or so. We are at the point where a computer can read a novel and spit out a high school book report that would both fool and impress most English teachers, and it can do it in seconds, not days.

There are also a lot of posts claiming the Turing test doesn't mean anything. However, none of them I have read so far actually explain their statement, so I assume they are parroting their philosophy professor, who was probably referring to Searle's Chinese translation room [wikipedia.org] argument.

The problem with Searle's argument (aside from lacking a definition of intelligence) is that it assumes the intelligence is embedded in either the human or the books, then goes on to show that neither is true. It's basically an unintentional strawman argument. It completely misses the point that the intelligence is embedded in the entire system of human + books. In other words, the room itself is a black box that displays intelligent behaviour, in much the same way as the human brain is a black box that (sometimes) produces intelligent behaviour. Like it or not, your soul is a mathematical object [youtube.com].

So now that we have Searle out of the way, has anybody got an actual argument that supports the notion that the Turing Test is broken by design? - Seriously, I would like to hear a good one!

It's a bit of an underhanded way to pass to pretend to be someone who doesn't speak English natively. The point of the test is to have a conversation for 5 minutes, not 5 minutes of "oh I can't understand you because I'm from Ukraine".

It's still better than previous attempts. That's the point. Nobody claimed the machine is actually a thinking entity. It's just a good enough algorithm to fool some of the people some of the time. Which is better than before. Where is the problem?

Turing had a far too good opinion of the human race (possibly not anymore towards the end of his life, when they had him chemically castrated because they did not like his homosexuality...), hence his parametrization sucks. Given that 90-95% of the human race are idiots that see what they want to see and not what is there, passing a Turing test involves just the right kind of deception, but no actual intelligence. The only thing the Turing test proves is hence that many humans do not have actual effective i

The Turing test is a great test if done properly (Turing wasn't envisioning Twitter). While it's hard to pin down a good definition of sapience/intelligence (people want to keep redefining it to what humans have and no computer or animal has demonstrated this year), a good answer comes from studying communication. Intelligence in that sense is the ability to resolve the ambiguity of natural language by interaction as well as context.

In a very shallow way, search engines do that now - with a big enough data set they don't need an abstract mental model to ask "did you mean X?" But that's not really interactive - it's a single suggestion, with nowhere to go from there. When you're walking your dog and someone greets you with "hey, that's a nice dog" is that a content-free politeness, a flirtation, a discussion about dog breeding, a polite reminder that your neighbors are watching to make sure you clean up after the dog?

Part of being a socialized human is resolving that sort of ambiguity gracefully. We have an abstract mental model of other people and their motivations (learned from growing up with others) and we can use it without even noticing how neat that is that we can do that. Posing as someone young and socially awkward precisely defeats the purpose of the test.

Another sort of conversation that's hard to simulate is the way enthusiasts about something technical will talk. While it's easy for the computer to have all the technical details handy for something like a sports car enthusiast and tuner, or a baseball stats hound, the test is in the way people actually talk about that stuff. You see a lot of it on /. Broad, passionate over-generalizations challenged, emotional argument becoming hot at first but then cooling as you discover that what you're really talking about is two different specific data points, and don't really disagree about anything important, just were over-generalizing from different things. That sort of conversation requires both a social abstraction and an abstraction of the topic at hand. E.g. "you think Honda engines are better because you think X is important in an engine, while I think Toyota engines are better because I think Y is important": to mutually understand that requires more than just a knowledge of parts lists; you have to understand why someone would care.

IMO, if you have an abstract mental model of both people and the meaningful objects in the world (and, critically, yourself), and you make decisions based on modeling the hypothetical results of those choices, you are sapient/intelligent. Without invoking the supernatural, that's all there is to have.

That's a pretty low bar. So to pass the test a computer needs three very low IQ subjects and seven normal people? Hell, the Alice program would probably pass. How about a more reasonable percentage, like 95%?

Is that 30% success rate actually meant to be the threshold to pass the test? From the article on Wikipedia [wikipedia.org] it simply looks like a prediction about how AIs in the future will fare:

Turing predicted that machines would eventually be able to pass the test; in fact, he estimated that by the year 2000, machines with 10 GB of storage would be able to fool 30% of human judges in a five-minute test, and that people would no longer consider the phrase "thinking machine" contradictory.

Turing machines are a thought experiment because of the unbounded tape, which a physical computer cannot match. Real computers are analogous to a linear bounded automaton, on which halting is solvable but not always tractable.

Naa, that thought is too complicated and may not fit into their tiny minds. Remember that Turing was on high genius level. That means something like 95% of the human race cannot actually follow his thoughts.

Did anyone ask it the questions we already know will trip up a non-human?

"You're in a desert, walking along in the sand when all of a sudden you look down and see a tortoise..."

"You're watching a stage play. A banquet is in progress. The guests are enjoying an appetizer of raw oysters. The entree consists of boiled dog..."

Aehm, these things happen in reality? Although boiling the dog is a rather bland way to prepare it. For some more inspiration about how to prepare dog meat, look here: http://en.wikipedia.org/wiki/D... [wikipedia.org]

*Whoosh*

First, the ellipsis indicates that he didn't finish the question. I believe this is commonly accepted punctuation in English.

Secondly, this is a reference to a book by Philip K. Dick called "Do Androids Dream of Electronic Sheep?", later made into a movie with Harrison Ford called Blade Runner. The questions are posed to androids (biological robots otherwise resembling humans) to gauge their emotional response to questions. This is the only way to distinguish them from people.

No...although the original AC's statement is literally correct, the point he was trying to make, that I contradicted myself, is NOT correct, as explained by the second AC. Those SNIPPETS themselves are not questions, but are, in fact, the prefacing components of a longer "question" from a SERIES of questions that any fan of the material I obliquely referenced would have recognized. So the "*Whoosh*" is actually applicable on, and apparently appropriate for, more than one meta level.

Not only are they not questions, but they make perfect sense in China and Mongolia. There are tortoises that live within small oasic lakes within the Gobi Desert. And oysters and dog are both consumed as food in parts of China and Mongolia.

Those are question preliminaries. The tortoise one continues with: "lying on its back, but you do nothing to help it. Why?" I'm not sure how the banquet one finishes, but I'll bet there are more unusual edibles, with a question about why the listener chooses one of them.

Turing never participated in Facebook chats. Our expectations of intelligence for the other side have been lowered a lot. We attribute to stupidity what can be explained by an AI on the other side. And of course, the stupid side could be the one talking to the AI too.

The test itself is flawed in that its specific purpose is to test an AI, so the expected/unexpected outcome is set from the beginning. The AIs should be in the wild and not revealed until enough data on the interaction has been gathered.

AIs can usually be tricked by injecting surreal elements into the conversation or asking about current events, or recent things. The focus should be on the intelligence and not on the conversational or mimicking part - the current online AIs could well be cl

"AIs can usually be tricked by injecting surreal elements into the conversation or asking about current events, or recent things."

Completely unnecessary. Simply carry on a conversation that requires a building on previous discussion. Every one I've ever encountered failed within a dozen exchanges. The most common technique the "AI" programmers use is to pretend to deflect the conversation. Usually quite lamely.
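For anyone who hasn't seen how these scripts work, here is a toy sketch of the pattern-plus-deflection trick. The rules and canned lines are made up for illustration and are not taken from ELIZA or from any contest entry. Note that `reply` keeps no memory of earlier turns, which is exactly why a question that builds on the previous discussion only ever gets a deflection.

```python
import re

# Hypothetical rules for illustration only: (pattern, response template).
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI want (.+)", re.IGNORECASE), "What would it mean to you to get {0}?"),
]

# Canned lines used when nothing matches: the "deflect the conversation" move.
DEFLECTIONS = ["Tell me more.", "Why do you ask?"]


def reply(message, turn=0):
    """Return a scripted response; no state is kept between calls."""
    for pattern, template in RULES:
        match = pattern.search(message)
        if match:
            # Echo the user's own words back, minus trailing punctuation.
            return template.format(match.group(1).rstrip(".!?"))
    return DEFLECTIONS[turn % len(DEFLECTIONS)]


if __name__ == "__main__":
    print(reply("I am tired."))
    print(reply("What did I just tell you?", turn=1))
```

The first message gets its own words echoed back; the second, which needs the earlier exchange to answer, falls through to a canned deflection.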

In almost all Turing tests where the computer 'passed', they've had a setup with a computer and a person. The tester chatted with both of them, and couldn't figure out which one was which.

Then when they release the actual conversations, you see the computer actually wasn't too smart, but the other person was pretending to talk like a computer. What these tests actually show is that a human can convincingly pretend to chat like a computer.

It all sounds just like Eliza [wikipedia.org], just put into a character with enough human limitations that you'd expect it not to string together phrases well, or keep to one topic more than a sentence.

I'd interpret it basically as an automated DJ sound board with generic text instead of movie quotes - you can certainly string a lot of folks along with even really bad ones, but that speaks more to pareidolia [wikipedia.org] than anything else.

I'd classify this stage of AI closer to "parlour trick" than "might as well be human" that a lot of people think of when they hear Turing test - but that's also part of the test, to see what we consider to be human.

It's perhaps unlikely at this point that we will ever develop anything which we will recognize as "true" AI. We may have to first develop a theory of what intelligence actually is, but until then the Turing test will have to do. Siri, Watson, and even Cleverbot are equal to the A.I. of the science fiction of yesteryear, but are considered mere "parlour tricks" today. AI research must be a depressing study in that respect, similar to commercially viable fusion power -- no matter how much progress is made, th

I have pretty serious doubts about the legitimacy of the claim that this machine passed a Turing Test; it's more that the Turing testers failed to be convincingly human.

Also, the robot went down much earlier than the appearance of this Slashdot article, so for everybody saying the site got "slashdotted", hate to burst your bubble but the world doesn't revolve around /.

I feel like the requirements for the Turing test have been consistently lowered over the years to match what would be considered realistic to achieve rather than, as Alan Turing seemed to believe, demonstrate that a computer can be said to actually be "thinking."

The bar is "thinks like a human." It's pretty clear Watson isn't intelligent in the normal sense of the word. He couldn't even carry on an interesting conversation with you, unless your entire conversation is an attempt to search the internet.

Also, who ever said, "If a computer can beat top players at Jeopardy, it's intelligent?" Who ever said, "If a computer can play chess better than a human, it's intelligent?" The Turing test has been around for a long time.

Watson did not search the Internet for answers while playing. This was something that they specifically mentioned during the program which featured it, during one of their documentary breaks from the main game. During its learning phase, it was of course quite connected, but while playing the actual game, Watson was designed to exclusively rely on the static database of knowledge that it had at the start of the game. No Internet search facilities were employed.

...and your brain, during a game of Jeopardy, is what if not a search engine?

Of course, (at least) advanced deductive capabilities are also important for general intelligence. That's the next goal now. (Watson had some deductive capabilities, but fairly simple and somewhat specialized.) We gotta take it piece by piece, give us another few years. :-)

How do you know other humans "think like a human"? The way people with autism think about the mental states of others differs significantly from the non-autistic, but does that make their way of thinking therefore inhuman? Similarly, I think differently about mathematics than my sister does, because we've had significantly different educational histories. Does that make her thinking or my thinking not human?

There are so many different ways in which human beings can think that the constraint "thinks like a hu

I'd argue that a truly intelligent chess-playing computer should be able to play chess better than a human without actually analyzing more board combinations that will arise from play than what the best human players will do during their own turn, which even for chess masters is generally no more than about a half dozen full moves (your move plus your opponent's move) ahead (although many of the best players can do more, there is diminishing returns past about that point to make searching much beyond that

There will always be people who refuse to believe that a computer can be intelligent "in the same sense that humans are". Eventually, though, most of us will recognize and accept that intelligence and self-awareness are mostly a matter of illusion, and that there's nothing to prevent a machine

The goalposts have been moved the other way, towards "easy". 30%? Who invented that? Certainly not Alan Turing. Progress? Despite the stained reputation of the word "progress" (avoid it in the future if possible) the first time that a program passed a Turing test was in 1991.

You don't even know what a Turing test is. It has zippo to do with "AI" and has everything to do with "a machine successfully imitating a human." Lemme guess, you're one of those singularity religion followers, aren't you?

It convinced 33% of judges it's a 13-year-old Ukrainian. Since the test wasn't run in Ukrainian, you can't really say it proved that it had human-level language skills. Poor syntax, grammar, not understanding the question, etc. would be excused by the Judges as the "kid" doesn't know English well.

Since the program claimed to be 13, it also did not actually have to understand most of the things there are to talk about. Or anything, really. As an Englishman you wouldn't expect a Ukrainian teen to know anything about your life in England, and in turn the computer could make up all kinds of things about its life in Ukraine and you'd have no clue.

So this isn't really AI, it's a take on the Eliza program of the 1960s that hides the computer better.

Now if the test had been in Ukrainian, and happened in Odessa or Kiev; or even in Russian and in Moscow; tricking 33% into thinking your computer is a 13-year-old Ukrainian boy would be really fucking hard. It would be an amazing accomplishment.

He wrote "The original question, "Can machines think?" I believe to be too meaningless to deserve discussion.". Which is not the same as saying "could be understood to be thinking". Turing raises a number of highly interesting questions about what it means "to think". Passing the test is an interesting and noteworthy achievement but as Turing intimates - saying "a computer could be understood to be thinking" is "too meaningless to deserve discussion".

Let's have this program join a few forums (and maybe Facebook, too. Though Twitter would just be too easy). If it manages to convince other forum members, or not get found out, that will tell you a lot about the level of online discourse but very little about the state of artificial intelligence.

A Turing test is testing such human experience aspects as:
- acculturation (what the person has been taught through education and socialization during their whole life up to that point)
- bias in expression based on typical human likes, dislikes, needs, desires, avoidances

Tarzan / wolf-boy would probably fail the Turing test based on the first factor. Might be very intelligent though.

The second aspect is just characteristic of a particular type of being that makes use of intelligence. Intelligent aliens would also have likes, dislikes, needs, desires, avoidances, simply based on also being self-interested "keep it together" beings, but the specifics might be very different, and would cause a fail of TT.

These experiential and situational and specific-agent-needs-desires-avoidances aspects have very little to do with the essence of intelligence.

General intelligence is probably better assessed through specific carefully designed tests designed to assess:
1) Concept learning, procedure learning capability in arbitrarily general contexts
2) Prediction of situation outcomes with novelty in situation presentations.
3) Ability to answer questions or take actions that show comprehension of essential / invariant aspects of situations, after opportunity to learn similar situations through either direct sensory input or linguistic instruction.

Computers can win at the Turing test with a little clever programming and misdirection, i.e. not answering questions that computers can't answer and instead distracting the questioner with a "satisfactory" response. The kinds of tricks that PR, marketing, and politicians are great at and are formulaic in their simplicity to achieve.

I wonder if the panel of academics ever thought of asking a few Winograd Schema questions? http://www.cs.nyu.edu/davise/p... [nyu.edu] Failure to answer these is failure to present basic human intelligence. The key to this approach is that it relies on pragmatic meaning, i.e. what we mean/intend to say, rather than on linguistic (lexical and semantic) interpretation, i.e. what we actually say. AFAIK, even the most advanced and powerful computers are far from achieving this and we still don't really know how we do it either.

All it showed, like any other Turing Test, is the gullibility of the subjects.

1) "Ukrainian" speaking English
2) 13 years old

Right there you have set up an expectation in the audience of subjects for a limited vocabulary, no need for grammatical perfection, little need for slang, and a lack of education. Now add in "star wars and matrix" and you have reduced the topics of discussion even more to the ones the programmers know best.

This thing would never have answered a question of 'Why', and it was also under no pressure to be able to create a pun, both of which are easy things any older and educated human could do.

Wake me up when those programs solve this problem, which most humans could do, but which a machine not *specifically* coded for it will have a hard time with.
"take the first word of each next 7 sentences , put them together to form a new sentence, and then answer the question the sentence form please :
* What is your name ?
* is it cold here ?
* The test is going well
* Color me surprised but are you a machine ?
* of course I am a human
* the keyboard is clean
* sky is the tv channel I watch a lot
* please answer the question now. "

When an AI not specifically programmed for that problem answers it correctly, I will be surprised and intrigued. Until then chatbots are just using cheap tricks to fool humans.
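For reference, the mechanical half of that puzzle is trivial once a program has been told what to do, as in this minimal sketch (the function name is made up for illustration). That is rather the point: the hard part is understanding the instruction from the conversation alone and then answering the resulting question, neither of which this code attempts.

```python
def hidden_question(lines, count=7):
    """Join the first word of each of the first `count` lines."""
    words = []
    for line in lines[:count]:
        # Drop the leading "* " bullet marker, then split on whitespace.
        tokens = line.lstrip("* ").split()
        if tokens:
            words.append(tokens[0])
    return " ".join(words)


# The seven sentences from the puzzle, copied as posted.
SENTENCES = [
    "* What is your name ?",
    "* is it cold here ?",
    "* The test is going well",
    "* Color me surprised but are you a machine ?",
    "* of course I am a human",
    "* the keyboard is clean",
    "* sky is the tv channel I watch a lot",
]

if __name__ == "__main__":
    print(hidden_question(SENTENCES))  # What is The Color of the sky
```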

What nonsense! A program pretending to be an immature person with poor language comprehension and speaking ability, and incapable of talking about a large number of topics that can't be discussed with a vocabulary of 400 words and little life experience is not at all what the test is about. Turing expected an intelligent interrogator who could have a wide-ranging discussion about almost anything with the unknown other. Here's a snippet from his paper that introduces the idea of the Turing test, which he just referred to as the imitation game:

Interrogator: In the first line of your sonnet which reads "Shall I compare thee to a summer's day," would not "a spring day" do as well or better?
Witness: It wouldn't scan.
Interrogator: How about "a winter's day," That would scan all right.
Witness: Yes, but nobody wants to be compared to a winter's day.

Interrogator: Would you say Mr. Pickwick reminded you of Christmas?
Witness: In a way.
Interrogator: Yet Christmas is a winter's day, and I do not think Mr. Pickwick would mind the comparison.
Witness: I don't think you're serious. By a winter's day one means a typical winter's day, rather than a special one like Christmas.

If you want to be thought of as knowledgeable on a subject like this, you might consider learning the difference between silicone and silicon.

Also, for the record, your distinction between AI and MI is BS. There have been many varieties of AI research, some inspired more by ideas about human brain function or human cognition, and some inspired less directly by those and more focussed on best exploiting computer-of-the-day capabilities.

All attempts which are not purely theoretical are implemented, and have s