A huggable, talking intelligent toy robot? Oh yeah!

Supertoy Robotics Ltd. has created Supertoy, aka “Teddy” — a cute, talkative robot teddy bear with a New-York-comedian attitude and a surprisingly realistic voice — and just started a Kickstarter campaign for it.

As you can see in the video, he converses fluently and naturally. He should be a huge hit with kids and adults alike.

As noted, to ramp up production, Supertoy Robotics is seeking crowdfunding on Kickstarter. A pledge of £39 (about $60) will get you one of the first bears about a month before they ship in December.

Supertoy Robotics’ Ashley Conlan, Karsten Flügge, and friend

According to Ashley Conlan, CEO of England-based Supertoy Robotics, Supertoy can speak 30 languages out of the box, and because it gets its smarts in real time from the web, the company can easily update and upgrade it, he explained to me.

The magic is in the software

The Supertoy technology is Siri-like, but better, because it’s not just a Q&A but a continuous conversation; and Teddy will also remember past discussions. It has “evolved” from Supertoy Robotics’ popular Jeannie chatbot (for iPhone, iPad, and Android) — but is far more advanced.

“The magic is in the software, not the robot,” Conlan explained to me. Assuming you have a smartphone data plan or WiFi, the server detects your language and converts your speech to text. Your question or comment then goes to the Supertoy Robotics server, which generates a natural-language audio reply that goes (via the smartphone or other device) back to Teddy’s speaker.
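
The round trip Conlan describes can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names (`detect_language`, `speech_to_text`, `generate_reply`, `text_to_speech`) are hypothetical stand-ins, not Supertoy Robotics' actual software, and every stage is stubbed so the flow can be followed end to end.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """Audio captured by the toy's microphone (stubbed here as raw bytes)."""
    audio: bytes


def detect_language(utterance: Utterance) -> str:
    # Hypothetical stub: a real system would classify the audio itself.
    return "en"


def speech_to_text(utterance: Utterance, language: str) -> str:
    # Hypothetical stub: pretend the audio bytes are already UTF-8 text.
    return utterance.audio.decode("utf-8")


def generate_reply(text: str, language: str) -> str:
    # Hypothetical stub for the server-side conversation engine.
    return f"You said: {text}"


def text_to_speech(reply: str, language: str) -> bytes:
    # Hypothetical stub: a real system would return synthesized audio.
    return reply.encode("utf-8")


def round_trip(utterance: Utterance) -> bytes:
    """Toy microphone -> phone -> server -> phone -> toy speaker."""
    language = detect_language(utterance)
    text = speech_to_text(utterance, language)
    reply = generate_reply(text, language)
    return text_to_speech(reply, language)
```

In this sketch the bear itself does nothing but capture and play audio, which matches the claim that the work happens in software rather than in the robot.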

Teddy “evolved” from Jeannie, a popular Siri-like app for iOS (iPhone, etc.) and Android with more than 3 million downloads, said Karsten Flügge, CEO of Pannous, which created the app. You can get a rough idea of how Teddy works in this impressive demo of Jeannie (formerly Voice Actions), but Teddy is even more realistic. (I’ve been “talking” to Jeannie for a few days on my iPad, and on an Android Nexus 7, where she’s more developed, and I’m impressed.)

Siri on steroids

As with Jeannie, you can ask questions or engage in conversation, and actually learn things. You can also ask it to make calls for you, give you a wake-up call, find the nearest restaurant, launch apps, and do other Siri-like stuff, and it “safely searches the web,” said Conlan.

“Supertoy is fun and playful for children and can also be their first introduction to the world of the Internet,” said Conlan. “It is designed to be used in a safe and child-friendly manner, so only age-appropriate information is shared.

“This allows children to learn from the vast amounts of educational material online, without parents having to worry. Being connected to the Internet also means Supertoy upgrades over time and becomes better and better as it learns. It reads bedtime stories, sings songs, and answers all those pesky ‘why’ questions.”

Supertoy Robotics also announced a new role-playing feature on Sunday. For example, a child says, “Teddy, let’s play cowboys.” Teddy role-plays as a cowboy, saying things like “howdy partner,” and when the child has finished playing, he or she says, “Teddy, stop playing cowboy.” Teddy outfits (cowboy, wizard, marine, footballer, fisherman, etc.) will also be available.
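
As a rough illustration, the start and stop commands could be handled by a small piece of state like the one below. The class name `RolePlay`, the `ROLE_GREETINGS` table, and the stop reply are all hypothetical; only the two spoken commands and “howdy partner” come from the announcement.

```python
from typing import Optional

# Hypothetical role greetings; only "howdy partner" comes from the article.
ROLE_GREETINGS = {"cowboy": "Howdy, partner!"}


class RolePlay:
    """Tracks which role, if any, the toy is currently playing."""

    def __init__(self) -> None:
        self.role: Optional[str] = None

    def handle(self, text: str) -> Optional[str]:
        """Return a reply if the text is a role-play command, else None."""
        # Normalise case, curly apostrophes, and trailing punctuation.
        words = text.lower().replace("’", "'").rstrip(".!?").split()
        if words[:3] == ["teddy,", "let's", "play"] and len(words) == 4:
            role = words[3].rstrip("s")  # "cowboys" -> "cowboy"
            if role in ROLE_GREETINGS:
                self.role = role
                return ROLE_GREETINGS[role]
        elif words[:3] == ["teddy,", "stop", "playing"]:
            self.role = None
            return "Okay, I'm just Teddy again."  # hypothetical reply
        return None
```

Anything that is not a recognised command returns `None`, so ordinary conversation would fall through to whatever handles normal replies.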

However, “Supertoy is not scripted,” said Conlan. “I don’t know what he is going to say for sure.” (It does have a blacklist for NSFW words though.) “At the end of a demo, when I said ‘Bye Now,’ Supertoy responded, ‘Will that accomplish your objective?’ That was totally a surprise.”
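
A word blacklist of the kind mentioned here is usually a simple check applied to each reply before it is spoken. The sketch below is an assumption about how such a filter might look, not Supertoy's implementation; the entries in `BLACKLIST` and the fallback sentence are placeholders.

```python
import re

# Placeholder entries; the real blacklist is not published.
BLACKLIST = {"darn", "heck"}


def is_safe(reply: str) -> bool:
    """True if no blacklisted word appears in the reply."""
    words = re.findall(r"[a-z']+", reply.lower())
    return not any(word in BLACKLIST for word in words)


def filter_reply(reply: str, fallback: str = "Let's talk about something else.") -> str:
    """Pass safe replies through; swap unsafe ones for a neutral fallback."""
    return reply if is_safe(reply) else fallback
```

A filter like this only blocks listed words. It does nothing to make an unscripted reply sensible, which is consistent with Conlan not knowing in advance what the bear will say.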

Lovely educational toy. It’s also a microphone connected to a remote network. The device creators have full access to any data that is transmitted to their servers. The system is also prone to being hacked by a third party. In addition, considering what we know about the NSA and Skype and Verizon, etc., this device could be used to snoop on (and communicate with) children or anyone within its vicinity. In the video the “toy” says… “soon I will be able to see and walk around”. So now you have a mobile camera that, thanks to “natural language,” will befriend unsuspecting children and could end up anywhere from the child’s bed to the bathroom.

I think this is how real innovation and improvement to technology actually happens. Imagine how much great feedback comes into a KS project like this one, or just read through the comments here below and you get some cool ideas and important issues to focus on. What I think is different/cool about this project is that it combines the basic idea of Teddy with the phone or an active database. I think it may be silly to expect it to pass a Turing test for an adult, but for a small child it sure might. If you imagine that you can turn features on and off, like playing cowboy or looking up useful facts on the internet, then it suddenly is quite interactive and creates an amazing experience not offered by any toy in the past. Imagine next that kids can rank their joy with the toy, so that Teddy starts saving actual interactions… after a while there is a database of what questions get asked and what conversations are important, and then Teddy can learn, or the DB at least can be hyper-focused to provide better, more realistic conversations. As you type in Google it autofills to help you search or even help you spell… imagine that Teddy could do this on the fly if the access to the phone/DB was fast enough or pattern recognition was good enough. Just saying… no evolution towards a Turing test is made in one jump… this is clearly a stepping-stone moment in that process… and it is cooler that it is funded by anyone who wants to be part of something special… NOT your tax dollars or some military funding thing that always gets stuck with messed-up agendas… Just a toy that breaks the old limits using technology in clever new ways.

Maybe we have to crawl before we walk… maybe we could still learn a few things by watching children play…. I found these 3 quotes and thought they are fun to share here.

“Play is the highest form of research.” – Albert Einstein

“Play is often talked about as if it were a relief from serious learning. But for children play is serious learning. Play is really the work of childhood.” – Fred Rogers

“We don’t stop playing because we grow old; we grow old because we stop playing.” – George Bernard Shaw

The issue raised here was whether or not the toy will be capable of carrying on the kind of Turing-Test-passing conversation portrayed in the promotional video. If not, then the video is misleading and since that video is being used to raise money, it raises an ethical question.

Any quotes about “play” or arguments about how children can enjoy a toy which does not match the capabilities displayed in the video are side-stepping the real issue.

I’m also pretty skeptical about this video. It seems that if the technology existed as portrayed in the video, then a company with the funding of a Google or a Microsoft or an Apple would be the one to introduce it.

Maybe if we saw random people in the street talking to the bear it would be more convincing, but as it is, it looks pretty staged. There’s a thin line sometimes between making your product look appealing and engaging in deceptive advertising.

A real AGI is not just a talk-bot. It also needs a body (which Boston Dynamics may prove better at making).
I also think Ben Goertzel’s “psynet” approach may take too long (if ever) to achieve AGI.
Biologically Inspired Cognitive Architecture (BICA) is still the best way to go.

Sarah, the voice you hear in the video is the voice that will be created as a full-blown custom voice and commercially released. He will sound like that. As regards the AI, it can be done; it’s one of the reasons we went on Kickstarter to raise the money to complete development. The software is currently still in an early prototype stage, not even beta, so please understand I am not going to say on this forum or anywhere else HOW we are going to do it. This is a business at the end of the day, not a university. As regards your comment, Mr. Nelson Ford, please tell me your background. Academic?

Karsten has already acknowledged that the final product will not be capable of “truly AI complete” conversation but that instead, it will give “often nonsensical random responses.” That’s all I’ve said. Well, that and my opinion that the promotional video is misleading (at best) when it shows the toy engaged in what I think anyone would believe is “truly AI complete” conversation (not a single one of the “nonsensical random responses” appears in the video).

But for the record (and without revealing any “trade secrets”), are you saying that his comments are incorrect and that it will be capable of the kind of conversation in the video?

As to my background: I have worked on Natural Language Processing for over 15 years. But since you stated my full name, I would have assumed that you had clicked on my user name (to the left of the message) and gone to my AI web site and already knew that, so why ask?

It has amazed me that for 45-plus years “computer speech” has not progressed to sounding like human speech, except to ‘string together’ human spoken sounds/words to improve ‘computer speech’ up until now. I was 10 – 12 or so when my dad introduced me to a husband/wife team of linguists studying the human vocal tract to duplicate it for ‘machine speech’, and ever since then ‘computer speech’ has always sounded ‘robotic’ regardless of the passage of years. Granted, there is some ‘text reading software’ that sounds very good, but there is No Mistaking It for human speech. It is still far, far away from going over that ‘uncanny cliff’! I suppose it may be 50 years before ‘Teddy’ of Spielberg’s movie “AI” shows up on the scene… Give something ‘enough time’ and eventually it MAY happen even if it takes a few thousand years, but as long as you can put it far enough ahead to where people forget the promises and the failures (like ‘modern day’ “science”) then your funding should remain intact until the next round of Govt ineptitude rolls around. Good Luck!

IMHO, the real stumbling block in a truly natural sounding voice is a biggie: true understanding. The problem is, prosody and intonation depend on the meaning of what is being said.

Take the classic example “What are you doing?” vs. “What are you doing?” (the same words, stressed differently). The proper intonation depends on the context in which it is being said. This requires an almost Turing-level understanding of the conversation.

Therefore, while I believe voice synthesis will improve, I don’t believe it will be indistinguishable from human speech until the Turing test is passed.

In the toy demonstration in the video, I have to conclude that these are “canned” statements that the bear is saying. That is, each phrase has simply been pre-recorded by a person in advance. This obviously drastically limits the number of things it can say.

If the voice is real, it is exceptional. It sounds very very natural. That alone is worth more than incorporating into a toy. It is better than any other solution in existence as far as I know.

My nephew is slightly speech delayed. It seems like this toy could be really helpful with his speech development. But honestly, I am very skeptical this is real. I know your AI is not that amazing but instead has been advertised very selectively. However, the voice is astonishing, kudos. It doesn’t make any sense to sell such a great voice technology in a toy.

It would appear that the voice for the video was “staged” somewhat and that the real responses are not as clear and do not necessarily make sense.
The teddy is also only 10″ tall, which means that it is all head and very little body, in fact just big enough to get a phone into!
I was one of Supertoy’s biggest fans until I read that the voice and responses are not as portrayed in the video. I have unfortunately pulled my pledge as I am not quite sure this is all it is cracked up to be. If it is then my loss, if it isn’t …well

Why would he? Not to mention that in the film AI, teddy bear 3000 was around long before Ted was. Supertoy is going to be class, and I think fair play to these guys for coming up with a great idea. Let’s hope the project is backed!

@nfordkrz you’re right that we should probably distinguish between “normal conversation” and “fluent conversation”, the first one being truly AI complete, the second one still being a novelty, and kids _love_ novelties.
Astonishingly, even adults often seem to extract great joy from seemingly flat conversations, presumably _because_ of the often nonsensical random responses. And we have seen kids and adults get really excited once they realize that they are not just talking to a call&response Eliza/Siri, but to a system which already has _some_ unexpected deepness built in. We are not talking about a plethora of jokes and puns, but about a huge amount of AI experiments built in, which can lead to hours of discovery.
Just try to approach it playfully, not expecting the Singularity to happen just yet.

I’m excited! (Somewhat!) When you get “Teddy Class AI” as in Steven Spielberg’s movie AI, talking, walking, seeing, etc., Sign Me Up! But until then this first iteration might make a nice toy for a 3 1/2 to 4 year old. But the dialog will have to be as in the video; if not, I promise there will be a lot of disappointed people, and that may be the cause of the ultimate failure of the project… Let’s hope not!

Well, you’re going to have a LONG wait to have anything resembling a normal conversation. That is the holy grail of Artificial Intelligence and it is highly unlikely that anything like Teddy is going to crack it anytime soon.

I got the Jeannie app mentioned in the article and here is a typical exchange:

Me: “It looks like it may rain today.”
J: “Interesting comparison.” (J pronounced it “com-par-shun”.)
Me: “How much water do daffodils need?”
J: “Perhaps an hour or two. How much time do you have?”
Me: “What do you think about wearing plaid with stripes?”
J: “Yes, I think about that often.”

The voice is read back by the actual app, driven by matching dialogue patterns. We are capable of modifying emotions in the TTS and we can utter special vocal expressions. The TTS is based on the same voice you hear in the video. Naturally, we cannot disclose how we do all this until we release the consumer product.

“We” who? You must be decades ahead of what IBM has been able to do. Or Google. Or ConceptNet or OpenCog or any of the other big groups working on Natural Language Processing (and spending millions on it).

Anyone who can get a toy to carry on a truly random conversation (such as the one portrayed in the video) could sell/license it and make hundreds of millions, if not billions. The use of it in a toy would be almost infinitely out of proportion to its true value.

While I’m at it, I also find it disturbing that the name and look of the stuffed animal is so similar to that in the movie “Ted”. Someone might think it was done to make people associate this one to the capabilities of the one in the movie which, of course, was just CGI.

It’s not a question of the voice but of an AI being able to maintain a normal conversation as you indicated in the original post. That is simply not going to be possible, certainly not when only “driven by matching dialogue patterns”. The examples I posted are typical of such efforts.

Even IBM’s Watson cannot hold a normal conversation, and the amount of computer hardware to support it is staggering. Both it and Siri are designed to take a single line of input, analyze it to figure out what’s being asked, and look up an answer. This is not a dialogue, much less a conversation.
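
To make the point concrete, here is a minimal Eliza-style sketch of a single-line pattern matcher. The rules and replies are made up for illustration, and nothing here is taken from Jeannie, Siri, Watson, or Supertoy. Each line is handled in isolation, with no memory of earlier lines and no model of what the words mean.

```python
import re

# Hypothetical (pattern, reply) rules, checked in order.
RULES = [
    (re.compile(r"\bhow much\b", re.IGNORECASE), "Perhaps an hour or two."),
    (re.compile(r"\bwhat do you think about\b", re.IGNORECASE),
     "Yes, I think about that often."),
]
DEFAULT_REPLY = "Interesting."


def reply(line: str) -> str:
    """Answer one input line using the first rule whose pattern matches."""
    for pattern, response in RULES:
        if pattern.search(line):
            return response
    return DEFAULT_REPLY
```

With rules this shallow, “How much water do daffodils need?” matches the “how much” rule and gets an answer about time, which is the kind of mismatch a surface pattern match cannot avoid.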

I find it disturbing that the promotional video was made to look like a child was having a random conversation when I’m sure it was scripted in advance. I feel sorry for any child getting this toy who expects to have a true conversation such as the one in the video and instead gets the kind of responses I showed in my other post.

I think it would have been far better to show an honest video showing the Supertoy in a “real” environment and having the kind of conversation that you are saying is possible. I thought (and I feel a little stupid now) that the video was a true representation of what the teddy would do.
I still do not know what the real experience would be like but you are now saying to approach it playfully and to expect nonsensical responses. Therefore it is not what you have portrayed in your campaign and a 10″ bear phone holder that plays a phone app and makes ridiculous comments to questions is not worth what you are asking and will create many unhappy customers.
When I asked you the size of the bear, you replied 25cm but as you did not have a prototype left, you were not sure. You then came back and said it may be closer to 35cm. I am getting all the wrong vibes now… a creator of a toy such as this would know the dimensions off by heart. It all seems a little tarnished right now.
Sorry

Sarah, What’s wrong with you? Don’t you realize that Teddy is not a general AI and can’t converse with physicists or chemists? He won’t even sound 100% human and of course, his speech, conversation, intonation and fluency will never, ever improve. We all know that introductory objects like transponders, computers, phones and TVs remain the same.