Qbo May Be Short, but Its Head Is in the Cloud

For robot entrepreneur Francisco Paz, a Campus Party held last July was all about pineapples, penguins, and the cloud.

At the week-long technology fest in Valencia, Spain, Paz mounted the stage with his company’s flagship creation, the Qbo robot. Before a large audience, he showed how Qbo was able to leverage the cloud to achieve greater intelligence.

First, Paz showed the robot a photo of a pineapple and asked, “What is this?” The robot scanned the photo for a few moments and replied, “I think I got it. I think it is a pineapple.”

Then Paz video-Skyped an associate with a second Qbo at his labs in Madrid. Paz and the engineer showed both robots photos of a Linux Tux penguin. Neither had seen the picture before. They could not identify it.

While Paz waited, the engineer in Madrid told his Qbo, “This is a penguin.” The robot inspected the picture for a few moments, then said it was ready to recognize a penguin. The engineer then told his Qbo to upload the object to the cloud.

In Valencia, Paz asked his Qbo to download new objects. When Paz showed his Qbo the picture again, it recognized it as a penguin.

The demonstration highlighted how the cloud will transform any individual piece of information into a knowledge upgrade that all devices joined to the network are able to share. What works for object recognition will also work for language recognition, speech synthesis, simulated conversations, and ultimately artificial intelligence (AI).
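The teach, upload, and download exchange from the demonstration can be sketched in a few lines of Python. This is a toy illustration only: the `CloudLibrary` and `Robot` classes, their method names, and the numeric "descriptor" standing in for an image are all hypothetical and do not represent Thecorpora's actual software.

```python
class CloudLibrary:
    """Hypothetical shared store mapping object labels to descriptors."""

    def __init__(self):
        self._objects = {}

    def upload(self, label, descriptor):
        self._objects[label] = list(descriptor)

    def download(self):
        # Return copies so a robot cannot mutate the shared store by accident.
        return {label: list(d) for label, d in self._objects.items()}


class Robot:
    """Hypothetical robot that knows only what it was taught or downloaded."""

    def __init__(self, name, cloud):
        self.name = name
        self.cloud = cloud
        self.known = {}  # label -> descriptor

    def learn(self, label, descriptor):
        self.known[label] = list(descriptor)

    def upload_objects(self):
        for label, descriptor in self.known.items():
            self.cloud.upload(label, descriptor)

    def download_objects(self):
        self.known.update(self.cloud.download())

    def identify(self, descriptor):
        # Exact match keeps the sketch short; a real system would compare
        # descriptors by similarity rather than equality.
        for label, known in self.known.items():
            if known == list(descriptor):
                return label
        return None


cloud = CloudLibrary()
madrid = Robot("madrid", cloud)
valencia = Robot("valencia", cloud)

penguin = [0.9, 0.1, 0.4]  # made-up descriptor standing in for the photo

print(valencia.identify(penguin))  # None: nobody has taught it yet
madrid.learn("penguin", penguin)   # "This is a penguin."
madrid.upload_objects()
valencia.download_objects()
print(valencia.identify(penguin))  # penguin
```

The point of the sketch is that only one robot is ever taught; the other gains the knowledge purely by synchronizing with the shared library.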

AI is on Paz’s mind as his company, Thecorpora, gets ready to launch its Qbo robots in the months ahead. “The main goal is to improve the artificial intelligence that exists in robots right now,” Paz says.

Cloud-based robots are beginning to make news. James Kuffner, a former Carnegie Mellon professor who jumped to Google to help develop its autonomous car (which relied on cloud-based Google Maps for navigation), is perhaps the best-known proponent of using the cloud to create a kind of collective knowledge base for connected devices. This is similar to how Siri, the personal assistant app on Apple’s latest iPhone, draws upon information contributed by both people and devices.

Kuffner believes robots are especially suited to using the cloud to store information and also to do such computationally intensive processing as object recognition.

Moreover, he was among the first to foresee how cloud computing could support app stores that would give robots new capabilities. (See the Robotics Trends article, “Robotics App Store Launched.”)

Paz is thinking along those lines too. And he plans to make a cloud-connected robot that costs under $2,000.

Entry-level Qbo

Right out of the box, Qbo will provide some nifty AI functionality.

With a compact black and white body, no arms, and wheels instead of legs, Qbo looks a bit like a penguin. Its moveable head with two large webcam eyes (and eyelids) completes the illusion. The lack of arms and legs greatly simplifies design and slashes costs.

The robot stands 18 inches tall and 12 inches round, and weighs 20 to 24 pounds. One servo drives two fixed rear wheels. A second drives the front freewheel, which directs its motion. Like a Roomba, it rolls to its docking station to recharge.

What really separates Qbo from a simple motorized toy are its sensors. To complement the robot’s two webcam eyes, which support stereoscopic vision and 3D object recognition, the head has one unidirectional and two omnidirectional microphones. The body has four ultrasonic and three infrared sensors, as well as a Sharp triangulating infrared rangefinder. Connectivity is through Wi-Fi and Bluetooth.

Qbo uses its sensors to orient itself by a process known as simultaneous localization and mapping (SLAM). After mapping an environment, the robot uses its infrared rangefinder, webcams, and a magnetic encoder that tracks wheel movement to pinpoint where it is at any given moment.
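The wheel-encoder part of that process, often called dead reckoning, can be illustrated with a short sketch. It is not Qbo's SLAM code: the encoder resolution, wheel diameter, and function names below are assumptions chosen for illustration, and the motion model is a generic one. On its own, dead reckoning drifts over time, which is why a robot also corrects its estimate against a map using its rangefinder and cameras.

```python
import math

# Hypothetical hardware numbers, chosen only for illustration.
TICKS_PER_REV = 360       # encoder ticks per wheel revolution (assumed)
WHEEL_DIAMETER_M = 0.10   # wheel diameter in meters (assumed)


def ticks_to_distance(ticks):
    """Convert encoder ticks into distance rolled, in meters."""
    return ticks / TICKS_PER_REV * math.pi * WHEEL_DIAMETER_M


def update_pose(x, y, heading, distance, turn):
    """Advance a 2D pose by one short move.

    heading and turn are in radians; distance is in meters.
    The move is approximated as a straight segment along the
    heading halfway through the turn.
    """
    mid_heading = heading + turn / 2.0
    new_x = x + distance * math.cos(mid_heading)
    new_y = y + distance * math.sin(mid_heading)
    new_heading = (heading + turn) % (2.0 * math.pi)
    return new_x, new_y, new_heading


pose = (0.0, 0.0, 0.0)                                   # x, y, heading
pose = update_pose(*pose, ticks_to_distance(360), 0.0)   # one wheel turn forward
pose = update_pose(*pose, 0.0, math.pi / 2)              # rotate 90 degrees left
pose = update_pose(*pose, ticks_to_distance(360), 0.0)   # one wheel turn forward
print(pose)
```

After these three steps the estimated position is roughly 0.31 m along each axis, with the robot facing 90 degrees from where it started.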

According to Paz, Qbo’s base set of AI features enables it to interact physically with the world. “It can do interesting and useful things,” he says.

For example, functioning as a kind of personal assistant, Qbo can retrieve and read emails. It plays music, and recognizes hand signals commanding it to select songs and change the volume. It looks up articles on Wikipedia and reads user-selected parts aloud. It recognizes faces and objects, and even conducts seemingly intelligent conversations.

It also provides telepresence. Users can call Qbo on their smartphone and see and hear what the robot sees and hears. “You can move around your house to check something or have a videoconference with someone,” says Arturo Bajuelos, a Thecorpora engineer.

“We believe these base applications will interest people in buying a robot and in starting a community that we believe will be very big,” he adds.

Community

Qbo will leverage its user community to evolve and grow, Bajuelos explains. “This is the concept we want to bring to robots. They are not static machines, but something that is evolving,” he states.

Thecorpora expects that 95 percent of Qbo users will not care about Qbo’s internal workings. They will interact with Qbo naturally, asking questions and receiving verbal replies. And every object, word, and gesture they teach their Qbo will become part of the cloud-based library for all Qbos.

The remaining 5 percent of users will want to poke around. Thecorpora makes this easy by using standard hardware and open source software for many of its core functions. “You can change a board, a camera, or anything else. It’s a very low-cost platform, and you can upgrade it the way you like,” Bajuelos says.

The software side is even more flexible. Active users will be able to access Qbo’s databases to improve its information processing ability. For example, users might find ways to develop descriptors and algorithms to recognize faces or common household goods.

“This is the real power of the community,” Paz states. “A lot of people are capable of designing more sophisticated algorithms,” which the robot can then utilize.

Meanwhile, Thecorpora engineers plan to use an AI technique known as machine learning to improve Qbo’s object and face recognition algorithms. The software will automatically look for similarities and differences between different types of images in order to find better (and often less obvious) ways for Qbo to identify them.

“What we have now is good enough to distinguish a pineapple or a penguin, but that is not close to what is possible. We’re open to developing more sophisticated algorithms, and then downloading them into Qbo,” Bajuelos says.

Because of Qbo’s relatively low cost (under $2,000, compared with tens or even hundreds of thousands of dollars for similarly intelligent models), Paz believes his community will grow to thousands of users. Such a large community will rapidly expand the robot’s knowledge and draw algorithm and application developers.

“Every day, Qbo will become more and more intelligent,” Paz says.

Open source

To achieve that level of intelligence, Qbo relies on several open source platforms. The underlying operating system is a custom distribution of Ubuntu, a variant of Linux. It uses Willow Garage’s ROS (Robot Operating System) tools and libraries for actual robot applications.

For natural language processing, the robot uses Julius, a Japanese open source large vocabulary continuous speech recognition (LVCSR) decoder. Julius works by using a machine learning system to create statistical representations of word sounds (phonemes). The more sound files it can draw from, the more accurate it becomes. Its latest release decodes dictation using a 60,000-word vocabulary at almost real-time speeds.

Unfortunately, that’s only in Japanese. In English, Qbo accurately recognizes only a limited vocabulary. “In Japanese, they have a perfect acoustic model. It took about 2,000 hours of audio. In English, we have about 50 hours. If our community is big enough, we can create a better model very quickly,” Bajuelos says.

Object and face recognition use OpenCV, a library of real-time vision programming functions created by Intel and now supported by Willow Garage and integrated with its ROS. The visual equivalent of Julius, it extracts descriptors from objects and uses machine learning to classify them.
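The “extract descriptors, then classify” idea can be illustrated without any vision library at all. The sketch below is not OpenCV code and not Qbo’s algorithm: it assumes each image has already been reduced to a short list of numbers (the made-up descriptors here), and it swaps in a plain nearest-neighbor rule with a rejection threshold as the classifier. The function name and all values are hypothetical.

```python
import math


def classify(descriptor, labeled_examples, max_distance):
    """Return the label of the closest stored example.

    Returns None when even the closest example is farther away than
    max_distance, meaning the object is treated as unknown.
    """
    best_label, best_dist = None, math.inf
    for label, example in labeled_examples:
        dist = math.dist(descriptor, example)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= max_distance else None


# Made-up three-number descriptors standing in for real image features.
examples = [
    ("pineapple", [0.8, 0.7, 0.1]),
    ("penguin", [0.1, 0.2, 0.9]),
]

print(classify([0.75, 0.65, 0.15], examples, max_distance=0.5))  # pineapple
print(classify([0.5, 0.5, 0.5], examples, max_distance=0.5))     # None
```

The threshold matters: without it, the classifier would confidently assign every unfamiliar object to whichever known object it happens to resemble most.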

Qbo looks at objects as a human baby might, moving and tilting its head to see different orientations. It uses its stereoscopic vision to remove the background, then tries to classify the object. The same algorithm recognizes hand signals used to control the music player.
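Stereoscopic background removal rests on a simple geometric relation: the farther an object is, the less its position shifts between the left and right camera images. For an idealized, calibrated camera pair, depth is approximately the focal length times the distance between the cameras, divided by that shift (the disparity). The sketch below applies the relation to a tiny grid of disparities; the focal length, camera spacing, depth cutoff, and function names are assumptions for illustration, not Qbo’s actual parameters or code.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters for an idealized, rectified stereo pair."""
    if disparity_px <= 0:
        # No measurable shift: treat the point as infinitely far away.
        return float("inf")
    return focal_px * baseline_m / disparity_px


def foreground_mask(disparity_map, focal_px, baseline_m, max_depth_m):
    """Mark pixels closer than max_depth_m as foreground (True)."""
    return [
        [depth_from_disparity(d, focal_px, baseline_m) <= max_depth_m for d in row]
        for row in disparity_map
    ]


# Hypothetical camera parameters and a 2x2 disparity map, in pixels.
FOCAL_PX = 500.0     # assumed focal length in pixels
BASELINE_M = 0.1     # assumed spacing between the two cameras

disparities = [
    [100.0, 10.0],   # about 0.5 m away, about 5 m away
    [0.0, 80.0],     # no match (far away), about 0.6 m away
]

mask = foreground_mask(disparities, FOCAL_PX, BASELINE_M, max_depth_m=1.0)
print(mask)  # [[True, False], [False, True]]
```

Only the pixels marked True would be passed on to the object classifier; everything farther than the cutoff is discarded as background.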

Facial recognition is more difficult. Qbo often confuses a new person with an image in its database. “There are many ways to improve this,” Bajuelos says. He points to Google’s Picasa and Facebook’s face recognition program as examples, but neither is open source.

Down to business

Paz used the money he earned from a previous venture to finance Thecorpora, which he owns jointly with his brother. The company received a zero-interest loan to bring Qbo to production. “We are in the final stages of commercialization. We have finished the base applications that will come with Qbo, we’ve received the robot molds, and we are applying for CE and FCC certification,” Bajuelos says.

Although Thecorpora has no formal partnerships with any company, Paz has been keeping Willow Garage updated on progress and plans a visit there soon.

He intends to sell his robots on his website and through several big Internet shops around the world (he cannot reveal which ones). He does not yet have a specific target market in mind. “No one has done this before. We read a lot of robot blogs every day, and we haven’t found anything about a robot similar to ours,” Bajuelos says.

“There are two types of robots: very expensive and technically evolved, and cheap robots that cannot evolve and are very limited. We’re trying to build something in the middle. They are relatively cheap, but have the potential to evolve into superior artificial beings,” Paz adds.

“There are lots of people interested in having a robot, but they don’t know how to work with them. They dream of having a robot that does something in the house, their own R2D2. Qbo can be the first. It really does something, and it really appears intelligent,” Paz concludes.