Phone app lets the blind see through the crowd's eyes

Yasmina, a student at the University of Rochester in New York, is in the mood for some soup.

She opens her cupboard, where she knows the coconut milk she needs is sitting on the shelf among other canned goods. Instead of reaching for the right can, she hesitates. Yasmina is blind. She holds her iPhone to the open cupboard, snaps a picture of the cans, makes an audio recording of her question - "Which one is the coconut milk?" - and double-taps to send off her query.

Approximately 45 seconds later her iPhone replies in an electronic timbre: "The answer is the one on the right." "Great," Yasmina says, feeling for the rightmost can, "that's awesome."

Yasmina just used VizWiz, a new mobile phone application that provides the visually impaired with nearly real-time solutions to everyday problems. VizWiz can, for example, help the blind read their mail, coordinate their outfits, understand menus in restaurants, check expiration dates and interpret street signs. The app owes its swiftness and accuracy to a marriage of computer chips and good old-fashioned human brainpower.

Designing a computer program that can reliably recognize text and distinguish objects in the real world has proven to be a massive challenge for artificial intelligence researchers. To get around this, the researchers behind VizWiz - a team consisting of computer scientists from several universities, including the University of Rochester - decided to outsource the task of problem-solving to people: specifically, to Amazon Mechanical Turk's masses of online workers.

To make sure users get answers as quickly as possible, the researchers programmed an intelligent queuing system they call Quik Turkit to speed things up. Quik Turkit recruits Mechanical Turk workers even as a VizWiz user is taking a picture, so someone is always ready to answer an incoming query.
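
The benefit of recruiting early can be shown with a toy calculation. The sketch below is not Quik Turkit's actual code; the function name, its parameters and the timings are all invented for illustration. It assumes only what is described above: that recruiting begins while the user is still composing the question, so part of the recruiting delay has already passed by the time the question is sent.

```python
def wait_after_send(recruit_delay, compose_time, answer_time, prerecruit):
    """Seconds between sending a question and hearing an answer (toy model).

    All parameters are hypothetical; none come from the VizWiz study.
    """
    if prerecruit:
        # Recruiting started when the user began composing the question,
        # so only the part of the delay that has not yet elapsed is felt.
        remaining = max(0, recruit_delay - compose_time)
    else:
        # Recruiting starts only once the question has been sent.
        remaining = recruit_delay
    return remaining + answer_time


# Made-up timings in seconds: 30 to recruit a worker, 20 to take the
# picture and record the question, 15 for the worker to answer.
print(wait_after_send(30, 20, 15, prerecruit=False))  # 45
print(wait_after_send(30, 20, 15, prerecruit=True))   # 25
```

In this simplified model, if recruiting takes no longer than composing the question, the user waits only for the answer itself.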

Eleven blind iPhone users tested out VizWiz, asking questions like: "What denomination is this bill?", "Do you see picnic tables across the parking lot?", and "What temperature is my oven set to?"

They received an average of three responses per query and waited an average of 133.3 seconds for the first answer. The first answer received was accurate or helpful in 71 of 82 cases. By the third answer, all questions were correctly answered.

In a second test, the volunteers got to use VizWiz 2.0, which includes improved image processing techniques. Their response time was cut to an average of 27 seconds.

Most of the volunteers were excited about VizWiz and said they would pay for the service. VizWiz could be "very useful," said one participant, "because I get so frustrated when I need sighted help and no one is there."

3 Comments

Wonderful "innovation of care" taking place here. This could be connected to a variety of communities with "caring eyes," such as senior citizens who would like to take on the assignment of participating for a few hours a day.

Steve
on May 12, 2011 11:09 AM

Why not just read the bar code on the soup can with one of the many bar code reading apps available?

I'm not gonna lie, I'm actually impressed. It's rare for me to find something on the net that's as entertaining and intriguing as what you've got here. Your page is sweet, your graphics are wonderful, and what's more, you use videos which are relevant to what you're saying. You're undoubtedly 1 in a million, man!