Henry Lieberman: Say hello to smarter apps that fulfil your wishes

This article was taken from the November 2012
issue of Wired magazine.

How many applications do you have on your computer? I've got
159, with 34 on the sidebar. How many apps on your phone? I've got
a few 20-icon screens of apps, and after five or six more, space
will run out. Where will all the apps we haven't thought about yet
go? As with the trend of fossil-fuel consumption, screen-space
consumption is unsustainable.

What's the solution? A desktop is like a toolbox, full of
hammers and screwdrivers. It's up to me to know which tool to use
for the job, use it correctly and put it away. My toolbox shouldn't
have to contain every possible tool for every possible
job. And what happens if something goes wrong?

The alternative is what I call "goal-oriented interfaces". The
interface should be designed around what the user wants to do,
rather than what the computer wants. It should be the
responsibility of the system to figure out how to get the job done.
It should delve into the details only if it's not sure what the
user wants, or if something should fail.

One way to achieve a goal-oriented interface is through the use
of natural-language input. Perfect understanding isn't yet
possible, but things are improving. I'm writing this column using
speech recognition. As the user thinks of more things they
want to do, natural-language interfaces can scale.

Apple's Siri is the first really popular commercial
broad-spectrum natural-language interface. It was preceded by a
decade of government and academic research in AI, costing more than
$100 million. It represents a tremendous achievement, and we will
certainly see more like it. But, at present, Siri has its
limitations. It is specialised for a small set of potential tasks.
It tries to match what the user says to one of the kinds of tasks
that it knows about, and then calls a conventional phone/web
application relevant to the task. But Siri's expertise stops at the
boundary of the application. Then you're back to the conventional
interface. You can't teach Siri how to do new tasks. Siri cannot
compose applications to do a multi-step job. And Siri has little
ability to deal with situations where it has misinterpreted the
request, or where something has gone wrong.
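
To make the match-then-hand-off pattern concrete, here is a minimal
sketch in Python. It is not how Siri is actually built; the task
names, keyword lists and the functions match_task and handle are all
invented for illustration. It simply picks whichever known task shares
the most keywords with the request, and gives up when nothing matches.

```python
import re

# Hypothetical task list and keywords, invented for this illustration.
KNOWN_TASKS = {
    "weather": {"weather", "forecast", "rain"},
    "timer": {"timer", "alarm", "remind"},
    "call": {"call", "phone", "dial"},
}


def match_task(utterance):
    """Return the known task sharing the most keywords with the request, or None."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    best_task, best_score = None, 0
    for task, keywords in KNOWN_TASKS.items():
        score = len(words & keywords)  # number of keywords found in the request
        if score > best_score:
            best_task, best_score = task, score
    return best_task


def handle(utterance):
    """Hand the request to a (pretend) conventional app, or admit defeat."""
    task = match_task(utterance)
    if task is None:
        # Nothing in the fixed task list matched: the sketch has no way to
        # learn a new task or to chain several apps together.
        return "Sorry, I don't know how to do that."
    return f"Opening the {task} app"
```

A request such as "Book a flight and then a hotel" falls straight
through to the apology, because no single entry in the fixed list
covers it. That is the kind of boundary described above.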

At the MIT Media Lab,
we're working on interfaces that, like Siri, are goal-oriented and
use language. But we're interested in a broad spectrum of user
goals, open-ended and context-sensitive interfaces, and recovery if
things fail. A key ingredient is common sense. A computer will book
you a flight from Boston to London, but it doesn't know you can't
drive there. We're amassing a large common-sense knowledge base,
and using it to figure out what "makes sense" in a situation.

Another key is to bring the power of programming to the end
user. No application developer can make separate apps for
everything a user might want to do. So we're going to have to give
users the power to teach new capabilities to the computer
themselves, without using a programming language or an "app store".
Finally, no computer is going to get it right every single time. So
we have to give users the ability to criticise the computer's
behaviour, and fix it when it doesn't work. Just as a programmer
uses a debugger, we need end-user debugging tools, to make our
systems more resilient. Rather than simply filling up our screen
space until it runs out, we need to start using the renewable
resources of knowledge, language and human ingenuity.