Artificial intelligence seeks natural interfaces

SEATTLE - Artificial intelligence is the "unrecognized" driver behind a variety of Microsoft Corp. research projects slated for future products, Bill Gates revealed this week.

Microsoft's founder and chief software architect demonstrated prototypes of the company's artificial intelligence (AI) products during his keynote speech at the International Joint Conference on Artificial Intelligence (IJCAI), held this week in Seattle.

"Often what begins as artificial intelligence research is not recognized as AI after it has been integrated into a software product . . . text-to-speech is an example," said Gates.

The AI Gates demonstrated at IJCAI was of this "unrecognized" variety. That is, the AI was integrated into prototypes that solved real user problems, and thus will be judged by users for their problem-solving ability, rather than as AI.

For example, one demo showed a Web cam watching its own user so that it could, for instance, silence incoming e-mail "dings" when the user is in a face-to-face conversation. Even though AI is used to recognize a face-to-face conversation, the user will perceive it as a product that keeps the e-mail alert from interrupting conversations, and not as an AI product.

"AI is helping us create more natural user interfaces . . . we need future software to listen, see, reason and understand the user's context, intentions and goals," said Gates.

Consequently, personalization through automatic preference selection was a recurring theme in Gates' demos of future user interfaces. Speech recognition and natural-language processing were prominent too, along with visual object recognition, machine learning and automatic reasoning algorithms. The applications ranged from smart Web searching and e-mail sorting to data mining and "continuous computation."

One of the most interesting demos was of Mind, a multimodal speech recognition interface running on a handheld PC wirelessly connected to a central server. According to Gates, speech recognition is of limited utility today because it only substitutes speech for typing. Solving the user's problem, however, involves many more mouse clicks and menu selections than typed characters.

Speech demo

For instance, a user who wants to send e-mail to Bill Gates must click around to open the e-mail program, click around to create a new message, click around to pull Bill's address from the address book, click to put the cursor into the "body" field and finally type the text. A speech-to-text recognition engine helps only with the last step. Even with a speech-to-command engine that accepts spoken commands such as "open e-mail program," it is faster today just to click the mouse.

However, Microsoft's multimodal speech interface demo showed how it could be easier to speak than to type by switching contexts to the correct program and menu items automatically.

For instance, speaking "send Bill e-mail" caused the user interface to automatically open the e-mail program, look up Bill's address and fill it in. It also automatically jumped the cursor to the correct field when speaking the "body" of the message, and it switched the context to other programs, like the spreadsheet, when asked something relevant.
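The context switching described above can be illustrated with a minimal Python sketch. It is not Microsoft's implementation: the `UiState` class, the `handle_utterance` function, the single command pattern and the address book entry are all hypothetical, chosen only to show how one spoken phrase could replace several clicks.

```python
import re
from dataclasses import dataclass, field

# Hypothetical address book; the article does not describe the real data source.
ADDRESS_BOOK = {"bill": "bill@example.com"}


@dataclass
class UiState:
    """Toy model of which program and field currently have focus."""
    active_program: str = "desktop"
    focused_field: str = ""
    fields: dict = field(default_factory=dict)


def handle_utterance(state: UiState, utterance: str) -> UiState:
    """Treat an utterance as a command if it matches, otherwise as dictation."""
    match = re.fullmatch(r"send (\w+) e-mail", utterance.strip().lower())
    if match:
        # One phrase opens the mail program, fills in the address
        # and moves the cursor to the message body.
        name = match.group(1)
        state.active_program = "email"
        state.fields = {"to": ADDRESS_BOOK.get(name, ""), "body": ""}
        state.focused_field = "body"
        return state
    if state.active_program == "email" and state.focused_field == "body":
        # Anything else is dictated into the focused field.
        state.fields["body"] = (state.fields["body"] + " " + utterance).strip()
    return state
```

In this sketch, saying "send Bill e-mail" followed by a sentence of dictation leaves the mail program open with the address filled in and the sentence in the body, with no mouse clicks involved.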

The intention of the user, derived from key elements of what the user does, such as saying "send e-mail to," also figured in other demos shown by Gates and the team at Microsoft Research.

The company achieved what Microsoft Research member Eric Horvitz calls continuous computing by using the microprocessor cycles that go unused while a user pauses. Continuous computations power the AI that interprets the intention of the user, then automatically personalizes the computing experience by dynamically setting preferences.

For instance, instead of making you dig through your e-mail box for the important items, continuous computing dynamically reorders items by their importance to you, depending on the content of the e-mail, your past choices about what to read first and the current situation.
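A rough Python sketch of such reordering follows. The article gives no scoring details, so everything here is an assumption made for illustration: the `Message` class, the `importance` and `reorder` functions, the urgent-word list, the per-sender "read first" counts and the weights.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    subject: str


def importance(msg, read_first_counts, urgent_words, user_busy):
    """Combine content, past choices and current situation into one score.

    All weights are illustrative assumptions, not values from the article.
    """
    # Past choices: how often the user opened this sender's mail first.
    history = read_first_counts.get(msg.sender, 0)
    # Content: a fixed bonus if the subject contains an urgent word.
    content = 2.0 if any(w in msg.subject.lower() for w in urgent_words) else 0.0
    # Situation: when the user is busy, habit counts for less than content.
    history_weight = 0.5 if user_busy else 1.0
    return content + history_weight * history


def reorder(inbox, read_first_counts, urgent_words, user_busy):
    """Return the inbox sorted from most to least important."""
    return sorted(
        inbox,
        key=lambda m: importance(m, read_first_counts, urgent_words, user_busy),
        reverse=True,
    )
```

With these assumed weights, a message from a frequently read sender tops the list when the user is idle, while a message flagged as urgent can overtake it when the user is busy.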

Other members of Microsoft Research showed AI-inspired database search-and-display systems that appealed to various degrees of natural adaptation. David Heckerman, manager of the Machine Learning and Applied Statistics Group, showed information trees that can be navigated from a "gnat's" eye view. And Swish, or search with information structured hierarchically, was demonstrated by senior researcher Susan Dumais of the Adaptive Systems and Interaction Group. Results of searches done with Swish are sorted into naturally occurring hierarchical categories for more effective communication to the user.
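The idea of sorting search results into hierarchical categories can be sketched in a few lines of Python. This is not the Swish algorithm, which the article does not describe. The sketch assumes each result already carries a category path, and the `group_by_category` function and the `_results` key are hypothetical names.

```python
def group_by_category(results):
    """Build a nested dict of categories from (title, category_path) pairs.

    Assumes each result is already labeled with a path such as
    ("Automotive", "Cars"); how the labels are assigned is out of scope.
    """
    tree = {}
    for title, path in results:
        node = tree
        for label in path:
            # Walk down the hierarchy, creating categories as needed.
            node = node.setdefault(label, {})
        # Store the titles under a reserved key at the leaf category.
        node.setdefault("_results", []).append(title)
    return tree
```

An ambiguous query such as "jaguar" would then yield one branch for cars and another for animals, so the user can pick the intended sense instead of scanning a flat list.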