Microsoft Seeks an Edge in Analyzing Big Data

SEATTLE — Eric Horvitz joined Microsoft Research 20 years ago with a medical degree, a Ph.D. in computer science and no plans to stay. “I thought I’d be here six months,” he said.

He remained at M.S.R., as Microsoft’s advanced research arm is known, for the fast computers and the chance to work with a growing team of big brains interested in cutting-edge research. His goal was to build predictive software that could get continually smarter.

In a few months, Mr. Horvitz, 54, may get his long-awaited payoff: the advanced computing technologies he has spent decades working on are being incorporated into numerous Microsoft products.

Next year’s version of the Excel spreadsheet program, part of the Office suite of software, will be able to comb very large amounts of data. For example, it could scan 12 million Twitter posts and create charts to show which Oscar nominee was getting the most buzz.

A new version of Outlook, the e-mail program, now being tested, employs Mr. Horvitz’s machine-learning specialty to review users’ e-mail habits. It may be able to suggest whether a user will want to read each incoming message.

Elsewhere, Microsoft’s machine-learning software will crawl internal corporate computer systems much the way the company’s Bing search engine crawls the Internet looking for Web sites and the links among them. The idea is to predict which software applications are most likely to fail when seemingly unrelated programs are tweaked.

Photo: Eric Horvitz, distinguished scientist at Microsoft Research. Credit: Kevin P. Casey for The New York Times

If its new products work as advertised, Microsoft will find itself in a position it has not occupied for the last few years: relevant to where technology is going.

While researchers at M.S.R. helped develop Bing to compete with Google, the unit was widely viewed as a pretty playground where Bill Gates had indulged his flights of fancy. Now, it is beginning to put Microsoft close to the center of a number of new businesses, like algorithm stores and speech recognition services. “We have more data in many ways than Google,” said Qi Lu, who oversees search, online advertising and the MSN portal at Microsoft.

M.S.R. owes its increased prominence as much to the transformation of the computing industry as to its own hard work. The explosion of data from sensors, connected devices and powerful cloud computing centers has created the Big Data industry. Computers are needed to find patterns in the mountains of data produced each day.

“Everything in the world is generating data,” said David Smith, a senior analyst with Gartner, a technology research firm. “Microsoft has so many points of presence, with Windows, Internet Explorer, Skype, Bing and other things, that they could do a lot. Analyzing vast amounts of data could be a big business for them.”

Microsoft is hardly alone among old-line tech companies in injecting Big Data into its products. Later this year, Hewlett-Packard will showcase printers that connect to the Internet and store documents, which can later be searched for new information. I.B.M. has hired more than 400 mathematicians and statisticians to augment its software and consulting. Oracle and SAP, two of the largest suppliers of software to businesses, have their own machine-learning efforts.

In the long term, Microsoft hopes to combine even more machine learning with its cloud computing system, called Azure, to rent out data sets and algorithms so businesses can build their own prediction engines. The hope is that Microsoft may eventually sell services created by software, in addition to the software itself.

Photo: Qi Lu, president of Microsoft’s Online Services Division. Credit: Stuart Isett for The New York Times

“Azure is a real threat to Amazon Web Services, Google and other cloud companies because of its installed base,” said Anthony Goldbloom, the founder of Kaggle, a predictive analytics company. “They have data from places like Bing and Xbox, and in Excel they have the world’s most widely used analysis software.”

Like other giants, Microsoft also has something that start-ups like Kaggle do not: immense amounts of money — $67 billion in cash and short-term investments at the end of the last quarter — and the ability to work for 10 years, or even 20, on a big project.

It has been a long trip for Microsoft researchers. M.S.R. employs 850 Ph.D.’s in 13 labs around the world. They work in more than 55 areas of computing, including algorithm theory, cryptography and computational biology.

Machine learning involves computers deriving meaning and making predictions from things like language, intentions and behavior. When search engines like Google or Bing offer “did you mean?” alternatives to a misspelled query, they are employing machine learning. Mr. Horvitz, now a distinguished scientist at M.S.R., uses machine learning to analyze 25,000 variables and predict hospital patients’ readmission risk. He has also used it to deduce the likelihood of traffic jams on a holiday when rain is expected.

Mr. Horvitz started making prototypes of the Outlook assistant about 15 years ago. He keeps digital records of every e-mail, appointment and phone call so the software can learn when his meetings might run long, or which message he should answer first.

“Major shifts depend on incremental changes,” he said.

At a retreat in March, 100 top Microsoft executives were told to think of new ways that machine learning could be used in their businesses.

“It’s exciting when the sales and marketing divisions start pulling harder than we can deliver,” Mr. Horvitz said. “Magic in the first go-round becomes expectation in the next.”

A version of this article appears in print on October 30, 2012, on Page B2 of the New York edition with the headline: Analyzing Big Data Is Returning an Edge to Microsoft.