Wednesday, March 02, 2005

Personalizing search using your desktop files

Todd Bishop reports on Microsoft's TechFest, including a project on personalized search:

Projects on display during a Microsoft Research event yesterday included a method for personalizing Web search results ... The prototype developed by the Microsoft researchers comes up with those personal preferences automatically by consulting the index generated by MSN Desktop Search.

"Other people have tried to do this by requiring you to specify a profile -- so you say, 'I'm interested in technology or sports,' or whatever the case may be," said Susan Dumais, a Microsoft senior researcher working on the project. "The nice thing about using the desktop search index is that it captures all of that ... and it's updated continuously."

Microsoft senior researcher Eric Horvitz ... called the personalized search technology his top priority for transfer from Microsoft's research division to the product development side of the company.

Clever. The idea is to use the files on your PC to build a profile implicitly and then use that to modify your search results.

It sounds like this is a coarse-grained approach, building something like a general subject and keyword profile that then skews all future searches. Coarse-grained approaches are easier to implement, but they can make inappropriate changes (how do you use my interest in cooking to bias a search for "personalization"?) and they don't do a great job at discovery, surfacing the really interesting little gems I would otherwise have missed.
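To make the coarse-grained idea concrete, here is a rough sketch of what a keyword profile built from local files and applied to search results might look like. This is my own illustration, not Microsoft's method: the function names, the stopword list, the additive scoring, and the `weight` parameter are all assumptions, and the "documents" are plain strings standing in for whatever a desktop index would hold.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from any real system).
STOPWORDS = {"a", "an", "and", "the", "for", "of", "with", "to", "in", "on"}


def tokenize(text):
    """Lowercase the text and return its alphabetic words, minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def build_profile(documents):
    """Build a keyword profile: each term mapped to its relative frequency."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: n / total for term, n in counts.items()}


def rerank(results, profile, weight=1.0):
    """Reorder (snippet, base_score) pairs by base score plus a profile boost.

    The boost is the summed profile weight of the distinct terms in the
    snippet. It ignores the query entirely, which is what makes this
    approach coarse-grained.
    """
    def score(item):
        snippet, base = item
        boost = sum(profile.get(t, 0.0) for t in set(tokenize(snippet)))
        return base + weight * boost

    return sorted(results, key=score, reverse=True)


# Hypothetical desktop files and search results for the query "java".
desktop_docs = [
    "recipe for roast chicken with garlic",
    "bread baking recipe notes",
]
search_results = [
    ("Java programming tutorial", 0.5),
    ("Java coffee roast recipe guide", 0.5),
]

profile = build_profile(desktop_docs)
for snippet, base in rerank(search_results, profile):
    print(snippet)
```

Note that the same profile gets applied no matter what the query is. For an ambiguous query like "java" that might help, but for a query with no connection to the profile it either does nothing or nudges results in an arbitrary direction.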

Susan's dig at "other people" is probably referring to Google's personalized search, which is also coarse-grained but does require you to explicitly specify your interests.

Building a profile implicitly is nice, but it's not clear to me that my desktop files are a good predictor for personalized web search. Are the files on your desktop correlated with which web search results you find most interesting? I'm not sure they are. A better predictor might be previous searches or the web pages you've viewed, not whatever data you have stored in Word and Excel files.

But it's a clever idea. Excellent to see Microsoft Research pushing personalization.

Update: According to a recent SIGIR 2005 paper that appears to be about the same work, it sounds like this personalized search prototype uses a keyword-based approach, not a subject-based approach.