The similarity algorithm also appears to have changed, with noticeably better quality in the spot checks I did.

This is a fairly big deal. Similarities based on aggregate behavior and targeted to the immediate context are a big step toward personalization. The next step is to tie the data to individual history and target similarities and recommendations both to the context and to each person's past history.

Google's latest move should let Google add fine-grained personalization based on current missions and short-term trends, which, in combination with their current search personalization, is likely to improve Google's ability to help people find what they need.

Update: Here is the announcement of the new feature on the Official Google Blog.

4 comments:

Do you by chance know about the algorithm implemented behind this feature? Any published papers or patents? I have been working on a paper which does nearly the same thing for Semantic Web/Networks, which I presented at the International Semantic Web Conference (http://data.semanticweb.org/conference/iswc/2009/paper/poster_demo/130).

Greg - I get that this sort of similarity calculation is pretty neat. But where does personalization come in? If I as a user want to find the most similar site to Engadget, do/would/should I somehow believe that Wired is more similar, to me personally, and that Gizmodo is less similar? Or vice versa?

This is a conceptual question, not a technical question. I'm sure we could mine user logs and come up with all sorts of personalized page recommendations for users. But similarity? Is that the right word/concept in the context of personalization? Is (for example) Gizmodo really less similar to Engadget for me, and more similar to Engadget for you?

Google Operating System has a bit more info on the algorithm, though the algorithm may have improved since then: http://googlesystem.blogspot.com/2007/07/finding-related-web-pages.html

Simply put, this is “sites that linked to this also linked to”, which is good for finding similar content (authorities), though not similar recommenders (hubs). Finding similar recommenders would use “sites that are linked from this are also linked from”, and would have to avoid duplicates.
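
To make the two variants concrete, here is a minimal Python sketch of both counts over a toy link graph. It only illustrates the idea as described in this comment and is not Google's actual implementation; the function names, the `links` dictionary, and the site names in it are all made up for the example.

```python
from collections import Counter

# Hypothetical toy link graph: each source page maps to the pages it links to.
links = {
    "blogA": ["engadget", "gizmodo", "wired"],
    "blogB": ["engadget", "gizmodo"],
    "blogC": ["engadget", "gizmodo", "gizmodo"],  # duplicate link on purpose
    "blogD": ["wired", "nytimes"],
}

def related_by_cocitation(links, target, top_n=5):
    """'Sites that linked to this also linked to': similar content (authorities)."""
    counts = Counter()
    for outlinks in links.values():
        outs = set(outlinks)  # a set, so duplicate links count once per source
        if target in outs:
            for page in outs:
                if page != target:
                    counts[page] += 1
    return counts.most_common(top_n)

def related_by_shared_outlinks(links, target, top_n=5):
    """'Sites linked from this are also linked from': similar recommenders (hubs)."""
    target_outs = set(links.get(target, ()))
    counts = Counter()
    for source, outlinks in links.items():
        if source == target:
            continue
        shared = target_outs & set(outlinks)  # set intersection avoids duplicates
        if shared:
            counts[source] = len(shared)
    return counts.most_common(top_n)

print(related_by_cocitation(links, "engadget"))
print(related_by_shared_outlinks(links, "blogA"))
```

In this toy data, gizmodo is linked alongside engadget by three sources and wired by one, so gizmodo ranks first as similar content. Raw counts like these favor very popular pages, so a real system would presumably normalize them in some way.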