Wednesday, 23 July 2008

Will Google ever really know me?

As an experiment, I used Google Translate yesterday to translate this blog into Swedish. The end result was acceptable, but there were some interesting misinterpretations - notably, the blog's name was changed to 'Smeknamn Burcher', which literally means 'nickname Burcher'.

www.nickburcher.com in Swedish!

This is the ongoing problem for applications and Search Engines. Algorithms and machines can recognise patterns, words and data, but they cannot (yet) properly intuit context. Everything is based on statistical probability and in the main they get things right, but not always.

The main Search Engines ignore words like 'and' and 'the', whereas Semantic Search advocates argue that these are vital to being able to interpret context. Semantic Search engines like Powerset and Hakia are using ‘Natural Language Processing’ technology to reference queries against huge language databases to help establish correlations and context – thus producing more relevant results for the user. This requires significant processing, but advances in processing power are enabling Semantic Search querying to become easier and more important – as evidenced by Microsoft buying Powerset.
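To make the stop-word point concrete, here is a minimal Python sketch. The stop-word list and the keyword_terms function are my own illustrative assumptions, not how any real Search Engine is implemented. It simply shows how dropping small words can make two queries with opposite meanings look identical.

```python
# Illustrative stop-word list (an assumption, not any real engine's list).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "from", "in"}


def keyword_terms(query):
    """Lower-case the query, split on whitespace and drop stop words."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]


# Two queries with opposite meanings reduce to the same keywords.
outbound = keyword_terms("flights to London")
inbound = keyword_terms("flights from London")
print(outbound, inbound, outbound == inbound)
```

Once 'to' and 'from' have been thrown away, a purely keyword-based match has no way of telling the two journeys apart, which is roughly the gap the Semantic Search advocates are pointing at.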

The quest for greater relevance is not just limited to text results though. Google (and others) are working hard to develop Universal Search, where the Search Engine aims to serve up the most relevant result for the user in the format that is most appropriate (e.g. a search for ‘Britney Spears’ serves up a mixture of photos, video and text results on the same page).

Britney Spears Google results showing Universal Search in action

There also seems to be a proliferation of Amazon ‘other people who bought this liked’ models. Digg have introduced a recommendation engine, Zemanta offers a nice tool for bloggers (offering relevant content as posts are typed) and news articles are routinely tagged with related story features.
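The ‘other people who bought this liked’ idea can be sketched very simply by counting which items turn up in the same basket. This is only a toy illustration: the also_bought function and the sample baskets are hypothetical, and real recommendation engines (Amazon's, Digg's or anyone else's) will be far more sophisticated.

```python
from collections import Counter


def also_bought(baskets, item, top_n=3):
    """Return the items that most often share a basket with `item`."""
    counts = Counter()
    for basket in baskets:
        if item in basket:
            # Count every other distinct item in the same basket once.
            counts.update(other for other in set(basket) if other != item)
    # Most frequent first; ties are broken alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical purchase history, invented purely for illustration.
baskets = [
    ["camera", "memory card", "tripod"],
    ["camera", "memory card"],
    ["camera", "case"],
    ["book", "lamp"],
]
print(also_bought(baskets, "camera"))
```

With this sample data, ‘memory card’ comes out on top for ‘camera’ because it shares a basket with it twice, while the other items appear alongside it only once.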

As the volume and variety of content online grows, filtering is becoming increasingly efficient and recommendation is becoming more important. As the internet continues to develop, the mechanics around the two different types of searching, ‘voyages of recovery’ and ‘voyages of discovery’, will become more distinct. Voyages of recovery will continue to be based around Search Engines, with results becoming even faster and more intelligent. At the same time, the starting point options for a ‘voyage of discovery’ are diversifying and are not limited to Search Engines – the volume of users starting their journey through a Search Engine could decrease as social networks, social search and recommendation engines become more prominent (hence the continued rumours around Google buying Digg).

This presents challenges to advertisers, and Search strategies will need to increase in complexity to properly harness the new opportunities that are developing. However, if improvements in underlying machine intelligence mean that ‘nick burcher’ can be automatically (and accurately) translated into any language, then there will be a whole range of other things to think about too!

2 comments:

Hi Nick. I think there will always be room for error as regards search engine relevancy, as they will always be based, at the end of the day, on a set of rules, and it's this that the spammers seek to exploit. Unless they can build a search engine, or a computer even, that is capable of free thought and judgement, they'll never be 100% infallible. Great blog BTW. Subscribed.

Thanks for your nice comments; they open up another area in this debate too. Currently the whole SEO industry is based around getting your website to rank highly against certain queries. This works from the premise that everyone gets the same results against the same query, derived from a standard set of rules.

As iGoogle, personalised search and search linked to your social profile develop, Search Engine results will get more and more specific to you - improving relevancy and context.

In this scenario it seems like it will get harder and harder to apply blanket SEO techniques to improve rankings, as everyone's results will contain variations.

I don't think Search Engines will become free thinking soon, but I think we are gradually heading towards that point. The more information Google finds out about its users, the faster it will get there - and I think this will make life more and more complicated for SEOs!