How Google is changing language – and how the Telegraph lost its soul

The dominance of Google is radically changing written language on the internet – through their search engine and advertising programmes such as AdSense they are homogenising the meanings of words. This provides a strong impetus for newspapers to ignore whatever editorial ethics they had left in their desperate rush towards the money from online advertising.

Google was started by two Stanford students, Sergey Brin and Larry Page, who shared a common interest in retrieving relevant information from large data sets – their first co-authored paper was called The Anatomy of a Large-Scale Hypertextual Web Search Engine (PDF, 124Kb). Google now considers its mission “to organise the world’s information and make it universally accessible and useful.” Note the change from a passive relationship (“searching”) to an active relationship (“organising”) with the content of websites. They are also very much “in the advertising business”. (Both quotes are from The Search Party, an interesting article about Google by Ken Auletta in the New Yorker.)

Google AdSense works by taking the text content of the page, analysing the discrete blocks of text that we see as “words” and “sentences”, and using them as a way of serving up “relevant” advertisements. (This blog has them at the bottom right of each page.) The programming techniques that make this happen (“algorithms”), and the secrecy surrounding them, are what have made Google an enormously successful company – their ability to take the text that someone enters and produce adverts and links to sites that they feel like clicking.

But what Google does has nothing to do with language, except in a very dissipated form. Rather than using language as you or I would understand it, Google’s algorithms are simply a form of pattern matching. Search for “car”, and Google will give you results for “car” and “cars”, but not “automobiles”. The inability of Google to understand the meaning of words can produce some strange results. (There are exceptions – searches for medical terms such as “alzheimers” for example, that offer other searches for “symptoms” and “treatment” – but these have to be flagged manually by Google).
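To make the point about pattern matching concrete, here is a minimal sketch in Python. It is not Google's algorithm, which is secret; the function names and the crude "strip a trailing s" rule are assumptions made purely for illustration. It shows how a matcher can treat "car" and "cars" as the same pattern while having no idea that "automobiles" means the same thing.

```python
def normalise(word):
    """Lower-case a word, trim punctuation, and strip a trailing 's'.

    This is a deliberately crude plural rule, assumed for illustration only.
    """
    word = word.lower().strip(".,;:!?\"'()")
    if len(word) > 3 and word.endswith("s"):
        word = word[:-1]
    return word


def matches(query, text):
    """Return True if any word in the text has the same normalised form as the query."""
    target = normalise(query)
    return any(normalise(w) == target for w in text.split())


if __name__ == "__main__":
    print(matches("car", "Used cars for sale"))            # matches on the pattern
    print(matches("car", "Vintage automobiles for sale"))  # same meaning, no match
```

The matcher only compares strings of characters. Nothing in it knows what a car is, which is the sense in which this kind of search has little to do with language.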

Another part of the puzzle is that Google works to a large extent on how people link to other sites. If a website has a large number of links pointing to it with the words “car auctions”, it’s much more likely to turn up on a Google search for “car auctions”. (Although if the site being linked to was actually about My Little Pony and called www.mylittlepony.com, that effect wouldn’t be very strong for obvious reasons – the links, to Google, are a measure of a site’s popularity; but unless the links and the site’s content match up, the effect is reduced a great deal.)

In this crude way, Google starts to understand what most people mean by “car auctions”, and places sites in its results accordingly.
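The link effect can be sketched in a few lines of Python. Everything here is a hypothetical simplification: the real ranking method is not public, a.example is a made-up site, and the size of the penalty for a mismatch between link text and page content (MISMATCH_DISCOUNT) is an invented number. The sketch counts the links whose text contains the search phrase, then heavily discounts sites whose own content never mentions it.

```python
from collections import Counter

# Hypothetical penalty applied when a site's content does not match its incoming link text.
MISMATCH_DISCOUNT = 0.1


def score_sites(query, links, page_text):
    """Score each linked-to site for a query.

    links     -- list of (link_text, target_site) pairs
    page_text -- dict mapping each site to the text found on it
    """
    query = query.lower()
    # Count incoming links whose visible text contains the query phrase.
    counts = Counter(target for text, target in links if query in text.lower())
    scores = {}
    for site, n in counts.items():
        on_topic = query in page_text.get(site, "").lower()
        scores[site] = n if on_topic else n * MISMATCH_DISCOUNT
    return scores


if __name__ == "__main__":
    links = [("car auctions", "a.example")] * 3 + [("car auctions", "www.mylittlepony.com")] * 5
    page_text = {
        "a.example": "Weekly car auctions across the country",
        "www.mylittlepony.com": "Ponies, ponies and more ponies",
    }
    print(score_sites("car auctions", links, page_text))
```

In this toy data the pony site has more incoming links, but it still scores lower than the site that is actually about car auctions.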

A good example of this effect was what happened to our Google AdSense adverts when Lelyn posted his blog Our Man In Cairo – Bright & Hot about his first impressions of Cairo. It seemed likely that Google AdSense would pick up on the references to Cairo and offer holidays to the Red Sea and Sharm el Sheikh. But looking down the list of ads when the page first went live, most of them were adverts like “Meet Sexy Arab Women – Thousands Sexy Women Online Join Free!”

The main words that AdSense is picking up on in Lelyn’s blog are obviously “Arabic” and “Hot”. Put those two words together, add what we know about how Google works and how it ascribes meaning to language, and it becomes obvious how adverts like that appear, and how the content of websites determines the way that Google sees the web.
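A rough sketch of that kind of keyword matching, in Python, might look like the following. It is an assumption about the general technique, not a description of how AdSense is built: the ad titles, their keyword lists, and the sample page text are all invented for the example. Each advert is scored by how many of its keywords appear anywhere on the page, with no regard to what the sentences mean.

```python
def pick_ads(page_text, ads):
    """Rank adverts by how many of their keywords appear on the page.

    ads -- dict mapping an advert title to a set of lower-case keywords
    """
    words = {w.strip(".,:;!?&").lower() for w in page_text.split()}
    words.discard("")  # left behind by tokens that were pure punctuation
    scored = [(len(words & keywords), title) for title, keywords in ads.items()]
    # Highest keyword overlap first; adverts with no overlap are dropped.
    return [title for hits, title in sorted(scored, reverse=True) if hits > 0]


if __name__ == "__main__":
    # Hypothetical page text and advert inventory.
    page = "Bright & Hot: first impressions of Cairo, and some Arabic lessons"
    ads = {
        "Red Sea holidays": {"cairo", "holiday", "beach"},
        "Dating site": {"arabic", "hot"},
        "Car auctions": {"car", "auction"},
    }
    print(pick_ads(page, ads))
```

With this made-up inventory the dating advert outranks the holiday advert, simply because two of its keywords appear on the page and only one of the holiday advert's does.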

This isn’t just a one-way process, however – the effect also feeds back into the way that content is written for the web.

The death of printed newspapers, reported with much hand-wringing by the press, is seemingly not far away. Whether that turns out to be true or not, the sense of panic in Fleet Street is palpable as newspapers fall over each other in the rush for the internet. However, one of the problems with the internet is that it doesn’t pay well. The most common source of revenue on the internet is online advertising – but one million visitors to a site don’t generate nearly as much revenue as one million readers who both pay for the paper and have to sift through the advertisements in it. As documented with unflinching and refreshing clarity in Flat Earth News by Nick Davies (previously featured on this blog in John Lanchester, Riots, terrorism etc.), the decline in print readership has led to precipitous declines in newspapers’ incomes, which were hardly secure to begin with.

This, combined with website visitors’ increasing ability to shut out adverts when they read sites, means that even the most popular newspaper websites struggle to produce a significant amount of income. Furthermore, every visitor to the site, whether they click on an advertising link or not, costs the newspaper company money for the bandwidth that they use by requesting pages and their graphics.

For a long time the Guardian website, Guardian Unlimited (now simply guardian.co.uk), seemed to be leading the field. Whether it actually made any money or not is a moot point. But recently, with enormous fanfare, the Telegraph website telegraph.co.uk rocketed to the top of the list of newspaper sites with the most “unique monthly visitors”. The editor of telegraph.co.uk appeared throughout the media, positively gloating over the online triumph of what is seen as one of Britain’s “quality” newspapers. Rather timid questions were asked about lies, damned lies and statistics, but were robustly brushed off by the Telegraph, and indeed it seemed that the site had actually had a vast increase in visits. Of course the Telegraph’s line in all this was the triumph of quality content in a medium full of trivia – they attributed it to “a string of major news stories – particularly around the credit crunch – and in depth coverage of the Budget, for which it built a micro-site and commissioned exclusive videos on Telegraph TV”.

What ties these two threads together – the way that Google works and telegraph.co.uk’s huge increase in visitors – is provided by this story in the 11th-24th July 2008 edition of Private Eye, which is short enough to quote in full but very enlightening nonetheless:

What is the secret of the Telegraph’s online success, which has propelled it to the top of the pops in Fleet Street in terms of the number of “hits”?

First, news hacks are now sent a memo three or four times a day from the website boffins listing the top subjects being searched in the last few hours on Google. They are then expected to write stories accordingly and / or get as many of those key words into the first part of their story. Hence, if the top stories being Googled are “Britney Spears” and “breast cancer” – hey presto, the hack is duly obliged to file piece about young women “such as Britney Spears” being at risk from breast cancer.

The second new development is to run as many downmarket and sensationalist stories as possible – to the horror of old Telegraph hands (or at least the few who are left) and readers. Since the young guns manning the website neither know nor care what the Telegraph stands for, they bung in whatever grabs their Heat-reading fancy.

Thus a story appeared the other day about the woman with the world’s largest breasts – plus picture – and a man with a rare disease who was “turning into a tree”, again with pics. After complaints from female members of staff, the megaboobs item was eventually taken down – but not before it had earned plenty more “hits” from salivating web-surfers whose tastes are clearly rather different from those of Sir Herbert Gusset.

Each person that visits a site is called a “unique visitor”. But the value of these “visits” differs widely. One visit is counted every time a person visits a website, whether they:

Click a link for telegraph.co.uk from Google, decide the page isn’t relevant without having to read much of it and leave straight away or

Are a regular visitor, come to the site using a bookmark in their browser, and spend an hour reading numerous news stories from beginning to end or

Anywhere in between those two.
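The gap between the headline figure and what actually happened on the site can be illustrated with a small Python sketch. The log format, the visitor names, and the ten-second cut-off for a "bounce" are all assumptions made for the example; real analytics packages define these things in their own ways.

```python
def summarise(visits, bounce_seconds=10):
    """Summarise a visit log.

    visits         -- list of (visitor_id, seconds_on_site) pairs
    bounce_seconds -- hypothetical cut-off below which a visit counts as a bounce
    """
    unique_visitors = {visitor for visitor, _ in visits}
    bounces = sum(1 for _, seconds in visits if seconds <= bounce_seconds)
    total_seconds = sum(seconds for _, seconds in visits)
    return {
        "unique_visitors": len(unique_visitors),
        "bounces": bounces,
        "total_seconds": total_seconds,
    }


if __name__ == "__main__":
    log = [
        ("searcher", 5),     # arrived from a search result and left straight away
        ("regular", 3600),   # read for an hour
        ("regular", 1200),   # came back later the same month
    ]
    print(summarise(log))
```

Both people count once towards "unique visitors", even though almost all of the reading time in this toy log belongs to one of them.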

(The LoveHowlMuse blog gets a few visitors of the first kind too – mainly people searching for “show me your cunt”, who end up on the blog Selfish Cunt opens for Motorhead – Show me your fucking money.) It seems likely that the vast majority of the new visitors to telegraph.co.uk are of the first kind, given the kind of Darwinian survival-at-all-costs tactics that the site has started to use. The Telegraph probably doesn’t tell that to the people who are the targets of the publicity drive, though: not the readers so much as the people it really gets its money from – companies who place online advertisements.