Web searching can be a lot like hiking. Sometimes it’s a pristine path along the coast with stunning views over the channel, other times it’s a bracken-laden jungle where you can’t even see your feet. I’ve been on both kinds of paths, and I much prefer the former.

The Washington Post ran an article on Jan 30th by Michael Rosenwald highlighting the increasing amount of spam on Google. Spammers have figured out a way to cheat the Google system and are now delivering that spam straight to you and me. It’s clogging up the works and many are starting to worry that Google isn’t taking the issue seriously enough.

How the spammers do it

1. Content farms – businesses created specifically to generate cheap and filling answers to popular search strings – are increasingly padding out the results.

According to Rosenwald, websites like eHow hire freelancers to write how-to articles on a wide variety of topics. The sites work hard to optimize the page content so that they get pushed to the top of search results, and because we trust Google’s algorithms to give us good hits, we click the link. And when we click the link, we’ve reinforced to Google that that link is what we’re looking for – Google ranks web pages in part based on how many click-throughs they get. Once we actually see the page, we realize it’s crap, but by now we’re there and we’ve committed. We’ve clicked. Crap.

2. Nickel-a-clickers – people who are paid to click links to bring a web page higher in the rankings.

Through employment matchup services like Amazon’s Mechanical Turk, spammers hire cheap labor to click on links in order to make their site seem more relevant. Every time the clickers click they get five cents, or whatever the agreed-on rate is. In order to earn a decent return, they have to click a lot of hotlinks. I can’t imagine the boredom that entails, but I suppose it’s pretty easy work.

I’d never heard of Mechanical Turk, so I thought I’d have a look-see. One of the jobs on offer: “CopyEditing and Logically Filling up of Blanks for Recipe Database.” Job description: “Check for grammer errors.” Oh good. Final comment on the job from the employer: “It doesn’t have to be factually correct. As long as the details seems plausible and logical it is fine.” Well, there’s another reason for sticking with reliable ol’ Epicurious.

Big deal, there’s more spam in Google results. Whatever.

You might say that now, but if spammers are enlisting armies of cheap labor to scam the system, Google’s in big trouble. Because if we all get fed up, there are several other big engines ready and waiting for the influx of search émigrés. And Google will be another name, like Netscape or Northern Light, that makes you think “Oh yeah! I used to use that all the time!”

And it means trouble for us, too, because some of the pages you click through to are going to have more and more distasteful things on them. And like bedbugs, those things could start creeping onto your hard drive and lurking there.

What lurks beneath the surface?

Lurking things? Eww. Are there options?

Alternatives to Google include Bing, Blekko, and Exalead, just to name a few. Here at HBG, we check more than one search engine for every search we do and you might want to consider making it a habit too, if you aren’t already.

Here’s another reason why you might want to use more than one engine: studies in recent years by researchers at Penn State, the University of Kashmir, metasearch engine Dogpile and others have shown repeatedly that information overlap between search engines is very low, sometimes as little as 1% on the first page of returned results. You get a wider variety of results if you cast your net wider.
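If you’re curious how an overlap figure like that might be worked out, here’s a rough sketch in Python. It is not the method any of those studies used; it’s just one simple way to measure it (the share of distinct links that show up in both lists). The function names and the example URLs are made up for illustration.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path so trivial variations compare equal.

    Assumption for this sketch: scheme, a leading "www." and a trailing
    slash don't matter when deciding whether two hits are the same page.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def overlap(results_a, results_b):
    """Share of distinct links that appear in both result lists."""
    a = {normalize(u) for u in results_a}
    b = {normalize(u) for u in results_b}
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Invented first-page results from two hypothetical engines
engine_one = [
    "http://www.example.com/hiking",
    "http://example.org/trails",
    "http://example.net/boots",
]
engine_two = [
    "http://example.com/hiking/",
    "http://example.org/maps",
    "http://example.net/tents",
]

print(f"Overlap: {overlap(engine_one, engine_two):.0%}")
```

With these invented lists, only one of the five distinct links appears in both, so the two engines mostly disagree about what belongs on page one.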

[An interesting sidenote here: search guru Danny Sullivan posted an article on the Search Engine Land website yesterday titled “Google: Bing Is Cheating, Copying Our Search Results.” Apparently Google set up a sting operation recently to prove that Bing has been lurking over their shoulders and copying search results. Sullivan’s article has pictorial evidence and everything. Will Bing just clone Google’s results, spam and all? I sure hope not.]

And finally, consider this: when you search Google or any search engine, you’re looking at a static database of web pages that were scanned days, weeks or perhaps even months ago. That’s why sometimes when you click a link you won’t find the word you were looking for: the page was updated in the interim between when it was saved in the search engine’s database and when you clicked the link. One search engine may have cataloged a site yesterday and another last month. Or last year. For freshness, reliability and completeness, it just pays to use more than one search engine.
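To make the stale-snapshot idea concrete, here’s a toy model in Python. It is only a sketch: real search engines don’t expose their stored copies this way, and the engine labels, dates and page text are all invented.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Snapshot:
    engine: str       # hypothetical engine label
    crawled_on: date  # when this engine last scanned the page
    text: str         # page text as it was stored at crawl time


def stale_hits(term, snapshots, live_text):
    """Engines whose stored copy matches the term although the live page no longer does."""
    term = term.lower()
    if term in live_text.lower():
        return []  # the word is still on the page, so no hit is stale
    return [s.engine for s in snapshots if term in s.text.lower()]


def freshest(snapshots):
    """The snapshot with the most recent crawl date."""
    return max(snapshots, key=lambda s: s.crawled_on)


# Invented data: the page changed between the two crawls
snapshots = [
    Snapshot("engine_a", date(2011, 1, 31), "Coastal path closed for repairs"),
    Snapshot("engine_b", date(2010, 12, 1), "Coastal path open, bracken cleared"),
]
live_text = "Coastal path closed for repairs"

print(stale_hits("bracken", snapshots, live_text))
print(freshest(snapshots).engine)
```

Here the engine with the older copy would still return the page for “bracken” even though the word is gone, while the engine that scanned more recently would not.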

Now don’t get me wrong, I love Google. But this spam thing is starting to bug me. Are you concerned? Or is this much ado about nothing?