Google Cracks Down on Spammers and Scrapers

(Update 4:55 p.m. EST with Google statement)

Spammers and scrapers of the world beware.

In a move that internet content creators have been dreaming about for years, web search giant Google is cracking down on spammy, derivative content that has largely been copied from other sources on the web yet somehow manages to rank higher in results than the original.

Anyone who’s ever written a word on the internet and seen it ripped off and posted elsewhere will appreciate this move.

On the other hand, companies that traffic in low-quality content, hoping that littering the internet with search-driven mediocrity will generate enough advertising revenue to keep them a going concern, should be concerned.

“This was a pretty targeted launch: slightly over 2% of queries change in some way, but less than half a percent of search results change enough that someone might really notice,” Google search quality honcho Matt Cutts wrote on his personal blog, in a follow-up to changes that had already been announced.

“The net effect is that searchers are more likely to see the sites that wrote the original content rather than a site that scraped or copied the original site’s content,” Cutts wrote.

Scrapers have long said they are protected by the “safe harbor” provision of the Digital Millennium Copyright Act and by copyright law’s “fair use” principle, which allows excerpting of others’ content for some uses.

That’s most likely true, but that doesn’t mean the rest of us have to wade through all the derivative junk that is basically copied off others’ websites.

Google did not name specific companies that will be affected, but in the earlier post Cutts made reference to “‘content farms,’ which are sites with shallow or low-quality content.”

It’s up to the rest of us to try to divine which “content farms” Cutts was referring to.

In a follow-up email, a Google rep offered more detail, though the company declined to name specific websites that will be affected, as per company policy.

“At Google we are constantly tuning our algorithms to improve the relevance of our results,” the Google statement read. “Sometimes we’re simply doing a better job catching pure webspam with paid links, hidden text, scraped content and other violations of our webmaster guidelines. Other times we are tuning the algorithm to do a better job surfacing relevant, authoritative, interesting, on-topic, original content.”

“We made more than 490 improvements to search last year alone, so this is an ongoing process and a complicated science,” the spokesperson added.