Follow-Up on Content Farms and Search Engines

During yesterday’s Farsight 2011 event, representatives from Google, Bing and Blekko took up the topic of content farming. The discussion centered on whether content generated in this manner should be considered spam, and it produced some interesting insights into how the engines treat and rank this type of content.

For those of you who aren’t familiar with content farming, I discussed the practice in a previous post. It’s essentially a system where programs look at popular search data and generate article topics based on search volume and expected advertising revenue. These articles are then claimed by writers, who are often compensated at low rates relative to standard SEO article writers.
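To illustrate the idea, here is a minimal Python sketch of how such a system might rank topics. It is only a guess at the general approach, not any particular company's method: the `QueryStats` fields, the `rank_topics` function and the sample numbers are all made up for the example, and the score is simply search volume multiplied by an assumed ad revenue per visit.

```python
from dataclasses import dataclass


@dataclass
class QueryStats:
    """Hypothetical record for one search query (all fields are assumptions)."""
    query: str
    monthly_searches: int        # assumed search volume
    ad_revenue_per_visit: float  # assumed ad earnings per visit, in dollars


def rank_topics(stats, top_n=3):
    """Return the top_n queries ordered by volume * revenue per visit."""
    scored = sorted(
        stats,
        key=lambda s: s.monthly_searches * s.ad_revenue_per_visit,
        reverse=True,
    )
    return [s.query for s in scored[:top_n]]


# Invented sample data, purely for illustration.
sample = [
    QueryStats("how to tie a tie", 50000, 0.02),
    QueryStats("refinance mortgage", 8000, 0.50),
    QueryStats("boil an egg", 30000, 0.01),
]

top_topics = rank_topics(sample, top_n=2)
```

Note how a lower-volume query can still come out on top when its ads pay more, which is why topic selection follows the money as much as the traffic.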

Since the goal is to create as much content as possible in a short amount of time, writers will often rewrite existing content and list the originals as “sources” when in fact the entire article is plagiarized, with just enough variation to make the new content count as unique. Because of the volume of content and the popularity of these web properties (ehow.com, livestrong.com, cracked.com), their content often ranks well in searches and in some cases may outrank the “source” article.

As a result, users are now pushing back, and yesterday Google, Bing and Blekko addressed how they handle content farms:

– Google: They apparently rolled out two algorithm changes in 2010 targeting low-quality content, and they are reportedly testing additional changes that could further alter how these kinds of sites are evaluated. The changes would involve document-level analysis; I discussed the possible impact of such changes on article marketing sites in my last post. Their “crackdown” raises some interesting questions:

1) Is Google prepared to lose AdSense revenue by devaluing AdSense properties?
2) These sites don’t pay well but they do provide income for people who could use extra bucks (moms) … are they considering the secondary economic impacts?
3) What about no-pay sites that accept and host low-quality content?
4) What sort of signals are they looking for when they conduct a document analysis? Similarity to other content on the same domain and across the web? Readability? Keyword insertion? Author name?
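To make question 4 a bit more concrete, here is a rough Python sketch of two signals a document-level analysis could plausibly compute: overlap between two texts (using word shingles and Jaccard similarity) and keyword density. This is not how Google does it, since nothing in the discussion revealed their actual signals; the function names and the shingle size of three words are assumptions chosen for illustration.

```python
import re


def _words(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) found in the text."""
    words = _words(text)
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard_similarity(a, b, k=3):
    """Shared shingles divided by all distinct shingles in both texts."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def keyword_density(text, keyword):
    """Fraction of words in the text equal to a single-word keyword."""
    words = _words(text)
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)
```

A lightly reworded copy of a “source” article would share most of its shingles with the original and score close to 1.0, while unrelated articles score near 0.0. An unusually high keyword density could hint at keyword insertion, though any real threshold would need tuning against actual data.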

– Bing: Their representative argued that systems that treat content as a commodity (AdSense) are the reason spam exists. I could rebut that assertion in a childlike manner, but as long as there is an opportunity to monetize content, whether it’s through ads, links, affiliate programs or direct sales, you will have spam.

– Blekko: This was the most interesting response. They said that they have banned certain “content farm” sites from their index altogether as a result of direct user feedback. While Blekko is still a niche search engine, it has a human review element that allows for a certain level of moderation, and after users flagged sites, they removed some of the more popular content farms. One could argue that while this is beneficial from a user experience perspective, the practice lends itself to abuse should someone organize a group that specifically targets competitors or political speech it does not agree with.