Denis Pinsky leads the growth marketing team at Forbes, focused on digital strategy, product development, strategic partnerships, and other initiatives that improve and grow the Forbes platform. Prior to Forbes, Denis was responsible for driving strategic direction and ownership of search engine optimization, social media strategy, and web analytics at Enova and AT&T, while also serving as an advisor to entrepreneurs and startups at Webfia. He has a Bachelor of Science degree in Marketing and International Business from Washington University in St. Louis. While at Washington University, he developed an entrepreneurial approach to problem solving, working for startups and Fortune 500 companies in London, Hong Kong, and New York.

Panda 4.0: Google's Still Gunning For Low Quality Web Pages

Since May 20th, many websites that Google deems ‘poor’ in quality have found themselves ranking lower in search results, potentially losing traffic and sales. That was the date the search engine began rolling out Panda 4.0, the fourth major Panda update, part of its drive to improve the quality and relevance of the web pages it displays in its results.

What has been the impact of this latest change? How does Google judge the quality of web content, and how does it identify and punish sites that don’t meet its standards? And what happens when it gets things wrong?

The latest change to Google’s ‘secret sauce’

Panda 4.0 is the latest change to Google’s algorithm, the ‘secret sauce’ that determines which pages will feature in search results and how highly they rank. And these updates, along with many smaller iterations, are a key part of the search engine’s armory as it seeks to continually improve its ability to provide searchers with the most relevant results, pushing poor quality, spammy content lower down results pages.

The first Panda update was introduced on Google.com in February 2011, and successive Pandas, along with other algorithm updates, have hit the search rankings of many sites that fail Google’s quality test with devastating effect. The chart below shows the declining search visibility of one such site, clearly illustrating the evolution of Google’s algorithm – the sharp vertical dips mark the major Panda 2.0 update that led to a steep ranking fall, along with smaller Panda iterations.

Aggregator sites rank lower after Panda 4.0

When we at Searchmetrics analyzed the impact of Panda 4.0, by picking out sites that had lost rankings against a database of millions of popular keywords, it became clear that aggregator websites – those which aggregate information from other online sources rather than posting their own original content – have been one of the key targets this time. This includes press portals, news sites (especially in the celebrity and gossip sector, which republish stories from news agencies), price comparison sites, as well as some forums and weather portals.
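
To make that kind of analysis concrete, here is a minimal Python sketch of the general approach: compare each domain’s rankings for a large keyword set before and after the rollout and flag the biggest losers. The visibility formula, the data structures and the 30% threshold are illustrative assumptions, not Searchmetrics’ actual methodology.

```python
# Illustrative sketch only: flag domains whose search visibility dropped
# sharply between two ranking snapshots (e.g. before/after an update).
# The scoring and threshold are assumptions, not Searchmetrics' method.

def visibility(rankings):
    """Crude visibility score: sum of 1/position over all ranked keywords."""
    return sum(1.0 / pos for pos in rankings.values() if pos)

def update_losers(before, after, threshold=0.3):
    """Return domains whose visibility fell by more than `threshold`.

    `before` and `after` map domain -> {keyword: position} snapshots
    taken before and after the update (hypothetical data structures).
    """
    losers = {}
    for domain, old_ranks in before.items():
        old = visibility(old_ranks)
        new = visibility(after.get(domain, {}))
        if old > 0 and (old - new) / old > threshold:
            losers[domain] = round((old - new) / old, 2)
    return losers

# Toy data: an aggregator-style domain sliding down the rankings
before = {"example-aggregator.com": {"celebrity news": 3, "weather today": 5}}
after = {"example-aggregator.com": {"celebrity news": 18, "weather today": 40}}
print(update_losers(before, after))  # {'example-aggregator.com': 0.85}
```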

It makes sense that Google should target sites that do not display original content – after all, it is trying to present results that best answer searchers’ queries, not content that is already available in numerous other locations.

What are the characteristics of a high-quality web page?

But what is Google’s definition of a high quality page and how does the search engine differentiate high from low quality pages, given the millions of pages it has to sift through?

A good answer to the first question can be found in an article that Google itself posted on its Webmaster Central blog in 2011 under the headline “What counts as a high-quality site?“. Some key extracts are:

Would you trust the information presented in this article?

Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?

Does this article provide a complete or comprehensive description of the topic?

Does the page provide substantial value when compared to other pages in search results?

Is this the sort of page you’d want to bookmark, share with a friend, or recommend?

So some of the main characteristics that Google feels are important to the quality of a web page are: trust, value, content written for searchers (rather than second-guessing what might rank well), comprehensive coverage of a topic, and originality. These are elements most searchers would probably use to define a quality page.

How does Google differentiate high from low quality?

To answer the second question, you have to understand that there are hundreds of factors that Google’s software analyzes when deciding how a page should rank for a specific search query. These range from whether the words on the page match the ‘keywords’ in the search query, to the presence of images, whether and how many other websites have linked to the page, site speed, and the existence of spelling mistakes.
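
As a purely illustrative sketch of what combining many factors into one ranking decision can look like, here is a toy weighted score in Python. The factor names, weights and normalization are invented for this example; Google’s real factors and weights are not public.

```python
# Toy example of a weighted ranking score over a handful of made-up factors.
# Google's actual factors and weights are secret; these values are invented.

FACTOR_WEIGHTS = {
    "keyword_match": 0.35,     # how well page text matches the query
    "inbound_links": 0.30,     # how many other sites link to the page
    "page_speed": 0.15,        # how fast the page loads
    "has_images": 0.10,        # presence of relevant images
    "spelling_quality": 0.10,  # absence of spelling mistakes
}

def rank_score(page_factors):
    """`page_factors` maps factor name -> normalized value in [0, 1]."""
    return sum(weight * page_factors.get(name, 0.0)
               for name, weight in FACTOR_WEIGHTS.items())

page = {"keyword_match": 0.9, "inbound_links": 0.4, "page_speed": 0.8,
        "has_images": 1.0, "spelling_quality": 0.95}
print(round(rank_score(page), 2))  # 0.75
```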

But importantly, with regard to rating the quality of a page, Google also analyzes ‘user signals’. In other words, it compares how searchers have interacted with a page when it has appeared in search results to assess how well it satisfied their needs. These user signals include Click-Through Rate (the proportion of searchers who clicked on a page when it appeared in search results), SERP Return Rate (the proportion of searchers who went back to the search engine results page after clicking a link, which suggests they did not find what they were searching for) and Time on Site (if searchers stay longer, it indicates the page is what they were looking for).
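
As a rough illustration of how such signals could be derived from search logs, here is a short Python sketch. The log fields (clicked, returned_to_serp, time_on_site) are hypothetical stand-ins for data that only the search engine actually holds.

```python
# Hypothetical sketch: derive CTR, SERP return rate and average time on site
# from a list of impression records. Field names are invented for illustration.

def user_signals(impressions):
    """Each impression is a dict like:
    {"clicked": True, "returned_to_serp": False, "time_on_site": 180}  # seconds
    """
    clicks = [i for i in impressions if i["clicked"]]
    ctr = len(clicks) / len(impressions) if impressions else 0.0
    returns = [c for c in clicks if c["returned_to_serp"]]
    serp_return_rate = len(returns) / len(clicks) if clicks else 0.0
    avg_time = (sum(c["time_on_site"] for c in clicks) / len(clicks)) if clicks else 0.0
    return {"ctr": ctr, "serp_return_rate": serp_return_rate,
            "avg_time_on_site": avg_time}

print(user_signals([
    {"clicked": True, "returned_to_serp": False, "time_on_site": 180},
    {"clicked": True, "returned_to_serp": True, "time_on_site": 12},
    {"clicked": False, "returned_to_serp": False, "time_on_site": 0},
]))
# ctr ≈ 0.67, serp_return_rate = 0.5, avg_time_on_site = 96.0
```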

So Google’s algorithm automatically positions pages within results based on hundreds of factors, including user signals. Updates such as Panda are adjustments to the algorithm that can lead to major shifts in rankings for some sites. And Google is continually testing and improving its algorithm, a process so deeply rooted in its business that there is even a special name for it: the Google Everflux.

When the algorithm gets it wrong

The algorithm is software, so it is an objective assessor of quality. But there are instances in which Google believes the algorithm gets it wrong: sites that have infringed its guidelines still rank higher than they should.

For these cases the search giant employs human quality raters. Google’s Search Quality team, led by Matt Cutts, rates websites ‘by hand’ using the so-called Quality Rater Guidelines (which – like the factors used in the algorithm – are top secret).

This team has the power to unleash a Google Penalty, a targeted measure that can lower or wipe out the rankings of certain pages of sites that violate Google’s guidelines (even though they may otherwise rank well and have not been pushed lower by the algorithm).

The advantage of a Google penalty

For an online business, the advantage of a Google penalty lies in the fact that the search engine generally warns you that you are being punished and which guidelines you are breaching. This gives you the opportunity to take countermeasures and submit a Reconsideration Request, which – if successful – results in the removal of the penalty and a return to the algorithmically computed search result positions (which does not always mean the same position you had before).

For example, the founders of rapgenius.com, a service that lets users look up song lyrics and discuss their meaning, admitted to being hit by a Google penalty, in the course of which the domain lost rankings for nearly all of its important keywords. “We messed up”, they admitted, and started working with Google. Today, the site performs better in search results than ever before (see chart).

Can humans objectively judge quality?

In contrast to rapgenius.com, however, there are domains that are never released from their manual Google penalty and are forced to survive with lower (or even no) search traffic. And there are examples of penalized domains no longer showing up even for searches on their own brand name. Something like this happened to the domain of the German automobile manufacturer BMW a while ago, and it is hard to argue that this could be beneficial to searchers.

So while many of the search giant’s detractors can accept the objective justice of algorithm updates such as Panda 4.0, they sometimes point the finger accusingly when it comes to manual Google penalties. Algorithms, they say, are objective by nature, but humans are not and cannot be relied on to judge the quality and relevance of pages objectively. And how can Google justify overruling its own automated, highly sensitive algorithm in this way?

Ultimately, of course, Google decides who is allowed to play the game. And while Panda 4.0 and other updates are part of the evolution and fine-tuning of the algorithm, it is unlikely ever to be perfect. Human quality raters will continue to exist. And whether it is the algorithm or the Search Quality team that is wielding the power, businesses will need to try to play by the rules.


Comments

One flaw in this whole thing is your reference to ‘objective’ and how an algo can be objective but not humans. The whole basis of this search rank argument is ‘quality’, and that is a totally subjective human decision, which of course is the whole undermining problem with Google. You can’t solve the world with an algorithm; a human can judge a site to be better quality in seconds based on presentation, content, etc., while these algos rely on links and so on, which are all easily manipulated. These updates are just a constant game of Google trying to stay ahead of the SEO monster it created, quite often with unintended collateral damage to smaller sites that can’t afford expensive SEO, kind of like the rich getting richer by hiring expensive tax lawyers to figure out how best to beat the system.

Hi Andrew, the definition of “objective” is complicated. Because people, not machines, develop the algorithms, everything could be claimed to be subjective rather than objective. To your arguments there are yeses and nos. I don’t believe that humans are able to judge a site based on presentation, content, etc. A human is easily distracted and is not capable of quickly verifying that the information he got is right and not just nicely presented.

Google and other search engines solve this problem by counting mentions of and links to pieces of information as positive references, much like in science, where papers are more valuable when other scientists refer to them. That is the basis of PageRank and still a valuable factor for a search engine, because it works the same everywhere in the world. If you combine this signal with user signals and user behavioral data, you get a signal that is not so easy to manipulate. This is Panda, and this is the goal of a company like Google: to use its “objective” signals and mix them with a large amount of “subjective” signals to get the best SERP for all users.

I believe that everybody can afford SEO when SEO is not the only channel to get traffic and engage with users. If you have great content with added value and use, for example, social media channels and influencers to spread the word, then you earn links and the attention of search engines. If you just create content and are frustrated that you don’t get traffic, then you should change your digital strategy. Best, Marcus

This is one of the most descriptive guides I have read on the Panda update. I have read articles on Search Engine Journal, SEOmoz, and other top sites, but Denis and Marcus explain the nitty-gritty of this update far better than anyone else. Thank you guys.

It’s sad to see some major sites like RetailMeNot and MetaFilter taking a hit. Google still needs to refine its algorithm, and hence it refreshes the algorithm every now and then.

It can also backfire on newer sites that have comparatively less authority than big authoritative sites like Wikipedia, CNN, or the NHS.

Interestingly, Google Authorship seems set to become one of the prominent factors in determining rank in the near future. It will be interesting to see how things progress.