In the last blog post,
we discussed the Google Penguin updates. Today, let’s try to
understand the Google Panda update. The first release of Google Panda was in
February 2011. While the Penguin updates went after backlinks, the main purpose of
the Panda updates is to watch the quality of website content. I believe that by far
the biggest effect on search results was made by the Panda updates. Panda updates
fight scraper sites and content farms. A scraper essentially copies content
from various websites using web scraping. Web scraping can range from
simple manual copy and paste to the use of
sophisticated web scraping software. Content farms, on the other hand, are
websites with a large volume of low-quality content, often copied from other
websites or repeated across multiple pages of the same website. Content farms
usually put effort into understanding popular search keywords and churn out
content based on those phrases. Thus the Panda update was also called the Farmer
Update :)

Unlike Penguin updates, Google
Panda updates are more frequent. It is estimated that the algorithm has seen
close to thirty refreshes. As mentioned above, quality of content is the main
focus of these updates. However, quality is not confined to just the
written quality of content. Some of the major factors that could
determine the quality of a website are:

Quality of written content

Are you an authority on the
topic? Is your content original? Will a reader benefit from the written
content you have put on your website? For example, this is of utmost
importance for e-commerce sites, where there is a tendency to use minimal,
duplicate or manufacturer wording as product descriptions. These are the factors
that determine the quality of written content.

Duplicate content

If you are using the same
content on different pages of your site (even with minimal wording
changes), you risk being affected by Panda. One place where
this happens unexpectedly is blogs. For example, if you use both tags
and categories in your blog, the resulting pages could be considered duplicates. But don’t
worry, techniques like using noindex tags will help overcome this issue.
Another method is to use canonical tags to tell Google which page to
index. Duplicate content is not limited to your own website; it can be
duplicated outside it too, for example on micro-sites or your
own blogs where you tend to either copy and paste the same content or make minimal
changes. Here is a good Moz article that describes various duplicate possibilities in
the eyes of Google.
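
To get a rough feel for how close two pieces of text are, one textbook approach is to compare overlapping word sequences (shingles) using Jaccard similarity. The sketch below is only an illustration: the function names, the shingle size of 3 and the sample descriptions are assumptions made for this example, and this is not a description of how Panda itself measures duplication.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets: 0.0 (disjoint) to 1.0 (identical)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical product descriptions, invented for this example.
original = "Our red cotton shirt is soft, durable and easy to wash."
tweaked = "Our blue cotton shirt is soft, durable and easy to wash."
unrelated = "Panda updates target scraper sites and content farms."

print(jaccard_similarity(original, original))
print(jaccard_similarity(original, tweaked))
print(jaccard_similarity(original, unrelated))
```

A score near 1.0 for two of your own pages suggests the wording changes are too small to make them look like distinct content; what threshold actually matters to a search engine is not something this sketch can tell you.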

Let’s take an example to
better understand the above two points. Consider the book
Half Girlfriend on Amazon or Flipkart.
As you can see, the product description is different on each site, and both are
different from the description Chetan Bhagat gave
on his website. Also, we can reach the product page on Amazon or Flipkart through
multiple navigational paths or URLs (like category, search, direct link, etc.), but
Google indexes just the main product page.
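
As a minimal sketch of the canonical tag idea, the snippet below collapses several URL variants of one page into a single canonical URL and prints the `<link rel="canonical">` element that would go in the page's head. It assumes the variants differ only in their query string or fragment; the `shop.example.com` URLs and the function names are hypothetical, and real stores may need a more involved mapping.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url):
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, "", ""))


def canonical_link_tag(url):
    """Build the <link rel="canonical"> element for the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="{}">'.format(href)


# Hypothetical URL variants that all show the same product page.
variants = [
    "https://shop.example.com/books/half-girlfriend?ref=category",
    "https://shop.example.com/books/half-girlfriend?q=chetan+bhagat&src=search",
    "https://shop.example.com/books/half-girlfriend#reviews",
]

for url in variants:
    print(canonical_link_tag(url))
```

All three variants produce the same tag, which is the point: whichever path the visitor took, the page names one preferred URL. For pages you would rather keep out of the index altogether, such as tag archives on a blog, the corresponding element is `<meta name="robots" content="noindex">`.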

Ad to content ratio

One of the elements used by the
algorithm to test quality is the number of ads shown. While there is no
optimal number or ratio, it’s best to be judicious when selecting the number of ads
shown; keeping ads to a
minimum always keeps you safe. The algorithm is believed to have matured enough
to judge acceptable levels of duplication and ad to content ratio.

Technical aspects

Technical aspects that
differentiate the quality of a website include performance metrics such as
page loading time, UX aspects like ease of navigation and design, and
factors like proper redirects, 404 pages and so on.

While the overall focus of the Panda
updates was on the quality of websites, the third version of the Panda updates
(specifically 3.3 onward) also focused on a few other aspects. One improvement
was around local search results: the update concentrated on
showing results that are relevant in terms of the proximity of the user to the content.
Another improvement was related to link evaluation (using link
characteristics like wording, number of similar links, etc. to understand the content
of the linked page). This was taken to an altogether different level with the
Penguin updates. Freshness of the website was also considered in this
update.

Google Panda updates
starting with 3.0 were so regular that it became difficult to focus
on each update specifically. Industry experts even opined that the dead Google Dance was back with these
regular refreshes. The Panda refreshes are now tracked by consecutive
numbers rather than by a versioning system. From late 2013, Google
started to fold Panda updates into its indexing
algorithm, making them part of the normal algorithm changes Google makes
throughout the year. Panda 4.0 was released in May this year. This update
specifically improved how Google views the authority and originality of
articles on the web. Here is a good case study on this from Search Engine Land. Another effect
of Panda 4.0 was on how PR websites are ranked. Many PR websites were
affected and had to redefine their publishing guidelines. The latest one was
4.1, the 27th Panda update, in September
2014, which is expected to help small websites with original content. Websites
that have taken recovery actions are also expected to benefit from this update.

One thing we can understand by
studying these updates and their history is that all SEO practices
revolve around these updates (like link building, content marketing and so on…)