Demystifying Google’s June & July Algorithm Changes

The biggest SEO story last Friday was Google’s announcement that they would begin incorporating “the number of valid copyright removal notices [they] receive for any given site” into their ranking algorithm.

Then, a few hours later, Google quietly released a list of June and July algorithm changes. Due to the timing of these announcements, this list of changes did not receive a comparable amount of attention in the community.

And that’s where I come in! This post is all about the June and July changes, and hopefully, it will help everyone get up to speed on Google’s quieter announcements.

Now, let’s get to it…

June Changes

Google made 57 changes in June. 29 of these changes have names (e.g., “Bamse”, “Vuvuzela”, etc.), and 28 of them have an ID number (e.g., #82293, #82496, etc.).

The changes are grouped into 17 “codename” categories, and this pretty pie chart breaks down the percentage of changes in each of these categories:

Unfortunately, the codename associations don’t always make sense. For example, there are two codenames for image-related changes (“Image” and “Images”), and a few of the changes appear to be misclassified.

To handle these inconsistencies, I’ve reorganized the changes and slightly modified a few of the codenames. With that in mind, here is a summary of the categories:

Answers

This category contains the most changes, and almost all of those changes deploy improved natural language processing (NLP) for a given query type. Here is a list of query types that have been updated:

Sports – Live results have been added (or updated) for EURO 2012, racing (NASCAR, MotoGP, and IndyCar), and MLB.

Calculator – The calculator can now handle more sophisticated queries.

Finance – The display and recognition of spoken finance searches on mobile devices have been improved.

Miscellaneous – A lot of other random query types (e.g., movie showtimes, flight statuses, etc.) are now being detected and displayed more effectively.

Page Quality

This is the second-largest (and arguably the most important) category. Unfortunately, the category’s change descriptions are also the least helpful.

Specifically, four of the changes in this category have the exact same vague description: “This launch helps you find more high-quality content from trusted sources.” And another change has a very similar description: “This change updates a model we use to help you find high-quality pages with unique content.”

I don’t expect Google to hand over the keys to the castle, but honestly, I can do without five iterations of the same Googlespeak.

Fortunately, this category does contain one interesting revelation. Two of the changes (PandaMay and #82353) explicitly mention data refreshes for the Panda algorithm. That makes sense because Google publicly reported two Panda updates (Panda 3.7 and Panda 3.8) in June (see SEOmoz’s algorithm history for confirmation).

Then, there’s another Panda-specific change (GreenLandII) that is described as follows: “We’ve incorporated new data into the Panda algorithm to better detect high-quality sites and pages.”

Other Ranking Components

Piggybacking on the Bigfoot discussion in the previous category, this category contains two changes that might account for the lower domain diversity observed by Dr. Pete.

Both of the changes (#82541 and NoPathsForClustering) claim to be “one of multiple projects that [Google’s] working on to make [Google’s] system for clustering web results better and simpler.”

Many people would dispute the “better” claim, but at least it’s clear that Google made changes to their clustering algorithm.

Another important change in this category is called ng2, and it provides “a better ordering of top results using a new and improved ranking function for combining several key ranking features.”

Obviously, that description leaves a lot to the imagination, but it definitely sounds important. It could also be responsible for the Bigfoot Update (especially since Dr. Pete’s measuring system focuses on the top 10 results — i.e., “top results” — for various keywords).

Snippets

The changes in this category fall into two groups: improved sitelinks and auto-generated page titles.

One of the sitelinks changes removes “boilerplate text in sitelinks titles,” and the other “improves clustering and ranking of links in the expanded sitelinks.”

The auto-generated title changes are meant to use synonyms to generate accurate titles and select better titles to display. Unfortunately, as Ruth Burr pointed out in “Watch Out for Long Title Tags – An SEOmoz Case Study,” Google has an odd definition of “accurate” and “better” when it comes to titles.

Synonyms

This category is full of synonym-related changes. One change “improves use of query synonyms in ranking.” Another change “improves efficiency by not computing synonyms in certain cases.”

My favorite change “updates [Google’s] synonyms systems to make it less likely [they’ll] return adult content when users aren’t looking for it.” No more inadvertent pr0n results… what a buzzkill!

Images

The changes in this category are all image-related. Whether it’s making images trigger more frequently, making the returned images more topical, or improving the efficiency of image search, this category covers it.

Miscellaneous

This category covers a wide range of changes that don’t fit into any of the categories above, such as spelling system changes and search freshness improvements.

July Changes

Google made only 22 changes in July. 11 of these changes have names (e.g., “yoyo”, “popcorn”, etc.), and 11 of them have an ID number (e.g., #80568, #83166, etc.).

The changes are grouped into 13 “codename” categories. Here’s the relative popularity of each category:

If I could only use one word to describe the July changes, that word would be boring. If I had two words, they would be really boring (you get the idea).

With that in mind, let’s blaze through these changes in one big list:

Snippets – These changes attempt to make the sitelinks more useful and the search result snippets more relevant.

Autocomplete – This category is all about improving autocomplete predictions.

Indexing – These changes improve the efficiency of elements of Google’s indexing infrastructure.

June Repeats – This category includes changes that are very similar to changes that appear in the June list (e.g., improved detection of adult content, improved movie showtimes, improved clustering for web results, etc.).

Miscellaneous – These changes affect various parts of the search experience (e.g., including upcoming events in the Knowledge Graph for city-related searches, fixing a bug in the freshness algorithm, etc.).

Page Quality

This is the only category that deserves its own section in July. However, neither of the changes is particularly mind-blowing. One is a Panda data refresh (JnBamboo), which probably corresponds to Panda 3.9.

The other change in this category (Panda JK) deployed the Panda algorithm to search results in Japan and Korea. It’s unclear what percentage of queries this change impacted.

What Do You Think?

I’d love to hear from you in the comments. Which of these changes had the biggest impact on the search results? How many changes do you think Google omitted from this list? What other observations have you made about the June and July changes?

About The Author

Steve Webb is an SEO audit specialist at Web Gnomes. He received his Ph.D. from Georgia Tech, where he published dozens of articles on Internet-related topics. Professionally, Steve has worked for Google and various Internet startups, and he’s passionate about sharing his knowledge and experiences with others. You can find him on Twitter, Google+, and LinkedIn.