Google’s Caffeine Update: Better Indexing & Fresher Search Results

On August 10, 2009, Google announced Caffeine – which would become one of the most important updates in the search engine’s history.

The Caffeine Update was so massive that Google provided months of what it called a “Developer Preview”. In other words, so much was at stake that Google gave SEO professionals and developers early access so we could report any issues we found.

What Exactly Was the Google Caffeine Update?

This new system allowed Google to crawl and store data far more efficiently.

In fact, by Google’s own account, they were able not only to increase the size of their index but also to provide far fresher results (50 percent fresher by their estimates).

So how did this work?

In their old indexing system, pages and content types were sorted into categories based on how fresh Google perceived they needed to be (at least as far as this update is concerned).

Different crawlers were sent out, some looking for changes and others reindexing changed pages – all based on the classification of the content in question.

If a site was in the fresh category, it was crawled by different bots that would add its content to the index quickly. For most sites, though, content would be reindexed only every couple of weeks.

Of course, this set up a scenario where important, fresh content could be missing from the index simply because of how a site was classified.

With Caffeine, Google gained the capability to crawl, collect data, and add it to their index in seconds, meaning far fresher information was available across a wider range of sites.
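To make the difference concrete, here is a rough sketch in Python of how long a changed page might wait before showing up in the index under each approach. This is a toy model, not Google’s actual system: the tier names, the recrawl intervals, the five-second latency, and both function names are assumptions made up for illustration.

```python
DAY = 86_400  # seconds in a day

# Hypothetical recrawl intervals per category. The only figures taken from
# the article are "quickly" for fresh sites and "every couple of weeks"
# for most others; the exact numbers here are assumptions.
TIER_INTERVALS = {"fresh": 60, "regular": 14 * DAY}


def tiered_visible_at(changed_at: int, tier: str) -> int:
    """Old-style model: a change waits for its tier's next scheduled pass."""
    interval = TIER_INTERVALS[tier]
    passes_needed = -(-changed_at // interval)  # integer ceiling division
    return passes_needed * interval


def incremental_visible_at(changed_at: int, latency: int = 5) -> int:
    """Caffeine-style model: each change is indexed shortly after it occurs."""
    return changed_at + latency


# A page changes three days after the last scheduled pass for its tier.
changed_at = 3 * DAY
for tier in TIER_INTERVALS:
    old_wait = tiered_visible_at(changed_at, tier) - changed_at
    new_wait = incremental_visible_at(changed_at) - changed_at
    print(f"{tier}: tiered wait {old_wait}s, incremental wait {new_wait}s")
```

In this toy model, the page in the regular category waits 11 days for its next scheduled pass, while the incremental approach makes the same change visible within seconds regardless of category.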

Further, it was built with an understanding of the growth ahead and of how changing devices and media types could affect the resources needed. (Remember this when you read the piece here on Search Engine Journal by Beau Pedraza on the Hummingbird Update, which followed in 2013.)

Why Google Launched Caffeine

Caffeine wasn’t an algorithm update; in fact, it wasn’t an attempt by Google to impact rankings at all.

No, Caffeine was a complete rebuild of their indexing system.

To understand the reason for this, one simply needs to look at the changing web.

The Internet in 1998, when the initial Google indexing system was designed, was just a bit different from what it was in 2009.

When the index was initially built out, there were 2.4 million websites and 188 million people on the internet worldwide.

By 2009, there were roughly 100 times as many websites, at 238 million, and almost 1.8 billion people trying to reach them, with no end in sight to the growth in either.

Add to this the significant changes in the types of media needing indexing, with video use skyrocketing and images, maps, and other data added to the mix.

The old index just wouldn’t cut it.

You can think about it as you would your kitchen cupboards.

On a daily basis, you can take things out and put things in, and everything works just fine.

But what happens when your partner moves in with you and you have a couple of kids (or 100, as would be the case in this metaphor for Google)?

And let’s also add to that an influx into the house of new types of foods.

It might be time not just to rearrange the products but to completely rebuild the shelves.

That’s what Google did with Caffeine.

Who Was Impacted By Caffeine?

Unlike a typical Google update, Caffeine didn’t have a negative impact on specific sites. However, some sites did see a drop in rankings and/or organic traffic (is it just me or does that sound like something Google themselves would say?).

What I mean by this is that on launch, Google gained the capacity to crawl faster and produce fresher results from a larger index. Sites that covered new stories quickly were rewarded.

Under the old indexing system, sites had this advantage only if they were in the fresh category; now anyone could take advantage of this speed.

This could be seen as a hit against those who were in the fresh index previously; personally, I saw it as a leveling of the playing field.

Rumors About Freshness

One thing this update had in common with almost every other update in Google’s history is that rumors spread quickly about how to optimize for it.

Because the update itself revolved around Google’s ability to index content and content changes rapidly, there were all sorts of blog posts and articles claiming that updating content frequently yielded an SEO advantage, and that simply creating new content was a signal as well.

This update had nothing to do with adjusting ranking signals.

If fresher content yielded better rankings, this was due to other algorithms, not Caffeine.

It may have been made possible by the Caffeine Update, but the update itself didn’t impact results in this way.

Impact On Search

While the Caffeine Update had little direct impact on rankings, aside from allowing for faster indexing of new content, it set the stage for some massive changes to come.

The pre-Caffeine index could not keep up with the ~1.3 billion websites on the internet today, nor could it deal with the variety of devices, data formats, and query input types we now take for granted.

If you enjoy voice search, RankBrain, and the wide variety of search types from video to robust and varied news – you have Caffeine to thank.

Remember this as you continue reading the variety of articles that we will be publishing on SEJ over the coming days on the algorithms that followed. Most (if not all) wouldn’t be possible without this major change, which most of us wouldn’t have noticed if not for the announcements.