This post begins with a particular dilemma that SEOs have often faced:

websites that use AJAX to load content into the page can be much quicker and provide a better user experience

BUT: these websites can be difficult (or impossible) for Google to crawl, and using AJAX can damage the site's SEO.

Fortunately, Google has made a proposal for how webmasters can get the best of both worlds. I'll provide links to Google documentation later in this post, but it boils down to some relatively simple concepts.

Although Google made this proposal a year ago, I don't feel that it's attracted a great deal of attention - even though it ought to be particularly useful for SEOs. This post is aimed at people who've not explored Google's AJAX crawling proposal yet - I'll try to keep it short, and not too technical!

I'll explain the concepts and show you a famous site where they're already in action. I've also set up my own demo, which includes code that you can download and look at.

The Basics

Essentially, sites following this proposal are required to make two versions of their content available:

Content for JS-enabled users, at an 'AJAX style' URL

Content for the search engines, at a static 'traditional' URL - Google refers to this as an 'HTML snapshot'

Historically, developers had made use of the 'named anchor' part of URLs on AJAX-powered websites (this is the 'hash' symbol, #, and the text following it). For example, take a look at this demo - clicking menu items changes the named anchor and loads the content into the page on the fly. It's great for users, but search engine spiders can't deal with it.
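To make the idea concrete, here's a minimal sketch of that kind of named-anchor navigation. The `/content/` path, the `'home'` default, and the element id are all made up for illustration - they're not from the demo itself:

```javascript
// Map a location hash to the content URL to fetch.
// '/content/' and the 'home' fallback are hypothetical.
function contentUrlForHash(hash) {
  // '#hotels' -> '/content/hotels'; empty hash -> default section
  var section = hash.replace(/^#/, '') || 'home';
  return '/content/' + section;
}

// In the browser, you would wire it up roughly like this:
// window.addEventListener('hashchange', function () {
//   fetch(contentUrlForHash(window.location.hash))
//     .then(function (res) { return res.text(); })
//     .then(function (html) {
//       document.getElementById('main').innerHTML = html;
//     });
// });
```

Because everything after the # never reaches the server, a spider requesting the URL only ever sees the empty shell page - which is exactly the problem.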

Rather than using a hash, #, the new proposal requires using a hash and an exclamation point: #!

The #! combination has occasionally been called a 'hashbang' by people geekier than me; I like the sound of that term, so I'm going to stick with it.

Hashbang Wallop: The AJAX Crawling Protocol

As soon as you use the hashbang in a URL, Google will spot that you're following their protocol, and interpret your URLs in a special way - they'll take everything after the hashbang, and pass it to the site as a URL parameter instead. The name they use for the parameter is: _escaped_fragment_

Google will then rewrite the URL, and request content from that static page. To show what the rewritten URLs look like, here are some examples:

As long as you can get the static page (the _escaped_fragment_ URL) to display the same content that a user would see at the hashbang URL, then it works just as planned.
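The rewriting Google performs can be sketched as a small function. Note this is an approximation of their behaviour: Google's spec also percent-encodes certain reserved characters in the fragment, a detail omitted here for clarity, and the demo.com URL in the usage note is hypothetical:

```javascript
// Sketch of how Google maps a hashbang URL to the '_escaped_fragment_'
// form it actually requests from your server (simplified: the real spec
// also percent-encodes some reserved characters in the fragment).
function toEscapedFragmentUrl(url) {
  var parts = url.split('#!');
  if (parts.length < 2) return url;              // no hashbang: nothing to rewrite
  var base = parts[0];
  var fragment = parts.slice(1).join('#!');
  var sep = base.indexOf('?') === -1 ? '?' : '&';
  return base + sep + '_escaped_fragment_=' + fragment;
}
```

So `toEscapedFragmentUrl('http://www.demo.com/directory#!seattle/hotels')` yields `http://www.demo.com/directory?_escaped_fragment_=seattle/hotels`, which is the request your server must answer with the full content.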

Two Suggestions about Static URLs

For now, it seems that Google is returning static URLs in its index - this makes sense, since they don't want to damage a non-JS user's experience by sending them to a page that requires JavaScript. For that reason, sites may want to add some JavaScript that will detect JS-enabled users, and take them to the 'enhanced' AJAX version of the page they've landed on.
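That detection can be as simple as a script on the static page - if it runs at all, the browser has JavaScript. A minimal sketch, assuming a hypothetical `/directory/...` URL scheme (not from the original post):

```javascript
// Map a static path to its hashbang equivalent; returns null when the
// path isn't one we have an AJAX version for. The '/directory/' scheme
// is an assumption for illustration.
function toAjaxUrl(path) {
  var match = path.match(/^\/directory\/(.+)$/);
  return match ? '/directory#!' + match[1] : null;
}

// On the static page itself you would then do something like:
// var target = toAjaxUrl(window.location.pathname);
// if (target) window.location.replace(target);
```

Non-JS users (and spiders) never execute the script, so they stay on the static page - which is exactly the behaviour you want.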

In addition, you probably don't want your indexed URLs to show up in the SERPs with the '_escaped_fragment_' parameter in them. This can easily be avoided by having your 'static version' pages at more attractive URLs, and using 301 redirects to guide the spiders from the _escaped_fragment_ version to the more attractive example.

E.G.: In my first example above, the site may choose to implement a 301 redirect from www.demo.com?_escaped_fragment_=seattle/hotels to www.demo.com/directory/seattle/hotels
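Server-side, that redirect decision is a one-liner. Here's a framework-agnostic sketch (the post's own demo used PHP; the demo.com-style paths here are hypothetical):

```javascript
// Given the parsed query-string of an incoming request, decide where
// (if anywhere) to 301 the crawler. The '/directory/' prefix mirrors
// the hypothetical example above.
function redirectTarget(query) {
  var fragment = query['_escaped_fragment_'];
  if (fragment === undefined) return null;       // normal request: no redirect
  return { status: 301, location: '/directory/' + fragment };
}
```

Your server would then send the `status` and `Location` header when `redirectTarget` returns a non-null value, and serve the page normally otherwise.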

A Live Example

Fortunately for us, there's a great demonstration of this proposal already in place on a pretty big website: the new version of Twitter.

If you're a Twitter user, logged in, and have JavaScript, you'll be able to see my profile here:

Another Example, With Freely Downloadable Code

Feel free to have a play and see how that page behaves. If you'd like to see how it's implemented from a 'backend' perspective, hit the download link on that page to grab the PHP code I used. (N.B.: I'm not a developer; if anyone spots any glaring errors, please feel free to let me know so I can correct them!)

More Examples, Further Reading

The Google Web Toolkit showcase adheres to this proposal; experimenting with removing the hashbang is left as an exercise for the reader.

The best place to begin further reading on this topic is definitely Google's own help pages. They give information about how sites should work to fit with this proposal, and have some interesting implementation advice, such as using server-side DOM manipulation to create the snapshot (though I think their focus on this 'headless browser' may well have put people off implementing it sooner).

30 Comments

Great explanation using the Twitter example, Rob. I still think AJAX crawling is flawed, as the bot and the browser are treated differently. I wrote a YOUmoz post back in May with some of my thoughts: http://www.seomoz.org/ugc/exploring-googles-ajax-crawling

From a usability perspective, developers should be building content that is accessible to both AJAX and non-JavaScript users. It does add to development time, but you gain the ability to enhance the website for those who are running with JavaScript turned on.

The only time you should potentially lean towards having things run in AJAX only is when you control the users - such as if you are building a control panel for staff at a business, where you can say "you must have JavaScript turned on".

Advocating anything to do with content in AJAX from an SEO perspective is reckless if you don't also point out the fact that this is still a bad practice. Just because Google can now supposedly index content inside AJAX does not mitigate, in the least, the several flaws with it. Let's separate out the pure usability issues and focus exclusively on the SEO aspects.

Google Only

First, just because Google can, does not mean any other search engines can. So right away, you're excluding all those people who come from other engines.

Topic Optimization Limits

Most AJAX content is presented on a page that has multiple AJAX links, typically tabs. You can NOT properly optimize a page for all the varying content displayed inside AJAX tabs. The topical focus of that page becomes trash.

Designer /Developer Free-For-All

As soon as you tell designers or developers Google can index AJAX by this new method, they completely go off the deep end and overload what should be mission-critical content into AJAX displays, thus further killing SEO.

The Bottom Line

The bottom line, if you care about proper SEO best practices, is to NOT use the Google protocol for indexing AJAX content.

Well, if FB and Twitter are adopting it, then it can't be too reckless.

My understanding was that no matter what you do the other search engines won't pick up AJAX, so any search engine creating strategies to crawl & index AJAX is a breakthrough. I will be the first to admit I am not a developer so my understanding may be elementary at best.

Actually, if Twitter and FB choose to adopt it, it just means they can. They don't care about real SEO; neither site ever has. And in fact, because they do, people who don't know better then think exactly what you just stated. So it proves it's reckless.

AJAX and SEO were a big problem a few years ago! We had doubts that it would work for SEO on our shop site (main keyword is "tepisi") http://www.aloser.rs, but Google crawls it well. We will continue using AJAX with SEO in mind.

I have created a B2B membership community site and use the Google Maps API, which presents our membership data via AJAX. My developer is creating a sitemap and has just created HTML pages containing the same data so that it can be crawled (these will be dynamically generated with membership registration). Now he is using ISAPI Rewrite to make the URLs friendly, and then will make the redirect work so the crawling can begin. Is this the correct approach? Secondly, when a visitor clicks to view a member's data, there is an on-command in the map view which contains the member content, and I want this to count as a pageview, but no one will agree that this can be accomplished. I need a believer to explain how this could be done!

Hi. After reading your post I thought I had found the solution I needed. I did a little test on a site that simulates Googlebot's view. Unfortunately, it seems that this method doesn't work - if that simulation is trustworthy. The site I tested with is www.smart-it-consulting [dot] com/internet/google/googlebot-spoofer/index.htm
I hope someone can confirm that I'm wrong. Thanks.

Hello, I have a doubt after reading this post and after testing it. The demo shown in this post shows how to present content to users (#!) and how to present content to search engines (_escaped_fragment_); however, in both situations the meta tags remain the same. How can we solve this so that search engines display different titles in their results? Thank you for your help!

I have this site written in AJAX. Recently we started to use the hashbang so Google can crawl it. I'd like to ask you if my developer did it right. Sorry for the URLs, but I had to use a live example. We can get rid of them later.

1. Main page

http://dyskonti.pl is visible to G. at http://dyskonti.pl/#! (and is being indexed correctly), but on http://dyskonti.pl/#! I can see different things than on plain http://dyskonti.pl - isn't it some kind of cloaking?

2. Categories

http://dyskonti.pl/#!dom-51 should be visible in the SERPs as http://dyskonti.pl/dom-51 (with its own title and description - URLs similar to Twitter's)

So I go to http://dyskonti.pl/dom-51 and it redirects me to http://dyskonti.pl/#!dom-51, and everything seems to be correct. In the SERPs the URL is also OK - http://dyskonti.pl/dom-51. But when I check the headers for http://dyskonti.pl/dom-51, it says 302. Shouldn't it rather be 301? Inside the #! version of dom-51, G. can see a canonical tag pointing to http://dyskonti.pl/dom-51. We also use that clean URL in sitemap.xml.

Wow, it's complicated.

In the G. SERPs it looks like everything is OK, but I am concerned whether we did all we could to improve SEO.

Thank you Rob, finally I found the problem solver. I'm not an expert in programming - what about a URL that has more than one parameter? E.g. index.php?animal=cat&fruit=banana, where I want to split the values into different div elements.

First, you should understand that AJAX crawling is needed only if you have dynamic parts of the page that are not represented in the basic HTML that is loaded at the URL.

If, for example, you have buttons on your site that let users choose 'cat' and 'banana', and the content changes without a round trip to the server (meaning no reloading of the page), it means you have AJAX content on your site.

In that case the URL a user sees on the page will change to something like index.php#!animal=cat once the user clicks the cat button, and to index.php#!animal=cat&fruit=banana after the banana button.

This link should be included in the actual HTML of the page, and the bot will then request your site with the URL index.php?_escaped_fragment_=animal=cat&fruit=banana

BTW - Google says they will not show the _escaped_fragment_ version in the search results; the clean URL with the #! will be used instead, so I think there is no need for redirects.
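The two-URL mapping described in this comment can be sketched in a few lines. This is a simplified illustration using the comment's own parameter names; Google's spec additionally percent-encodes reserved characters such as '&' in the fragment value, which is skipped here:

```javascript
// Build the user-facing hashbang URL and the crawler-facing
// _escaped_fragment_ URL for a set of UI selections. Simplified:
// reserved characters in the fragment are left unescaped.
function buildUrls(page, params) {
  var fragment = Object.keys(params)
    .map(function (k) { return k + '=' + params[k]; })
    .join('&');
  return {
    user: page + '#!' + fragment,                      // what the visitor sees
    crawler: page + '?_escaped_fragment_=' + fragment  // what the bot requests
  };
}
```

For `buildUrls('index.php', {animal: 'cat', fruit: 'banana'})` the user URL is `index.php#!animal=cat&fruit=banana` and the crawler URL is `index.php?_escaped_fragment_=animal=cat&fruit=banana`.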

I kind of like how the hashbang ("#!") is called "shebang" in Unix. It lends itself to Ricky Martin or William Hung jokes.

It doesn't look like Facebook's #! implementation is actually serving anything for the _escaped_fragment_? The Twitter 301 is nice - I always wondered whether doing that or a rel=canonical would be better for this stuff? It seems like either way you're leaking some Google juice?

I was actually digging into some tactics for tracking AJAX content, basically required for form-process tracking. I then implemented virtual pageviews, which let you see each AJAX sub-page's performance.

This also involves 301 redirects, so I guess it can also be used for goal tracking?

I was thinking of just detecting either the IP or the user-agent of the common search engines and creating a special page that would display, at the same URL, all the content that normal visitors access through AJAX. The article will be broken into sub-sections accessed through AJAX with little to no scrolling. The website is Right2Say, and I plan to actively PR it starting December 2010. Would it be smart to use PHP logic to direct Google and the like to a page showing all the content at the same URL?

I think the guys over at the Internet Marketing podcast covered this a week or two back. The way I understood it, the page could still be indexed even without the "hashbang", but would strictly be "as is". With the '#!' you are able to tell Google you want it indexed, correct?

So here's a question: what if your content is completely crawlable, but you are using AJAX for seamless page transitions (i.e. the page doesn't refresh)? The site is using the hash, #, to let users use the back button and bookmark content, but really only part of the page is AJAX. However, the URLs in our browser bar are the hash values, http://mysite.com/#/custom-url-string. But we also have the SEO-friendly URL we get link credit for, http://mysite.com/custom-url-string.aspx. If we start using the hashbang, http://mysite.com/#!/custom-url-string, we are then going to have a split in link equity.