Making AJAX Applications Crawlable using PHP and jQuery.

Unfortunately, there is not much documentation on the web that pertains to this (none that I could find, at least). So here's a quick PHP tutorial on how to go about this. I'll try to be as concise as possible.

Problem: Googlebot is unable to crawl content which is loaded via AJAX into a webpage.

Solution: Use the hash fragment (#!) as a signal to Googlebot to request an alternate URL (the "ugly URL"), which returns a "snapshot" of the webpage, including the AJAX-loaded content, to be crawled.

Definitions:

Pretty URLs are the URLs seen on the client side, which load dynamic content. A pretty URL contains #! followed by an identifier/query string that tells the webpage what content to load.
For example, http://www.example.com/index.php#!content1.

Ugly URLs are the URLs that, when navigated to, display content that was pre-loaded from the server. These URLs, instead of having #! as the delimiter, have _escaped_fragment_=. For example, a pretty URL of http://www.example.com/index.php#!content1 would look like this to Googlebot: http://www.example.com/index.php?_escaped_fragment_=content1.
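The pretty-to-ugly mapping can be sketched as a small function (prettyToUgly is my own name for illustration, not part of the scheme):

Code:

```javascript
// Illustrative only: how Googlebot rewrites a pretty URL into its
// ugly equivalent before requesting it from the server.
function prettyToUgly(url) {
  var i = url.indexOf("#!");
  if (i === -1) return url; // no hash fragment, nothing to rewrite
  var base = url.slice(0, i);
  var fragment = url.slice(i + 2);
  // Append with ? normally, or & if the URL already has a query string.
  var sep = base.indexOf("?") === -1 ? "?" : "&";
  return base + sep + "_escaped_fragment_=" + encodeURIComponent(fragment);
}
```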

1. Tell Googlebot that the page is crawlable via hash fragments. In the head of your page, insert this:

Code:

<meta name="fragment" content="!" />

2. If you are loading your AJAX content as soon as the page loads (which you presumably are), you'll want to read the hash fragment once the document is ready. Here's an example of how it's done:
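The code block for this step seems to have dropped out of the post, so here's a sketch reconstructed from the description that follows. findHash, data.php, and the div with ID content are the tutorial's own names; the getFragment helper and the environment guard are mine. It assumes jQuery is loaded on the page.

Code:

```javascript
// Turn "#!content1" into "content1"; anything without the #! prefix
// yields an empty fragment.
function getFragment(hash) {
  return hash.indexOf("#!") === 0 ? hash.slice(2) : "";
}

if (typeof window !== "undefined" && typeof jQuery !== "undefined") {
  function findHash() {
    var fragment = getFragment(window.location.hash);
    // Request the ugly URL and place the returned markup in #content.
    jQuery("#content").load(
      "data.php?_escaped_fragment_=" + encodeURIComponent(fragment)
    );
  }
  jQuery(window).on("hashchange", findHash); // re-run when the hash changes
  jQuery(findHash);                          // run once the document is ready
}
```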

findHash() reads the hash from the URL, strips the pretty-URL delimiter (#!), and requests another page, which we name data.php, passing the stripped fragment as the value of the _escaped_fragment_= query string (the ugly URL). The browser fires a "hashchange" event whenever the hash changes, so we attach an event listener that runs findHash() again. The data returned by the AJAX request is displayed inside a div that we give the ID of content.
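For completeness, here's a rough sketch of the lookup data.php performs with the ugly URL. In PHP you would read $_GET['_escaped_fragment_'] and echo the matching markup; the logic is shown in JavaScript to match the other examples, and the content table is made up for illustration:

Code:

```javascript
// Hypothetical content table: fragment identifier -> snapshot markup.
var pages = {
  content1: "<h1>Content 1</h1><p>First AJAX-loaded section.</p>",
  content2: "<h1>Content 2</h1><p>Second AJAX-loaded section.</p>"
};

function renderFragment(fragment) {
  // Unknown fragments fall back to a default page rather than an
  // error, so Googlebot always receives a valid snapshot.
  return pages.hasOwnProperty(fragment) ? pages[fragment] : pages.content1;
}
```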

And now when the page is requested by Googlebot, it will replace the hash fragment with _escaped_fragment_, request the ugly URL, and thus be able to read the AJAX-created pages.

Here are some tips to remember:

When _escaped_fragment_ is activated, you only need a "snapshot" of the page. This means that you really only need working links, text, and the same appearance of the webpage. Everything else isn't necessary, since Googlebot won't be doing much except looking at the page.

jQuery is your friend. It will make writing everything 10x faster, especially for this.

I'm not entirely sure whether other search engines support this, but I think it's quite probable (if they haven't adopted it already). I know that large websites such as Twitter and Facebook use this technique. And the ability for Google to crawl AJAX-created webpages is reason enough on its own to build AJAX applications this way.

If you do find out whether other search engines adopt this technique, please post it here.