Google's Response On Browser Specific Cloaking

Matt Cutts recently posted a detailed video on cloaking. It covered serving different content based on geolocation and to mobile users, but it did not address serving different content or pages depending on the browser requesting the page.

One webmaster/developer asked in the Google Webmaster Help forums whether browser-specific cloaking is allowed.

Google's Matt Cutts provided a rare response in the forum. The short version of his answer: "the easiest explanation is to serve Googlebot a safe version of your page that would work in any browser that you support with the expectation that someone viewing the cached version of a page could be using any browser."

He also gave a much longer answer, and I figured most of you have not seen it, so here it is in full:

In order to answer this question well, I need to talk a little bit about cloaking. The short definition of cloaking is showing different content to Googlebot than to users. Historically, Google has taken a very strong stance against cloaking because we believe it's a bad user experience. Google wants to fetch and judge the same page that users will see if they click on a site in search results. We also made a longer video if you’d like to learn more: https://www.youtube.com/watch?v=QHtnfOgp65Q .

Deceptive or malicious cloaking would be showing Googlebot a page of text about Disney cartoons while showing users pornography, for example. Typically to help site owners avoid cloaking, we recommend showing Googlebot the identical content that a site's typical desktop web browser would see. For example, if the most common web browser to a site is IE7, then provide Googlebot with the exact same page that IE7 would get.

That advice is less helpful for companies that provide performance optimizations at a granularity that varies from browser to browser. For example, Chrome supports things like data URIs (to provide inline data in web pages) or WebP images that other browsers don’t support. So websites could return those sorts of things for visitors surfing with Chrome. Then the question naturally emerges about how to treat Googlebot?

The main litmus test for cloaking is whether you are doing something special or different for Googlebot that you're not doing for other visitors or users. In the vast majority of typical cases, it's preferred to treat Googlebot like a typical not-too-cutting edge (think IE7, for example) browser. However, in the very specific case where you're offering specific performance improvements with browser-level granularity, it can be okay to optimize the page based on the user agent where Googlebot is just another user agent and you take into account the capabilities of the Googlebot "browser" in the same way that you do with other browsers. For example, one specific example would be to provide data URIs to Chrome browsers (which does support data URIs) but not to Googlebot (which currently doesn’t support data URIs).
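(An aside from me, not part of Matt's reply.) For anyone unfamiliar with data URIs, here is a minimal Python sketch of the kind of optimization he is describing: the same image is either referenced by URL or inlined as a base64 data URI. The function names and the `/logo.png` path are made up for illustration, and whether to inline is passed in as a flag rather than decided here.

```python
import base64


def to_data_uri(payload: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URI (data:<mime>;base64,<data>)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_tag(image_bytes: bytes, mime_type: str, fallback_url: str, inline: bool) -> str:
    """Build an <img> tag, inlining the image only when the caller says
    the requesting browser is known to handle data URIs."""
    src = to_data_uri(image_bytes, mime_type) if inline else fallback_url
    return f'<img src="{src}">'


# Hypothetical usage: the bytes are a stand-in, not a real PNG.
print(image_tag(b"hi", "image/png", "/logo.png", inline=False))
print(image_tag(b"hi", "image/png", "/logo.png", inline=True))
```

Either way the visitor ends up with the same image. Only the delivery mechanism changes, which is why this sort of difference sits at the low-risk end.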

The main questions that the webspam team--which is responsible for enforcing our quality guidelines about cloaking--will ask if we get a spam report concerning a page is
- Is the content identical between user agents? (If the answer is yes here, you're fine.)
- If not, how substantial are the differences and what are the reasons for the differences?

If the reason is for spamming, malicious, or deceptive behavior--or even showing different content to users than to Googlebot--then this is high-risk. For example, we already provide ways for servers returning Flash to show the same page to users and to Googlebot, so don't do this for Flash; see http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=72746#1 for more info about Flash.

Remember, I'm talking about very small differences for performance optimization reasons, such as inlining images. The way that I would recommend implementing browser optimizations like this would be the following: rather than specifically targeting Googlebot by UA string, build an affirmative list of browsers that support a given capability and then just treat Googlebot as you would a browser that wasn't on the whitelist.
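(Another aside from me, not from Matt.) The allowlist approach in the paragraph above could look roughly like this in Python. The table of browsers and minimum versions is an illustrative assumption, not a verified compatibility list, and the names `CAPABILITY_ALLOWLIST`, `supports` and `choose_variant` are hypothetical. Note that the code never checks for Googlebot by name: any user agent that does not match the list simply gets the baseline page.

```python
import re

# Hypothetical allowlist: capability -> (user-agent pattern, minimum major version).
# These entries are assumptions for illustration; check real support tables
# before relying on any of them.
CAPABILITY_ALLOWLIST = {
    "data_uri": [(re.compile(r"Chrome/(\d+)"), 1), (re.compile(r"Firefox/(\d+)"), 2)],
    "webp": [(re.compile(r"Chrome/(\d+)"), 9)],
}


def supports(user_agent: str, capability: str) -> bool:
    """Return True only if the user agent matches an allowlist entry."""
    for pattern, min_major in CAPABILITY_ALLOWLIST.get(capability, []):
        match = pattern.search(user_agent)
        if match and int(match.group(1)) >= min_major:
            return True
    return False  # Unknown agents, crawlers included, get the safe default.


def choose_variant(user_agent: str) -> dict:
    """Pick per-request optimizations; the page content itself stays the same."""
    return {
        "inline_images": supports(user_agent, "data_uri"),
        "image_format": "webp" if supports(user_agent, "webp") else "jpeg",
    }


chrome = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.202 Safari/535.1"
googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
ie7 = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"

for ua in (chrome, googlebot, ie7):
    print(choose_variant(ua))
```

With these sample strings, Googlebot and IE7 fall through to the same baseline variant, while Chrome gets the optimized one. If a crawler's user agent ever advertises a browser on the list, it is treated like that browser, which matches the "just another user agent" idea.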

Again, this guidance applies only when you’re delivering really granular performance optimizations based on the capabilities of individual browser types. Anything beyond that quickly becomes high-risk, and it would be a pretty good idea to check with Google before doing anything too radical in this space. The easiest explanation is to serve Googlebot a safe version of your page that would work in any browser that you support with the expectation that someone viewing the cached version of a page could be using any browser.