Make Google Ignore JSESSIONID

Search engines like Google will often index content with params like JSESSIONID and other session or conversation scope params. This causes two problems. First, the links returned in the Google search results can have these parameters in them, resulting in “session not found” or other incompatible session state issues. Second, it can cause a single page of content to be indexed multiple times (with differing parameters), thus diluting your page’s rank.

Now there’s an even better way to handle this. Google has added an amazing new feature to their Webmaster Tools which allows you to specify how the GoogleBot indexer should handle various parameters. You can ignore certain parameters such as JSESSIONID, cid, and others, and also specifically not ignore other parameters such as productId, skuId, etc…

Log in to your Google Webmaster Tools and select the site you wish to work with. Under “Site Configuration” -> “Settings” there is a new section at the bottom called “Parameter handling”. Click on “adjust parameter settings” to expand the parameter handling configuration for your site. Sometimes Google will suggest parameters it has discovered while crawling your site; otherwise you just enter the parameters you want Google to ignore or pay attention to.

[Screenshot: Google Webmaster Tools Parameter Handling Interface]

This is a much more elegant solution to the JSESSIONID problem, and it also lets you correctly handle other parameters your site may use for session state or dynamic content generation. The only downside is that it affects only Google, whereas with the correct configuration my two older solutions can handle any search engine bot. Other search providers may offer, or come to offer, a similar feature.

Very good point, and thanks for bringing it up. Canonical tagging is very powerful and I’m hoping to see it used more and more often. It’s still “relatively” new to the scene and I haven’t seen a ton of sites making use of it. I’m glad to see you have adopted it and found it useful.

However, at least for many ATG-based sites, a page may take in a large number of content-altering parameters (productId, skuId, reviewsPaginationIndex, etc…), so it may be much simpler to just ignore the “bad” parameters rather than write the code to dynamically construct the correct canonical link. It would be pretty easy to make a droplet which has a list of “bad” params and generates the canonical link just by “cleaning” the requested URI. Not perfect, but simpler to implement.
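As a rough illustration of that “cleaning” idea, here is a minimal sketch in plain Java. It is not an actual ATG droplet: the class name CanonicalUrlCleaner, the clean method, and the contents of the “bad” list (jsessionid and cid) are all assumptions made up for this example. It strips the ;jsessionid=... path parameter that servlet containers append when rewriting URLs, and drops any query parameter whose name is on the list, leaving everything else in its original order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not part of ATG or any Google API.
final class CanonicalUrlCleaner {

    // Assumed list of "bad" parameters (lowercase); adjust it for your own site.
    private static final Set<String> BAD_PARAMS =
            new HashSet<>(Arrays.asList("jsessionid", "cid"));

    private CanonicalUrlCleaner() {}

    // Returns the requested URI with session-style parameters removed.
    // Fragments (#...) are not handled, since they never reach the server.
    static String clean(String uri) {
        String path = uri;
        String query = null;
        int q = uri.indexOf('?');
        if (q >= 0) {
            path = uri.substring(0, q);
            query = uri.substring(q + 1);
        }

        // URL rewriting appends the session id as a path parameter,
        // e.g. /product.jsp;jsessionid=ABC123
        path = path.replaceAll("(?i);jsessionid=[^;/?#]*", "");

        if (query == null || query.isEmpty()) {
            return path;
        }

        // Keep every query parameter whose name is not on the "bad" list.
        List<String> kept = new ArrayList<>();
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq >= 0 ? pair.substring(0, eq) : pair;
            if (!BAD_PARAMS.contains(name.toLowerCase(Locale.ROOT))) {
                kept.add(pair);
            }
        }
        return kept.isEmpty() ? path : path + "?" + String.join("&", kept);
    }
}
```

With this sketch, CanonicalUrlCleaner.clean("/browse/product.jsp;jsessionid=ABC123?productId=42&cid=7&skuId=9") returns "/browse/product.jsp?productId=42&skuId=9". A real droplet would still need to prepend the scheme and host before writing the result into the canonical link.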

Is the issue that canonical tagging didn’t prevent Google from indexing pages with JSESSIONID on them, or that you already had URLs like that in the Google index, and switching to canonical now doesn’t remove them?