SEO Techniques for Large Sites

Large, complex websites often face a challenging scenario: getting product-level (or item-level) URLs to rank. This problem is most acute on enterprise-level websites with hundreds of thousands of unique product SKUs. Good examples of sites doing this well include Amazon, Zappos, and Target on the e-commerce side, and Wolters-Kluwer (LWW.com), Trulia, and Wikipedia (but it doesn't really count, does it?) on the information side.

News sites such as The New York Times are altogether a different animal. QDF scenarios demand unique strategies.

Throughout this article I'll refer to "items" and "products" interchangeably. While there absolutely are differences between the tactics a WebMD or Trulia will use and those of an Amazon or Zappos, the ultimate strategy of ranking individual listings or products is inherently similar. So, in the interest of keeping things simple, I'll mix terminology rather liberally.

Why It's So Dang Hard

It seems like most large websites have acute pain (sometimes terminal) at the product level.

URL structure often presents challenges for large sites, especially the deeper into the architecture one navigates. While major categories are often static and canonical, products and items are usually mutable, expiring, revolving URLs, often appended with session information. Sites that handle this well, such as Amazon, use a wide variety of techniques. In the case of Amazon, canonical standards for product URLs are retained by means of cloaking: users are given URLs with session data in the query string, and spiders such as Googlebot and Bingbot are given base URLs only.

There are other ways to accomplish the same thing: the rel canonical link element, plus URL normalization and parameter handling in the Bing and Google webmaster toolsets, respectively. Almost without exception, however, there is opportunity to improve product-level canonicalization signals on enterprise sites, even for massive brands.
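For teams that would rather normalize URLs than serve different URLs to spiders, here is a minimal Python sketch of the idea: strip session and tracking parameters to get one base URL per product, then point rel canonical at it. The parameter names in SESSION_PARAMS and the example.com URLs are assumptions for illustration only; swap in whatever your own platform appends.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; replace with the session and tracking
# parameters your own platform actually appends to product URLs.
SESSION_PARAMS = {"sessionid", "sid", "ref", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url: str) -> str:
    """Return the URL without session/tracking parameters or fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    kept.sort()  # fixed parameter order so equivalent URLs compare equal
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def canonical_link_tag(url: str) -> str:
    """Build the rel canonical link element for a product page's <head>."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


if __name__ == "__main__":
    print(canonical_link_tag("https://Example.com/shoes/red-sneaker?sid=abc123&color=red"))
```

The same function can be run over a crawl export to count how many distinct URLs collapse to each canonical target, which is a quick way to size the problem.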

Every large site struggles with canonicalization, even Apple.

However, even with strong canonicalization of item-level URLs, problems can surface. Pagination is a common struggle. While there are certainly specific SEO techniques that can help pagination, it seems to be a moving target. See below for a detailed breakdown.

Product lifecycles are always a pain point. Some products expire or go out of stock and will never come back, while others will return in a set period of time. Still others are "evergreen" and always in stock. How each needs to be handled from the SEO side depends on several factors.

The overarching challenge that presents itself here is ensuring sufficient link equity flows down from the home page and major category pages to the products themselves. This can be accomplished in multifarious ways depending on the site, market, design goals, business goals, etc. One can't really throw pointed remarks at this problem and have them make sense in every situation, every time. Like most areas of SEO, item-level work demands scenario-specific, fluid thinking.

What Techniques Work?

Rather than focus on a definitive set of techniques, the problems inherent in product-level SEO demand a holistic approach. This work is reflective of the challenges inherent in SEO as a whole. Indeed, maximizing product visibility in organic search truly distills our work under a lens.

However, taken holistically, there are certainly best practices to be found. I will cover each of them in turn below.

What About Users?

No treatment of SEO should focus on search engines at the expense of users (and neither should it focus on users at the expense of SEO). However, this is dangerous ground. Without disciplined user testing and information architecture experience, the end result - despite pure intentions - will be short of the mark, leaving both sides lacking.

Unless you're working with a team versed in site architecture, usability, and SEO, it's best to work with experts in each discipline. Yes, it can be done!

With that out of the way, let's explore the actionable stuff.

Orders of the Work: The Home Page

The home page naturally holds the lion's share of a website's SEO value. This is the first area to focus on when thinking about primary navigation elements and internal links. Since the home page (and, by progression, the major category pages) represents a site's overall authority and equity, this is valuable territory in which to carefully position links to the most important portions of the site.

My general philosophy for pushing equity to product pages (on big sites) comes down to this one idea: focus on your primary categories, then sub-categories, then product URLs themselves. It's a mistake to focus off-page links on product-level URLs because it's not scalable. Product-level URLs only make sense as they relate to the site hierarchy as a whole. I'll run through each aspect of that below.

There will always be at least some internal links from the home page to products. Normally, these are centered on specific promotions, seasonal pushes, and the like.

Some sites may consider linking permanently from the home page to a select few products that are extremely important from the business point of view (e.g., the Amazon Kindle). However, linking to many products isn't usually possible on the home page. Additionally, this valuable real estate isn't normally controlled by SEO teams, and tends to change often. For these reasons, it makes more sense to focus on category pages for the meat of your internal linking.

The global navigation is normally going to include links to each of the primary categories on the site. These pages, in turn, feed into sub-categories and individual items. Therefore, ensure the global navigation doesn't leave behind any important category URLs.

More importantly, ensure primary category URLs are given prominence, as opposed to secondary URLs. For example, if an e-commerce site has products cross-categorized in both brand and activity sections, and they have end-state URLs that are being referenced by canonical tags elsewhere, you'll want to ensure the actual canonical parent category is listed in the global navigation. This creates a clear navigation and crawling pathway to the canonical product URLs, rather than the duplicates that are referencing canonical products. Sound confusing? It can sure get that way!

Make sure global navigations don't contain links to everything under the sun. These things can get big, quickly. On a site with 100 distinct categories, consider featuring the top 50 most popular and important categories in the global navigation, with "more…" links to the remaining inventory.

For category-level URLs, it is valuable to have category-specific navigations. Amazon is a company that does this quite well.

Finally, ensure items are no more than three or four clicks at most from the home page. This can often be problematic, but is important enough to start thinking creatively about.
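One way to check the click-depth rule is a breadth-first walk over the internal link graph from a crawl. The sketch below assumes you already have the crawl as a mapping of each URL to the URLs it links to; the function names and the sample paths are hypothetical.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str) -> dict[str, int]:
    """Breadth-first search: minimum number of clicks from home to each URL."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def too_deep(links: dict[str, list[str]], home: str, max_clicks: int = 4) -> list[str]:
    """Reachable URLs that sit more than max_clicks from the home page."""
    depths = click_depths(links, home)
    return sorted(url for url, depth in depths.items() if depth > max_clicks)


if __name__ == "__main__":
    # Hypothetical crawl data: each key links to the URLs in its list.
    crawl = {
        "/": ["/shoes"],
        "/shoes": ["/shoes/running"],
        "/shoes/running": ["/p/1"],
        "/p/1": ["/p/2"],
        "/p/2": ["/p/3"],
    }
    print(too_deep(crawl, "/"))
```

URLs that never appear in the result of click_depths are not reachable through internal links at all, which is usually a bigger problem than being one click too deep.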

To recap our notes for the home page:

The home page has the most authority and equity to share with other URLs.

Consider linking to a few important products if it makes business sense. Otherwise, focus on linking into category pages.

The global navigation is key. Ensure canonical pathways are given.

Try to feature contextual sub-navigations deeper in the site.

Ensure item-level URLs can be reached in four clicks or fewer from the home page.

Orders of the Work: Internal Linking on Product Pages

The next order of business is to focus on product-level URLs. Here is the place to organize tight, relevant links across the site. This has the effect of relating products with each other, as well as flattening the architecture and bridging products across categories.

Most solutions around this area focus on "recommendation engines" normally built in JavaScript and invisible to crawlers. While these tools can drive revenue for e-commerce sites, most of them are disappointingly SEO-unfriendly. It's by necessity, partly, because these engines require session information to inform their on-the-fly recommendations.

Quality is important here, and less can be more. Four to eight relevant, topical links on product pages can really move the needle when dealing with a large site. But it's quite hard to do this well, as it requires quality data and lots of it. One of the best sites to do this early on was Shopping.com.
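To make "relevant, topical links" concrete, here is one simple, crawlable alternative to a session-driven recommendation engine: score products by how many attributes they share and link to the top few. This is only a sketch under stated assumptions. The attribute sets, the Jaccard scoring, and the 0.2 cutoff are illustrative choices of mine, not how Shopping.com or anyone else does it.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Share of attributes two products have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_products(
    target: str,
    attributes: dict[str, set[str]],
    limit: int = 8,
    min_score: float = 0.2,  # hypothetical cutoff; tune against real data
) -> list[str]:
    """Return up to `limit` product IDs most similar to `target`."""
    scored = [
        (jaccard(attributes[target], attrs), product_id)
        for product_id, attrs in attributes.items()
        if product_id != target
    ]
    relevant = [(score, pid) for score, pid in scored if score >= min_score]
    relevant.sort(key=lambda item: (-item[0], item[1]))  # best score first
    return [pid for _, pid in relevant[:limit]]


if __name__ == "__main__":
    # Hypothetical catalog data keyed by product ID.
    catalog = {
        "a": {"running", "mens", "red"},
        "b": {"running", "mens", "blue"},
        "c": {"hiking", "womens"},
        "d": {"running", "mens", "red", "wide"},
    }
    print(related_products("a", catalog))
```

Because the output is a static list per product, the links can be rendered as plain HTML anchors on the product page rather than injected by JavaScript.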

While related linking between products can be powerful, it is normally inadvisable to link from product URLs to multiple different categories. Some aspect of this upward and downward linking across categories is likely to occur, but it should be de-emphasized in order to keep the most tight and narrow crawl path for search engines.

Also, from the user perspective, links to different categories from the product level are not likely to be useful unless they're highly relevant. Cross-category links can be valuable, however, when users and search engines are presented with pertinent, relevant links. Keep them on point, as far as possible. (Certainly, links to primary categories and sub-categories will always be available in navigations. I'm speaking only to links featured prominently as modules on product pages.)

To recap our notes for product pages:

Related product links are powerful.

Less is more, and quality is more important than quantity.

Quality data, and lots of it, are required to do this well.

Focus related links across the product level.

Always link prominently to the category parent.

Relevant cross-category linking can be used with success, too.

One final bit of advice: since features such as related links can take a lot of company cycles, it's often smart to seek outside support. Bloomreach is a company that helps support internal linking initiatives (Note: We aren't affiliated with Bloomreach in any way).

Orders of the Work: Category Pages

As I've illustrated, building equity into category and sub-category pages, which then flows to item-level pages, is essentially the secret to giving items maximum visibility in search.

On large sites, it simply isn't scalable to attempt to link (internally or externally) to product pages on their own. They have to form a part of the site as a whole. In so doing, they will naturally benefit from many internal links provided the architecture is well established and the taxonomy concise.

Adding a "top products" section to each category is recommended here. Therein products can be showcased that are especially important from a business or search competition and volume standpoint. These will have more prominent internal linking and therefore should benefit by receiving more of the site's equity.

Consider building external links into the primary category pages. Unless you have a selection of products or items that are largely evergreen, don't bother with external link efforts at the deeper product level. Concentrate a level or two higher, and use those "hubs" as focal points that can then pass equity further downstream to products. However, leveraging social media at the product-level can be a powerful technique and is quite scalable.

To recap our notes for category pages:

Focus on categories as your primary "hub" to obtain, and pass, equity further down to products.

Consider adding a "top products" section to each major category.

Build external links (and social mentions) at the category level, unless a single product or set of products is so important that it justifies its own link development.

Orders of the Work: XML Sitemaps

XML sitemaps are essentially URL lists. It's an odd feeling to submit lists of URLs to search engines in 2011, but ironically, it can be quite effective. Bing, especially, seems to respond to high-quality sitemaps, and these have become more valuable for Google over the years.

The most essential thing to keep in mind with XML sitemaps is quality: these files should be very clean and precise, without redirects, errors, or duplicate URLs. Secondly, they should be well organized and broken out by category or page type. Sites doing a good job with this have the advantage of knowing exactly what indexing looks like for the type of page or category to which each URL belongs.

URLs at the item level should ideally be separated into their own sitemap, or set of sitemaps. This way SEO teams can see how indexing for these hierarchically low-level URLs is faring relative to the rest of the site.
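As a sketch of that segmentation, the Python below groups URLs by page type and writes one sitemap per group, dropping duplicates along the way. The page-type labels and example.com URLs are hypothetical, and the sketch assumes the URLs handed to it are already end-state (it does not check for redirects or errors).

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemaps(urls: list[tuple[str, str]]) -> dict[str, str]:
    """Group (page_type, url) pairs and return {page_type: sitemap XML}."""
    grouped: dict[str, set[str]] = defaultdict(set)
    for page_type, url in urls:
        grouped[page_type].add(url)  # a set drops duplicate URLs

    sitemaps = {}
    for page_type, members in grouped.items():
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in sorted(members):
            entry = ET.SubElement(urlset, "url")
            ET.SubElement(entry, "loc").text = url
        sitemaps[page_type] = ET.tostring(urlset, encoding="unicode")
    return sitemaps


if __name__ == "__main__":
    # Hypothetical inventory: (page type, end-state URL).
    inventory = [
        ("category", "https://example.com/shoes"),
        ("product", "https://example.com/p/1"),
        ("product", "https://example.com/p/2"),
        ("product", "https://example.com/p/1"),  # duplicate, will be dropped
    ]
    for page_type, xml in build_sitemaps(inventory).items():
        print(page_type, xml)
```

Submitting each file separately is what makes the indexing numbers comparable by page type.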

To recap our notes for XML files:

Segment XML files by page type, category, or another business taxonomy.

Product- or item-level URLs should have dedicated XML sitemaps.

All URLs should be "end state" and clean of redirects or duplicate content.

Orders of the Work: Pagination

First, a word of caution: SEO techniques for pagination seem to be a moving target. We have traditionally operated on the assumption that URLs annotated with rel canonical wouldn't be fully crawled. In other words, links within pages wouldn't be followed, anchor text and PageRank would not be passed, and the URL would simply be "soft 301'd" to its canonical target. However, that may change based on the latest information from Google's Matt Cutts. At least, you can bet we'll be testing it!

Cutts recently said that links on pages annotated with rel canonical would still be crawled, based on the overall PageRank of the URL, among other factors. From this information, it appears rel canonical is a process entirely separate from crawling. This likely has ramifications for how Google handles "noindex, follow" and rel canonical in concert, as well.

There is a fairly big ramification of this in how pagination is treated. Our methods for handling it typically employ "noindex, follow" on paginated URLs (2, 3, 4, etc.), and no use of rel canonical except to self-reference in cases of duplicate URLs; certainly, no use of rel canonical to reference page 1, since that would prevent links on deeper pages from being crawled. However, that may now change and it's something we'll be testing.
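The "noindex, follow" approach described above can be sketched as a small helper that decides which head tags a paginated listing gets. The ?page= parameter and the function name are assumptions for illustration; your platform's pagination scheme may differ.

```python
def pagination_head_tags(base_url: str, page: int) -> list[str]:
    """Head tags for page N of a paginated category listing."""
    if page < 1:
        raise ValueError("page numbers start at 1")

    # Hypothetical URL scheme: page 1 is the bare category URL.
    url = base_url if page == 1 else f"{base_url}?page={page}"

    # Self-referencing canonical only; deeper pages never point at page 1.
    tags = [f'<link rel="canonical" href="{url}">']
    if page > 1:
        # Keep deeper pages out of the index but let their links be followed.
        tags.append('<meta name="robots" content="noindex, follow">')
    return tags


if __name__ == "__main__":
    for number in (1, 3):
        print(pagination_head_tags("https://example.com/shoes", number))
```

Keeping the rule in one function makes it easy to flip the behavior for a test segment if Google's handling does change.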

An additional technique is to create XML sitemaps specifically for pagination. That is, XML files that only contain deeper pages handled via "noindex, follow" and/or other techniques. The benefit of this is to isolate factors and get a clearer picture of how products or items on paginated URLs are being crawled.

Orders of the Work: Expired Pages

How expiring products are handled is absolutely key, and can sometimes present large opportunities for SEO teams. There are likely going to be different rules based on the type of product and its lifecycle. Here are some common scenarios:

Products expire or sell out and are never restocked.

Products expire or sell out and are later restocked.

Products are strictly seasonal and totally unique.

Each one requires its own approach.

Products or items that will never come back can be handled in multiple ways. Many e-commerce sites will simply 404 or 410 the URL. In some cases, it may be 301'd to a relevant category. In others, the page may return a 200 and have messaging and a call-to-action prominently displayed, upselling or recommending other close matches. This is potentially dangerous ground, however, and companies should proceed cautiously to ensure visitors aren't turned off and don't perceive a bait-and-switch technique.

Here are some strategies for expired products that may be helpful:

For products that will never come back, return a 200 response code and add prominent messaging and a call-to-action for relevant recommendations. Zappos does this with its Dead Products.

For those items that will someday return, consider using a 302 redirect to point them at the relevant category. When the item returns, the 302 can be removed.

As an alternative to upselling on expired product pages, consider using a 301 to permanently redirect that URL to the closest (lowest-level) category parent.

Consider putting old expired inventory on an entirely different section of the site, like a subdomain. You can then control the presentation to ensure it doesn't offend or alter your visitors' perception of the site. And, while it's more of a second-order concern, you can also limit potential issues that may arise from bounces from expired products in SERPs.
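The status-code strategies above boil down to a small decision table. Here is a hedged Python sketch of it; the Lifecycle values, the Response type, and the upsell flag are hypothetical names introduced for illustration, and the right rule for any given product remains a business decision.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Lifecycle(Enum):
    IN_STOCK = "in_stock"
    RETURNING = "returning"          # sold out, will be restocked
    GONE_FOR_GOOD = "gone_for_good"  # will never come back


@dataclass
class Response:
    status: int
    location: Optional[str] = None      # redirect target, if any
    show_recommendations: bool = False  # render close-match messaging


def expired_product_response(
    lifecycle: Lifecycle, category_url: str, upsell: bool = True
) -> Response:
    """Pick an HTTP response for a product URL based on its lifecycle."""
    if lifecycle is Lifecycle.IN_STOCK:
        return Response(status=200)
    if lifecycle is Lifecycle.RETURNING:
        # Temporary redirect; remove it when the item is back.
        return Response(status=302, location=category_url)
    if upsell:
        # Keep the page live with a call-to-action for close matches.
        return Response(status=200, show_recommendations=True)
    # Otherwise pass the URL permanently to the closest category parent.
    return Response(status=301, location=category_url)


if __name__ == "__main__":
    print(expired_product_response(Lifecycle.RETURNING, "/shoes/running"))
```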

Orders of the Work: Search Presentation

The page title and meta description (normally presented in search results as titles and snippet text, respectively) are quite valuable and can strongly influence a listing's click-through rate (CTR). As such, they may also play into the rankings of a page.

Visibility and CTR for products and items in search can be influenced in part through smart use of microformats. Rich snippets can really make a listing jump off the page.
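For illustration, here is a small Python helper that renders a product block using class names from the hProduct microformat draft (hproduct, fn, brand, price, url). The function name and the sample values are hypothetical, and marking up a page this way does not guarantee a search engine will show a rich snippet for it.

```python
from html import escape


def hproduct_markup(name: str, brand: str, price: str, url: str) -> str:
    """Render a minimal hProduct block for a product page template."""
    return (
        '<div class="hproduct">\n'
        f'  <a class="url fn" href="{escape(url, quote=True)}">{escape(name)}</a>\n'
        f'  <span class="brand">{escape(brand)}</span>\n'
        f'  <span class="price">{escape(price)}</span>\n'
        "</div>"
    )


if __name__ == "__main__":
    # Hypothetical product data.
    print(hproduct_markup("Trail Runner 2", "Acme", "$89.00", "https://example.com/p/1"))
```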

Other Important Stuff

We're at the end of this article, and have barely scratched the surface.

For example, we haven't explored how social media sharing tools can help product-level URLs gain social mentions. Ensure your products and items have prominent, easy-to-use social buttons. But also be aware of the overall load time of those pages. Speed, at scale, is of paramount importance.

We also haven't discussed faceted and guided navigations. This is always a big SEO problem to surmount.

Unfortunately, treating this topic definitively would take writing a small book. Alas, that's both the curse and blessing of SEO, isn't it? It's always more complicated, and simpler, than we want it to be.

In writing this article, the paradox of SEO struck me as it has many times before. All the stuff we strive to accomplish, it's easy to say and hard to do. It's easy to make recommendations, and they can be so hard to execute well.

The simplicity of getting SEO right requires a great deal of complexity to accomplish. And therein lies the challenge, and unique opportunity, for the smart people working in our young but maturing industry.

This column was originally published in SES Magazine, Chicago Preview, 2011.

ABOUT THE AUTHOR

Adam is president of RKG, a data-driven digital marketing agency with leading service and technology solutions in paid search, SEO, display, attribution, social media, and comparison shopping.

Prior to joining RKG, Adam was president and founder of the boutique SEO agency AudetteMedia, which served premier brands including Zappos, Amazon, Gannett, Kroger, HSN, Charming Shoppes, University of Phoenix, Michelin, Wolters-Kluwer, and many others.

Adam has been active in the search marketing industry since the late '90s and is a frequent speaker at premier industry events including SMX and SMX Advanced, Searchfest, SES, BlueGlass, MozCon, and Pubcon.

He has been a regular contributor to Search Engine Land and Search Engine Watch, and has served as technical editor for Wiley/Sybex publications such as "SEO: An Hour A Day". You can follow him at @audette.
