Profile Information

Founder of Visual Itineraries, a closing and lead-generation tool for niche and high-end travel agents. Co-founder of TheBigDay honeymoon registry; avid traveler, photographer, and still plays with cars and motorcycles. As of 2010, plays with airplanes too :-). Living the life in Bend, Oregon.

This is a story about Panda, customer service, and differentiating your store from others selling the same products. I'm going to use a real-life example that I suffered through about a month ago: I was looking for a replacement sink stopper for a bathroom sink. It did not go well.

From the originality of your content to top-heavy posts, there's a lot that the Panda algorithm is looking for. In today's Whiteboard Friday, Michael Cottam explains what these things are, and more importantly, what we can do to be sure we get the nod from this particular bear.

The mod_rewrite engine is a popular weapon for URL rewriting and redirection, but there are times (not all times!) when it's easier, faster, and generally mo' bettah to tackle it in your 404 handler.
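
This isn't the code from the post; just to make the idea concrete, here's a minimal sketch in Python/Flask with made-up slugs and product IDs. The friendly URL doesn't exist as a real file, so it falls through to the 404 handler, which looks it up and either serves the content or 301s an old parameterized URL to the friendly one:

```python
from flask import Flask, request, redirect

app = Flask(__name__)

# In a real site this mapping would be a database lookup, not a dict.
SLUG_TO_PRODUCT_ID = {"purple-widgets": 123, "red-widgets": 456}

def render_product(product_id):
    return f"<h1>Product {product_id}</h1>"  # stand-in for your real template

@app.errorhandler(404)
def handle_unknown_url(error):
    slug = request.path.strip("/").lower()
    if slug in SLUG_TO_PRODUCT_ID:
        # Serve the real content under the friendly URL, with a 200.
        return render_product(SLUG_TO_PRODUCT_ID[slug]), 200
    if request.path.startswith("/product.php"):
        # Old parameterized URL: 301 it to the friendly version if we know it.
        pid = request.args.get("id", type=int)
        for known_slug, known_id in SLUG_TO_PRODUCT_ID.items():
            if known_id == pid:
                return redirect(f"/{known_slug}", code=301)
    return "Page not found", 404
```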

It's a well-known fact in the SEO world that Google shows enormous favoritism in its rankings to domain names that contain one or more of the keywords being searched for. If your domain name is a close match to the search keywords all glued together, it's as easy as fishing with dynamite to get on page 1 of the SERPs for that search phrase...

One of the best ways to build inbound links is to create an affiliate program. It's also a great way to drive real customer traffic from related sites.
But...don't just sign up for one of the big third-party affiliate programs--you'll get the customers, but you'll be throwing away a terrific opportunity to get great inbound links. ...

We all know the conversion benefits of tuning your PPC ad titles to match the exact words the customer typed in their search--breaking out what would be convenient "groupings" of ads sharing one landing page into separate ads with separate, search-matching titles. For example, we have a honeymoon registry--but brides might...

We all know by this time about the benefits of converting your parameterized URLs to human- and crawler-friendly URLs, but the stock tools of the trade (ISAPI_Rewrite, mod_rewrite, etc.) don't necessarily scale all that well when you have a large number of categories, product pages, etc. I'm going to walk you through what it takes to code this yourself, and I think you'll find it's less scary and complex than you thought, and gives you a number of benefits in terms of ongoing maintenance, flexibility, etc.

Solid stuff here, Jeff! Especially happy to see that keyword stuffing didn't correlate with rankings. And this I thought was insightful: "Long-form content is arguably a byproduct of creating for quality".

Great "myth-busters" WBF, Rand! On the shared hosting topic: the one exception I'd make to what you're saying is if you're on shared hosting with a bunch of "rough" sites (penalized, known link farm, etc.) I would suspect that would be one of a long list of things that Google would look at in evaluating how trustworthy your site might be.

In my amateurish and slightly random testing, I think I'm seeing the image filename having way more impact than I'd expect. I also think I'm seeing uniqueness of images impacting ranking. In both of those cases, I'm talking about image SEO affecting the ranking of the page in the regular web results (not image search). I'd LOVE to see a test that examined the impact of image originality on web page ranking....in other words, if you use your own original image on a page, vs. an image that's found on 20 other sites vs. the typical cheesy stock photo image of a handful of people in a meeting room, sucked down from one of the stock photo sites and used on 500 websites.

Changing gears a bit, if you DO use your own original images, here's a link-building tip for you: do a Google reverse image search for your photos, and when you find others using them without your permission, send a gentle email to those webmasters asking that they credit you for that photo on their page with a caption and a link back to your site.

One thing I've often recommended to clients who want to show a popup signup for a newsletter, or a % discount, etc. is to keep a "pages visited" counter in a cookie, and only show the popup once the user has visited at least a couple of pages. Not only does it keep the popup out of Googlebot's eyes, but users are often more likely to respond to your popup once they've had a chance to see a bit of your site and decide if they like what you offer enough to give up their email etc.
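
Here's a hedged sketch of that idea, server-side in Python/Flask (the cookie name, threshold, and template are all mine, not from the original comment). Because Googlebot doesn't carry cookies between requests, its counter never climbs and it never sees the popup:

```python
from flask import Flask, request, make_response, render_template_string

app = Flask(__name__)
PAGES_BEFORE_POPUP = 3  # show the popup from the visitor's third page onward

@app.route("/<path:page>")
def serve_page(page):
    visits = int(request.cookies.get("pages_visited", "0")) + 1
    show_popup = visits >= PAGES_BEFORE_POPUP
    html = render_template_string(
        "<h1>{{ page }}</h1>"
        "{% if show_popup %}<div class='popup'>Newsletter signup form here</div>{% endif %}",
        page=page, show_popup=show_popup)
    resp = make_response(html)
    # Remember how many pages this visitor has seen, for ~30 days.
    resp.set_cookie("pages_visited", str(visits), max_age=60 * 60 * 24 * 30)
    return resp
```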

When you submit an XML sitemap in Search Console, it's a hint/suggestion to Google that you've either updated that content or it's new. So, if you've got new articles in that sitemap, then that can be a good idea.

But that sounds like a lot of manual work to me :-).

I'd do something programmatic that pulled the latest 10 days' worth of articles and generated a newest-articles.xml sitemap, setting the change frequency to daily on all the URLs.
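
As a rough sketch of what I mean (the table, column, and domain names are all made up; adapt to your own schema):

```python
import sqlite3
from datetime import datetime, timedelta
from xml.sax.saxutils import escape

# Articles updated or published in the last 10 days.
cutoff = (datetime.utcnow() - timedelta(days=10)).strftime("%Y-%m-%d")
conn = sqlite3.connect("site.db")
rows = conn.execute(
    "SELECT slug, updated_at FROM articles WHERE updated_at >= ?", (cutoff,))

with open("newest-articles.xml", "w", encoding="utf-8") as f:
    f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
    f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
    for slug, updated_at in rows:
        f.write("  <url>\n")
        f.write(f"    <loc>https://www.example.com/articles/{escape(slug)}</loc>\n")
        f.write(f"    <lastmod>{updated_at[:10]}</lastmod>\n")
        f.write("    <changefreq>daily</changefreq>\n")
        f.write("  </url>\n")
    f.write("</urlset>\n")
```

Run that nightly (cron or whatever scheduler you have) and resubmit the same sitemap URL in Search Console once; Google will keep re-fetching it.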

Any pages that are password-protected shouldn't really need noindex, unless there's actually a way for Google to find a link to them and get the content without logging in as one of your users. If that's the case, then you probably need to work on your login security :-).

For your PHP pages that have no HTML on them, I'd block those in robots.txt. There's no point in letting Google crawl those as they have no outbound links to send link juice to other pages on your site.
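
Hypothetically, if those HTML-less endpoints lived under a path like /ajax/ or were individual scripts, the robots.txt entries might look like this (the paths are made up; use your own):

```
User-agent: *
Disallow: /ajax/
Disallow: /process-form.php
```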

Break your sitemap into many smaller sitemaps. You can then look for sitemaps that have a low indexation rate, and then that's where your problems lie. You can then take THOSE problem sitemaps, and break them into smaller sitemaps even further, based on whatever hypothesis you have on why some of those URLs aren't getting indexed and others are.
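
A hedged sketch of the mechanics (file names, chunk size, and URL source are my own assumptions): split one big URL list into fixed-size chunks, write a sitemap per chunk, and tie them together with a sitemap index so you can compare indexation rates chunk by chunk in Search Console:

```python
CHUNK_SIZE = 5000  # well under Google's 50,000-URL-per-sitemap limit

def write_sitemap(filename, urls):
    with open(filename, "w", encoding="utf-8") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for url in urls:
            f.write(f"  <url><loc>{url}</loc></url>\n")
        f.write("</urlset>\n")

def split_sitemaps(all_urls):
    names = []
    for i in range(0, len(all_urls), CHUNK_SIZE):
        name = f"sitemap-{i // CHUNK_SIZE + 1}.xml"
        write_sitemap(name, all_urls[i:i + CHUNK_SIZE])
        names.append(name)
    # Sitemap index pointing at each chunk, so each gets its own GSC stats.
    with open("sitemap-index.xml", "w", encoding="utf-8") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        f.write('<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for name in names:
            f.write(f"  <sitemap><loc>https://www.example.com/{name}</loc></sitemap>\n")
        f.write("</sitemapindex>\n")
```

Once you spot the low-indexation chunks, regroup just those URLs by whatever hypothesis you're testing (template, depth, age, etc.) and repeat.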

I think what I would do is this: look at search traffic in aggregate to those product pages--try using URL patterns in Search Analytics in Search Console to see this. If you're not getting search traffic to those pages anyway, then I'd noindex them, as you're right....they may be dragging down your rankings for other pages on the site. If you ARE getting search traffic to them, leave them alone, or else you're cutting off traffic from Google.

Note that I believe that Google has some sort of overall site quality ranking factor that affects your best pages based on something like the average quality of pages on your site....I believe this based on what I've seen happen on clients' sites when they've pruned off a lot of thin content. But, I don't recall ever seeing any statement from Google backing this up, so it's just my gut feel based on patterns I think I've seen.

That's a great point, Arun. If they do this, not only could a savvy web developer see through it and get the PDFs directly, but it would also encourage Google to index those PDFs, so non-developers might get to them without going through the paywall...directly from search results!

For large sites, I recommend building internal processes for generating your sitemaps. Break your content down into various types, and generate a separate sitemap for each type. For my travel site, for instance, I have an XML sitemap for just hotel pages, another for travel specials, another for static pages, and a set of them (Yoast-generated for these) for the blog pages (only the blog part of my site is WordPress). It's a relatively simple thing to iterate over all of a certain type of record in your database and spit out the URLs for those types of entities, in XML sitemap format.

I agree with Shiv--break it into many smaller sitemaps. Google limits you to 50,000 URLs per sitemap, in fact. You should be generating your sitemap automatically, or at least on a very regular basis, from the actual content in your CMS.

For dynamic sitemaps, I don't know that there's a tool for that. What I have done is written database queries to return the values I need to figure out all page URLs for a given type, and then form the URLs the same way I'd form them on the web pages that list links to those pages....but instead, spit out XML in the sitemap syntax.

Hi Mario, I think I covered that pretty well in the Consistency section? I wouldn't use nofollow on a page unless 100% of the outbound links are to noindexed pages....otherwise, you're just throwing away link juice.

I wouldn't use just noindex for those, I'd make sure those pages are password-protected instead. Otherwise not-very-well-behaved bots and scrapers will still be able to see (and perhaps copy) those pages.

Important note with Yoast configuration: you MUST make sure that what you're including in your XML sitemaps aligns with what you're indexing/noindexing on the pages themselves. It doesn't do this for you automatically.

I generally recommend for e-com sites creating a bunch of separate sitemaps for similar pages. Note I said "similar" and not "related"...I wouldn't create a sitemap for all types of pages in one product group, for instance...instead, I'd create a sitemap for blog posts, one for all category pages, one for all subcategory pages, and then one or more for all product pages. You want to be able to see what types of pages are giving you indexation nightmares.

Good point. But if you have a small site, you might as well use the free version of Screaming Frog and let it generate a complete XML sitemap for you. Then you can tweak priorities, last update dates, etc. as needed.

Absolutely. In fact, this is an indication that you have a big problem with indexation, in that Google is finding and indexing pages that you don't think are important or potential search landing pages! Likely that means they're very light on content...and if Google ends up indexing them, then from an overall site perspective, Google ends up seeing your average content quality per page as lower than it should be.

As an example, let's say you have a page for sharing a URL from your website. Let's say this page takes some parameter that indicates the page to be shared, and at the top shows the heading from the page and a snippet from the content, plus the usual form fields for sharing...just enough content so that Google does decide to index it. You're not going to put all of those pages in your XML sitemap, of course. If Google is indexing those, and you have 1000 pages of real content on your site, you've now got Google indexing 1000 good pages + 1000 share-this pages of non-content. And so Google will see half your site as pretty marginal content.

Good point, Joseph. Submitting (or resubmitting, if you've made a major update) a page in Search Console is a hint to Google that you think it's important and worth crawling before whatever would normally be in the queue to crawl from your website.

Category pages: Google appears to be less fond than it used to be of plain old category archives pages where there's an H1 heading and then a list of either products or blog posts. Fair enough: really all that page is is a list of links (and that's what Google wants to be!). Improving the content on a category page by adding an overview, some images or videos--that makes for a better page about that topic, for sure. From a UX perspective, many users just want to see the products (or blog posts) because they're familiar with the topic overall, and so often people will put a snippet of the overview up top and hide the majority of it initially, and supply a "Read more" link or button.

Definitely agree. If you have an HTML sitemap, and you're finding a lot of users are resorting to the sitemap to find what they're looking for, then this is a good indication that you need to improve your main navigation!

Agreed on the sitemap priority number. People need to understand that it's there for you to give Google a clue as to which of two or more pages about the same topic is the more important one, i.e. your category page about purple widgets vs. a blog post about purple widgets. It's not going to affect how your page ranks against pages from another website.

I'm a big fan of the Yoast plug-in, and yes, there's a page setting that allows you to noindex specific pages. They've also got some very helpful settings like noindexing subpages of archives, noindexing tag archives, etc.

Thanks Praveen...this is probably one of the biggest problems e-commerce sites have: where the very helpful UX gives you filtering, sorting, and user options that cause incredible numbers of variations on what really is pretty much the same page of content.

Great post Kate! Question for you (or anyone): at one time, using meta robots NOODP,NOYDIR had the side effect of making Google use your page title and meta description in the SERPs, instead of winging it in Google's own giddy way and fabricating the headline and snippet from bits of the page....kind of like telling Google "let go, I know what I'm doing."

About 1-2 years ago, I started seeing examples where Google would fabricate the headline & snippet occasionally even when NOODP,NOYDIR were specified. But I'm pretty sure I've seen recent examples where setting NOODP,NOYDIR caused Google to start using the client's page title and meta description (correlation isn't causation, and of course there were many other simultaneous changes to the pages!).

Has anyone done a recent test to see if NOODP,NOYDIR decreases the % of the time that Google makes up the headline and/or snippet for the SERPs?

One type of backlink that I think is worth calling out for disavowal is article marketing. In many cases when I've been doing a manual penalty recovery, and the initial request has been rejected, one of the sample links given by the Google search quality engineer has been an article marketing link. This seems like it's a hot button for them. I generally look for backlinks from domains that contain "article" (articlesnatch.com, ezinearticles, etc.), then look at the article itself, then Google one of the first sentences in the article (in double quotes for an exact match) to find all the copies of the article that Google has indexed, and disavow all of those too.
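
A rough sketch of the first step only (the file name and CSV layout are assumptions, and the copies you find by Googling the article's first sentence still have to be added by hand): pull domains containing "article" out of a Search Console links export and turn them into domain: lines for a disavow file.

```python
import csv
from urllib.parse import urlparse

domains = set()
with open("gsc-links-export.csv", newline="", encoding="utf-8") as f:
    for row in csv.reader(f):
        if not row:
            continue
        host = urlparse(row[0]).netloc.lower()
        if "article" in host:
            domains.add(host)

with open("disavow.txt", "w", encoding="utf-8") as f:
    f.write("# Article-marketing domains found in the GSC links export\n")
    for host in sorted(domains):
        f.write(f"domain:{host}\n")
```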

Actually I discovered I have another client as well who's seen a huge uptick right around March 8th, after fixing a problem with sitewide links from some partner sites. I think they moved up 40 places. And it's not a site where content would have made a difference.

I completely agree with Drlovecherry here. A disavow essentially tells Google not to BOTHER crawling those links since you don't want them counted anyways. Google only has to recrawl if you've managed to get links REMOVED from a site--then, they need to recrawl that site and see that the link is no longer there. Not the case with a disavow!

Certainly it's speculation. But the evidence has me mostly convinced it's the disavow, because of how many places they moved on average (21), and honestly their content is pretty thin, so it's unlikely to be a boost because of their content.

It also could be an ancient penalty that simply expired, and that happened to coincide with the Fred update. We are pretty sure they were penalized algorithmically (no manual notice in GSC) about 4 years ago.

I use Kerboo's LinkRisk to evaluate backlinks. It's about $3000/year though. But I find it's far more accurate than, say, Link Detox, which I found gave a lot of false positives, and let slip through a ton of really obvious spam.

Yes. I'm only ever looking at backlinks downloaded from Search Console. In my experience, Google finds more links than any of the other backlinks tools. And, I've never seen a manual penalty reconsideration request rejected based on a link that WASN'T in their list in GSC.

Many of Google's big algorithmic processes have recently been folded into the core algorithm and now run in near-real-time. So possibly algorithmic link penalties got integrated as part of the Fred update.

Also, I've definitely seen a case where a site fell in rankings just a couple of days after submitting a really aggressive disavow file. So, clearly at least SOME processes read the disavow file within a couple of days.

Having seen an interminable amount of this site's content, I'll have to say I really doubt that any Google algo change worth its salt would give it a boost for its content :-p. It's just pages of a little text content and then links to government reference sites. UNLESS....Fred is all about giving giant boosts for outbound links to very trusted domains!

I agree, Marie...I've never seen a disavow remove a penalty this fast. I'm still on the fence, to be honest, about whether I'm convinced it's a penalty being removed by the disavow file. Of course, it COULD be that part of Fred is implementing a much more real-time consideration of disavow files.

If it was just Fred causing the traffic and rankings boost, I'd expect them to have moved a couple of places up across the board. But I'm seeing 21 spots jump on average. And, it's pretty clear they've been under a penalty for the past couple of years, as their traffic used to be about what it is now, and fell off a cliff.

If there's a penalty, it'll be dramatic...typically I've seen a loss of right about 40 places. It's been consistently about that number, which makes me think that penalties aren't implemented as a loss of PageRank in the calculation, but rather a push-down of a specific number of places (or completely gone from the results). But that's just a gut feel.

I agree with your approach. I'd expect that there are plenty of sites out there that are very close to the threshold of receiving a penalty (manual or algorithmic). It's a lot easier on your heart, your ulcer, and your client to be "ahead of the game" and keep the backlink profile relatively clean.

Having said that, there's certainly a risk of disavowing too much. There are a lot of links I'd consider spammy that are getting counted and helping sites' PageRank, and disavowing those will hurt the client. I tend to disavow just those that are (a) super weak...so they're not passing much link juice anyways, and (b) egregiously, obviously spammy in such a way that Google SURELY is categorizing them as junk. I.e. groups of big SEO web directories that have hundreds of thousands of listings AND you find they all look almost identical except for the logo; articles that have been republished on dozens of sites and originated in ezinearticles.com or articlesnatch.com or one of those article marketing sites; or sites that Chrome spots as containing malware.

There's (not surprisingly) some discussion going on about whether this really was the result of the disavow, or whether perhaps 4-5 of my client's competitors all got hammered by the update, and THAT moved them up 4-5 places in the rankings.

I think that's possible...but, in Search Analytics, I'm seeing an average ranking for a big group of their terms go from 21 up to 7. And, their traffic fell off a cliff a couple of years ago, making me think penalty even more.

Very interesting and useful data here. Another data point I'd love to see is what % of searches that result in a 3-pack get a click in the 3 pack (vs. ads, regular organics). My inclination is that it's probably 70-80%.

Solid work, Russ...thanks for putting this all together! Despite some shaky data in GSC, I find it's tremendously valuable for keyword research--finding terms that a page might rank just off page 1 for, and with a little tweaking, we can push that page up onto page 1 and start getting some more traffic. I was surprised to find that the Index Status numbers are relatively accurate--wasn't it back in August when Google admitted that was broken? I guess it's fixed now. Now, if they would fix the initial "About xxxxx results" at the top of the SERPs. That is still often wrong by orders of magnitude!

Loved Pro Tip #2! I do this with a lot of my clients. I like working this way....I like to teach....and it's generally going to end up being good value for the client, AND grows the skills of their in-house person. Win all around!

Also, for phone numbers, use an <a href="tel:..."> link to make them click-to-call for mobile users.
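
For example (the number is made up):

```html
<a href="tel:+15415550123">(541) 555-0123</a>
```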

Lastly, it USED TO BE true that setting meta robots NOODP,NOYDIR would force Google to use your page titles and meta descriptions instead of wildly pulling text out of your page for the headline and snippet to show in the search results (even though the original purpose was to tell Google not to use the DMOZ or Yahoo Directory descriptions, it seemed for a while that it had this side effect). Yoast has dropped the option to add NOYDIR (maybe they're just dissing the fatally wounded ex-giant Yahoo), but still keeps the NOODP option. Have you heard of any tests on NOODP showing whether it still has any effect on how often Google goes freestyle with your page titles & meta descriptions? I certainly have a number of client examples where Google is rewriting them despite NOODP/NOYDIR.
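
For reference, the tag in question looks like this:

```html
<meta name="robots" content="noodp,noydir">
```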

Yup, been milking this technique and getting tons of traffic for my travel site for about 2 years now :-D. In my case, my page is able to outrank well-targeted pages on TripAdvisor and USA Today websites despite having 1000x fewer backlinks, because they have minimal content on their pages for that topic.

Great WBF Rand! I'll add one thing that Amazon actually screws up: not informing you (at any point in the purchase path) about shipping METHOD, i.e. is it coming USPS, UPS, or FedEx. Why does this matter? Well, in rural locations like where I live in Bend, UPS and FedEx deliver to my door...USPS does not. I've had countless packages shipped here to my address that either go all USPS, or as far as Bend via UPS/FedEx, then they send it the rest of the way via the post office. Which then goes through a mail forwarding loop, and EVENTUALLY I get to stand in line for 1/2 hour and get the (#*@&$*(#@ package at the post office. Or, in the case of my steam punk costume accessories for this Saturday's Hallowe'en party...that forwarding loop means it's not getting here in time. *sigh*. Interestingly, about 3/4ths of the stuff I order from Amazon arrives via UPS or FedEx. I just have no idea with each order if this is going to be one of them :-/

Right...the kind of query is going to determine which factors get weighted more heavily, is what Rand's saying. And I'd expect Google is figuring this out by click patterns. As another example, think about a query like "Kenmore refrigerator part number 1234567". In this example, keyword matching is going to be VERY important, and freshness, domain authority, and engagement not so much.

Great WBF, Rand! The #8 is a favorite of mine. If it was true, then nobody would ever want a link from the NY Times :-). I can perhaps see, however, some potential for Google to look at the linking page to see if it appears to be discussing similar topics as the linked TO page. So in your Corvettes & coffee example, maybe on the Corvette collector site the link is in some mention of where the club is meeting on Saturday morning...so you'd expect to see "coffee" or "breakfast" or the street name or something like that on the Corvette site, so there'd be some overlap between the relatively infrequently used terms on both pages. If Google was doing something like this, it would help weed out a lot of blogroll links, directories with no business descriptions, etc.

I have a new myth for you (except, I don't think it's a myth): the average DA of sites linking to you (or maybe the average PA of the pages) might be a ranking factor. I've got a couple of clients (criminal defense attorneys) with pretty new sites, 1-man-band shops, and just the usual local business directory links (via MozLocal submission), but each has 1-2 links from stories in their city newspaper. And they're KILLING it...both are the 1st organic result for actual law firms (i.e. not Avvo or Findlaw or Yelp) for "criminal defense attorney". And in the 3-pack. One's in a city with a 1 million population, with 80 sites in the local results for that term, the other about 200,000 population and 40 criminal defense attorney sites.

And I'm pretty sure it's a DA factor and not a PA or PageRank factor, else my travel site would be slaughtered by the 400K links I have from Pinterest, and most local business sites would similarly suffer from the tens of thousands of links that multiply like Tribbles from yellowpages.com.

Hi Sam, great experiment and write-up. I wonder if exact-match anchor text has anything to do with the results? I.e. some of the pages you're looking at could have strong links with exact-match anchor text for some of the synonyms but not others? And that impact could be overwhelming the relevance metric for the given term, vs. the relevance that Google might assign for a synonym?

Terrific analysis Marie. Well done! Far too many people have been focusing on short term tactics. I've always thought the right approach was to do ten things to the site to make it a richer, more useful and engaging experience for users, knowing that Panda might only be able to detect 5 of them and give you credit.

Steve, when you were measuring links, were you counting only followed links? Because sharing a link on Facebook is going to result in a nofollowed link back from each share. So...let's say a nofollowed link is worth 5% of what a followed one is (surely NOBODY still believes Google when they say there's no link value in a nofollowed link!), something that gets 1000 shares on Facebook essentially got the equivalent of 50 followed links.

One quick note about 410 http responses: while it's supposed to indicate to the search engines that a page is gone forever, I have seen a case (6 months ago) where a client had a manual penalty based on pages that were returning 410 (they were created by site users and totally spammy), and in the manual penalty reinclusion request REJECTION, Google was citing those pages that had been returning 410 for several months. It was only when the client changed the DNS to remove the subdomain they were on entirely that the penalty was lifted. I'd have said this was a mistake in how Google's reviewers were handling this, personally. But...that's what I saw, so all be warned!

I've got another idea for how to create demand for sites like this....get Rand to do a WBF on it and tell everyone how cool it is! :-p

In all seriousness, this is a really useful post. I struggle with this with many of my more niche or new-product-category clients. I especially like the tactic of pushing the niche name you come up with out via press/news/outreach to get everyone to start thinking of your niche using that phrase.

Great post Rand! Liked the bit about using test PPC ads to verify those nutty AdWords Keyword Planner volume numbers.

Another thing you might mention is that with Google's improved ability to see and understand topics (vs. keywords and synonyms), you might expect a much flatter bell curve in terms of what terms can drive how much traffic to your site. For a particularly effective page on my travel site, my top term is only generating 8% of the overall traffic (out of 14,000 visits/month to that page), and it drops off steeply from there, with only about a dozen or so terms contributing > 1%.

Oh, and clearly you're now "in the pocket" of the Chow Fun lobbyists. :-p

Great post, Paddy! Google I think has always meant linking signals to be an indication that the masses are liking and talking about a site or a piece of content. Startups are asking, with Google's crackdown on scalable link-building, just how they're supposed to compete with the big brands. The answer, of course, is NOT to find a way for a few people to build masses of backlinks, but rather, to get masses of real people to discover and like your content. And yes, the big brands have an advantage here, simply because they're spending big money--often on traditional, offline-marketing--to get masses of real people to discover their content.

Love it! Curious...I didn't see anything about checking for many links from the same C-block. With many of the clients I work with--who've been naughty and they know it--I'll find they've bought into simple link farms, where the sites share the same C block (often, even the same IP address). Do you have the ability to spot/flag this in there somewhere, or if not, do you plan on it? Extending one step beyond that, detecting many links from domains with some of the same ownership info would be helpful (of course, Google being a registrar probably has access to some of the domain info that you can't get your hands on).

Great article, Rand--one eetsy beetsy comment about IIS and case-sensitivity in URLs in general. If you're using IIS, yes, it will ignore the case and still serve up mypeatyislay.html when you request it as MyPeatyIslay.Html....BUT really those are technically different URLs, and Google will treat them as separate. Which means if you have some links to it in 1 form, and other links in another form, then you're spreading link juice across two separate (and unfortunately duplicate) pages. Really, you need to 301 one to the other.
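
None of this is from Rand's article; just to illustrate the fix generically, here's a minimal sketch in Python/Flask (IIS's URL Rewrite module can do the equivalent with a lowercase-redirect rule): 301 any mixed-case path to its lowercase form so only one version of each URL can accumulate links.

```python
from flask import Flask, request, redirect

app = Flask(__name__)

@app.before_request
def force_lowercase_urls():
    path = request.path
    if path != path.lower():
        # Preserve the query string while redirecting to the lowercase path.
        query = request.query_string.decode()
        target = path.lower() + (f"?{query}" if query else "")
        return redirect(target, code=301)
```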

Great question--you can have additional content (Q&A, reviews, additional imagery) that you don't put in your product feeds. Sure, a few of the retailers may manually copy that content (especially the images), but most won't.

I suspect at this point that reviews and comments appear to be just additional text on the page, which should be good for Panda ratings. See this SerpIQ study on average number of words/page for page-1 results.

On the computerised drawing: there are a number of new hotels who've launched their websites with computer-generated imagery for the hotel, grounds, rooms, etc.!

Thanks Gianluca! I think that's a problem everyone has in the first few years. Building enough community and traffic that you can get sufficient volumes of user responses on most of your product pages takes a long time.

I'll have to agree loudly here too. I've got a client where dozens and dozens of sample links in the manual actions list are ALL returning 410 (gone forever) and have been doing so for months, and GWT still shows those examples. Sigh.