A New Google Filter is Born

In early December some astute webmasters noticed that some of their long-term (in some cases many years) #1 or #2 ranking pages in Google now rank at #6. Just like with the Google -30 and the Google -950 penalties, some people will maintain this is fiction, but too many smart people experienced the same thing at the same time for it to be fiction.

Other searches that returned the same URL at #1 may also be sent to #6, but not all of them.

There are also some reports of a #2 result going to #6.

My Site That Got Hit

My site, which saw a ranking dive on December 18th, had its homepage hit, and interior pages hit for some (but not all) related phrases. Here are some noteworthy conditions of the site that was hit:

The site ranked entirely on SEO. There is no ad budget outside of PPC ads and link buying, and no brand recognition outside of the search results. Outside of one piece of linkbait there is nothing remarkable about the site.

The homepage did not get any new quality links in over a year.

Much of the link building was done years ago when I was far spammier and far more aggressive with anchor text than I would be today, though I did use some semantic variation to pick up rankings for many different keyword permutations.

The internal pages still rank #1 for some semi-related longer queries, while they are also filtered and ranking #6 for some more obviously connected shorter search queries.

The site continues to buy PPC ads and gets decent conversion rates for the keywords that were hit, and gets great conversion rates for more focused related terms, some of which the site was hit for and some of which the site still ranks great for. This conversion data is being sent to Google via the AdWords conversion tracker.

This affected alternate permutations of acronyms (letters strung together or pulled apart).

This did not affect obvious domain name or brand related queries, even if the brand contained one of the words overlapping with the penalized set. If a filtered word outside of the domain name / brand name is appended to the query then the rankings are killed, and the site is stuck at #6.

Usage Data or Improved Phrase Relationship Detection of Anchor Text?

Why I Do Not Think It Is Usage Data

Based on feedback in the WMW thread it is hard to isolate this to any one variable with certainty. Two possibilities that have been thrown out are rolling more usage data into the search results or a better understanding of word and phrase relationships. It is easy to think of usage data as a possibility given my site's lack of marketing and lack of integration into the organic web, but that would not explain why some pages and queries were hit while some similar pages and queries still rank, with Google getting strong conversion data via AdWords on some of these pages. Also, for that homepage I wrote an aggressive page title and meta description that draws in many clicks, and the landing page is exceptionally relevant for the query.

Why I Think It Is Phrase Relationships

I think this issue is likely tied to a stagnant link profile whose anchor text is too tightly aligned with the target keywords, and over-optimized when compared against competing sites.

The fact that some related queries were hit, but not all, makes me think this is about improved word and phrase relationship detection rather than usage data. If Google got better at understanding word relationships, many of the pages that once fit the criteria to rank may now have anchor text that is too focused and too well aligned with the target keywords, especially if Google compares your anchor text to the anchor text of other sites competing for the same phrases. Once possible manipulation is identified via artificial anchor text, your rankings across the site can be suppressed for a basket of semantically related terms, as noted in some of Google's phrase-based indexing patents.
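To make the over-optimization idea concrete, here is a toy heuristic: measure what share of a page's backlinks use the single most common anchor text. Everything here is hypothetical - the function name, the sample anchor lists, and the idea that any such concentration check resembles what Google actually does are all assumptions for illustration only.

```python
from collections import Counter

def anchor_concentration(anchors):
    """Share of backlinks using the single most common anchor text.

    A value near 1.0 means nearly every link uses identical anchor
    text - the kind of unnaturally aligned profile this post
    speculates may now be flagged when compared against the anchor
    text of competing sites. (Illustrative heuristic, not a known
    Google signal.)
    """
    counts = Counter(a.strip().lower() for a in anchors)
    top_count = counts.most_common(1)[0][1]
    return top_count / len(anchors)

# Hypothetical link profiles: an aggressively optimized one versus
# a naturally varied one (brand names, URLs, generic phrases).
optimized = ["cheap widgets"] * 90 + ["widgets"] * 10
natural = ["Acme", "acme.com", "this site", "cheap widgets",
           "widget reviews", "click here", "Acme Widgets"] * 5

print(anchor_concentration(optimized))  # 0.9
print(anchor_concentration(natural))
```

A profile scoring near 0.9 would stand out against competitors whose top anchor phrase accounts for only a small fraction of their links, which is the comparison the phrase-relationship hypothesis above hinges on.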

Matt Cutts Does Not Know What Happened

This filter was also called the minus 5 penalty, but many of the sites that were hit still rank at #6 even if they were ranking #2 or #3 before they were hit. When Barry posted about this, Matt Cutts said "Hmm. I'm not aware of anything that would exhibit that sort of behavior," but past SEO issues, like the famed Google sandbox, have been accidentally introduced as side effects of Google upgrades:

What's a sandbox, Matt?

"Some people have asked, "does this apply to newer sites?" Essentially, the way to think about it is, around 2003 Google switched to a new method of updating its index. Before that we had monthly Google dances. So as a result, new data is always being folded into the index. It's not like there was one pivotal moment when anyone can say, "Hah! This is the change!" In fact, even at different data centers we have different binaries, different algorithms, different types of data always being tested.

"I think a lot of what's perceived as the sandbox is artifacts where, in our indexing, some data may take longer to be computed than other data."

Your Feedback Needed

With my sample set of one site my current hypothesis might be out to lunch. If you have any sites that you feel were hit and want to share them to help everyone figure out what is going on, please do so in the comments below. If you have any ideas or feedback on what happened, please leave a comment with that too.

Comments

Hi Aaron,
I've noticed this with one of my sites as well and, as far as some of the discussion goes, a lot of people seem to be saying that this is just testing.

My site has several old backlinks (with focused anchor text), however, none of the on-page factors are aggressive by any means.

I was discussing some of the upcoming issues surrounding Knol with a Googler and asking, "Since the model is: I make a 'Knol page' with AdSense on it and make money, in order to make money I have to rank right? So how are you going to work the Knol pages into the current rankings to give me value without compromising the SERPS?"

Googler's response: "Well, we'll just have to seed the pages with rank..."

The short of it is they are willing to compromise long-standing results...
Now maybe this is not just for Knol's sake, but for the sake of recent/mixed content. I'm sure that blended search/Knol has a part in it, though.

1. One-year-old website
2. Ranked #1 for 2 niche terms with low competition for more than 6 months
3. Another, more competitive term at #6 for the last 3 months

In mid-December my #1 dropped to around position #6, fluctuating: sometimes back up, sometimes around #6, and sometimes even down to positions #30-#40.

Conditions

* The site ranked entirely on SEO. No PPC budget and no brand recognition.
* The site was still getting some backlinks, but the quality could be questionable - paid links, though relevant.
* All 3 terms I was ranking for had lots of links with the same anchor text, with only small variations present.
* All the traffic went down, not only for these 3 terms. My brand name - which is a generic name - also ranks at #6.
* I use Google Analytics and other Google products heavily. The site was interlinked with some of my other sites, but those have not been penalized.

Good post... One of my sites got hit too and I don't know why, since I pretty much do the same SEO stuff for all my sites and only one got hit. My site doesn't even rank for my domain name; only if I put it in quotation marks do I rank on page 3. Totally blows... I was ranking for some pretty good keywords in my industry, and interestingly enough, the same day Google unleashed its wrath at poor-itty-bitty me, Yahoo decided to give me some love where I was not ranking that well.

The only thing I can think of that could have caused this is that my site was listed on several of the directories that got hit in early Q4. I have also noticed that my site has been picked up by some spammy scraper sites (something I don't have control over, do I).

I have definitely seen huge changes for a couple of sites, including one beneficiary and one that has been slammed very hard. Much harder, in fact, than your site has been. They moved from 1-2 for a core phrase to #11.

I haven't taken the time to analyze as thoroughly as you have, but I think you are correct that Google's algo change punishes sites with optimized external anchors.

I would be happy to share the ranking data for both sites offline if you want to add it to your analysis.

I've been seeking a cure for this penalty for weeks now and still can't find anything that could remotely cause it. One side of me wants to point to overly optimized, non-varied anchor text, but then I'll find pages on my site affected that aren't optimized at all.

If anything I'm leaning towards user behavior being analyzed by Google. The pages that haven't been penalized have a much longer "visitor time on page", but even this doesn't explain why my competitors haven't been hit with the penalty.

For my site that was hit, the penalty was not page specific so much as keyword specific, i.e. my rankings cap at #6 for some keywords, and then the same page ranks at #1 for slightly different related keywords.

There must be slightly different filters and variations I guess. Maybe your pages haven't been hit as hard with the penalty.

My affected pages won't show above #6 no matter what. Allinanchor, allintitle, searches in quotes, and even a "keyword keyword mysite.com" search still display the page at #6, with the exception of the root domain and homepage title searches! It's quite amazing and frustrating. I mean, if someone was looking for your page by typing in a keyword and your domain, you'd think Google would respect that as a legitimate effort to find your site's page and show it at position #1-3 at least.

If it is a filter and not just a bug, they've got some major work to do on it. Now that the holidays are over I'm sure more webmasters will be noticing this newfound penalty/filter/bug, whatever the F#$* it is.

I don't think my case is because of anchor text, since I always vary my backlink anchor text between my website name, brands, and products that I sell.

The only thing I can think of that G doesn't like is that I am active on forums (with my signature at the bottom of every post) and I also comment on blog posts.

Other than that I don't know, but like I mentioned in my previous post, my site has been picked up by a lot of spammy scraper sites... I think it has more to do with this than with being active in forums (I could be wrong; if you know otherwise, please let me know), but how do we prevent these spammy sites from linking to one's site?

As a clarification Aaron, links worked the first time around for a site (I didn't buy them, I just pointed some links to it from sites I play a role in... might want to update that); it has not yet worked for the second or third instances for different sites, though there are 10 days left to match the old pattern.

I did note later in the thread that I had a little fun with a pseudo-competitor in a different industry, bumping his site from #7 to #5 once I noted that another site I played with went from #7 to #5, skipping the locked #6 position.

I can't yet agree with you on over-optimization for the phrase, when one of the sites I see hit doesn't even use the phrase in the title (but does in the root domain, metas, and in-text). I'll be watching this closely.

We optimize a lot of sites, and yes, some went up and some went down, but I haven't seen anything that would indicate some kind of #6 filter. I know that isn't a conclusive test, but that would be one of the stupidest filters imaginable and I doubt it's done that way.

But that's just my opinion.

Now, I could see a "let's make the criteria slightly different for above-the-fold sites" approach (taking clickthrough rates more into account, etc.), perhaps, but some sort of a -5 penalty would be stupid.

I can verify that I also had a site which ranked on the first page for hundreds of search terms for almost 1 year that has now been all but deindexed. I don't know why; I have sent a request through Webmaster Tools...

Are there any resources beyond Webmaster Tools? Are Google-defined penalties documented anywhere, or are these just webmasters' best educated guesses?

I am experiencing the same problem over the last 2 months at my company, a Fortune 50 retailer. We ranked #1 for several top-1000 single-word keywords that we now rank #6 on. I wouldn't mind running some theories by you that you don't have listed. I don't have your email, so hit me up at [brandon at holeyduck dot com] if you want.

My suspicion would be too much of the wrong form of link format - i.e., in-content (editorial) links doing fine, as opposed to sitewides, sponsorships, and general paid link formats acting as a potential flag when present in too unbalanced numbers.

I am also wondering if this is going to prove specific to established sites.

Another pointer - potentially keyword specific or else there is a phrase relationship, as some keyword areas do not appear affected. Will look more carefully at these over the weekend.

I had a site with PR 6. Mostly inactive for the past year, certainly from a link building perspective. It's recently dropped to PR 4, though I haven't checked rankings. That said, it targeted various keywords mostly with individual pages, and those seem to still be pulling in the same traffic. Which it should - I did original, academic research for school and then published it on the site.

I can find one term that I previously ranked #1 for, i.e. "Free Directory List" that I am now sitting at #6 for. Funnily enough that term is a by-product of longer terms that I was attempting to rank for a long time ago, namely "SEO Friendly Free Directory List". Other parts of that term are not affected (yet).

So "SEO Friendly", "SEO Friendly Directory List", and "SEO Directory List" all still rank as per normal.

However, the "Free Directory List" term is a very generic term. All 3 words in that phrase are highly generic and probably used across a range of sites. There are 21.5 million results returned for that term, but I am sure that years ago that was more like 350 million.

There was another term "Free Web Directories" that I ranked at #1 for prior to the Google paid link penalisation, and haven't ranked higher than #6 for since. It might be a bit far fetched aligning that with this phenomenon though.

Customerstreet is a network of directories that show business listings in the UK. We have noticed that some of the newer pages that were added in December dropped like a stone on December 15th; having said that, our overall stats for the directories are now the highest they have ever been.

We are looking at about 10 million page views a month, so we are not too concerned, but we would like to figure out why this has happened to certain pages, as we are looking to push our brand in directories to the next level in 2008. I'm glad it was not just us that noticed this pattern. It's not about what we have done, then; you win some, you lose some. We want to win!!!