Google SEO News and Discussion Forum

There are always claims of false positives when Google does something with the algo. A lot of these claims are incorrect, the result of webmasters not looking at their sites harshly enough to see what the algo is seeing. I was never convinced there were any false positives until Panda hit people I know, whose sites were extremely unique and creative, nothing like "content farms." But they were, we agreed, perhaps a bit "over-optimized" - targeting hot keyphrases and so on - which we somehow thought might be part of Panda.

But supposedly that's not part of Panda, because now we know it's part of Penguin. Right? Or is that right?

When Panda came out, and we heard it was about thin content, I knew one of my sites was vulnerable. I was working slowly to improve the content, but my focus was on other sites so I was just waiting for Panda to catch up and hurt me. I had never been any good at link building, so all my links on all my sites are natural. On this particular site, my backlink profile was really weak because in that niche, people are very stingy about giving out free, natural links. I thought that might get it in trouble with Panda, too.

To my shock, Panda left my site alone, but Penguin got me - on my least optimized site. Yes, I realize I'm self-reporting and you have the right to be skeptical.

One possibility: my Penguined site is over 6 years old, so I have the sort of unsolicited spam links everyone accumulates over time, like updowner. I only have a handful of quality links, because of my lack of skills at link building of the white or black hat variety. Could Penguin be mistaking weak profiles on older sites for spammy profiles?

Or are Penguin and Panda intersecting and overlapping in ways we still don't understand? Why did people with unique and interesting content get Panda slapped, and why did my least optimized site get Penguin?

Why did people with unique and interesting content get Panda slapped, and why did my least optimized site get Penguin?

Mostly because our understanding of both algorithms is grossly simplistic. For example, Panda took Google over a year to develop, and yet SEOs talk about it as if it were an addition to the regular algorithm, one that looks at just a couple of factors in the normal way. It's MUCH more complex than that!

Penguin was built on a similar machine learning model, you can be certain. And we will not be able to reverse engineer those two, IMO. Find some loopholes, yes - but that's not reverse engineering.

Very true. But I do take at face value Google's claims about what each algo is trying to achieve: i.e., that Panda's about thin content/farms and Penguin is about aggressive SEO.

Using my site as an example because it's the only one I can get stats for and so on: my Penguined site was IMO ranking higher than it should have been, and I never understood why. Penguin really just corrected those rankings, which is fine, but also something I don't really understand because of Penguin's stated goal.

--Three years ago, Google was ranking all my content okay. Not too impressed, but finding it somewhat valuable on many queries. I concurred.
--Around the end of 2010, Google suddenly decided that some pages from one of my categories were golden, and ranked them highly on some competitive keyphrases. The pages it had picked were among my most mediocre, IMO, and from a category that my visitors show only mild interest in (it's my third most clicked category by visitors already on site). This is the algo change that puzzles me most, grateful as I was for it. I hadn't gotten any great links to these pages. I believe the algo was responding to on-page factors, but I don't know which ones.
--Penguin time, and suddenly all those high ranking pages stopped ranking so well. Again, I see this as a reasonable correction, as Google figuring out the pages weren't that great, and am only puzzled about why it happened during a "spam" punishing algo.

We can think of Penguin as demoting pages that were mostly ranking because of SEO efforts rather than their inherent value. I think Penguin looked for both positive and negative signals and then mixed them up into a single "Complex Cocktail".

Yes, but mine weren't ranking for SEO efforts - I didn't make any. They were certainly ranking for *something* other than inherent value, LOL, but it wasn't SEO. If I'd made some effort to SEO that site, I probably would've had stronger backlinks, and who knows, that might have saved it from Penguin.

I had sidestepped Panda by making improvements for users, but Penguin mistook all those changes for SEO attempts. At least, I guess that's what it was.

I responded....

Without taking this excellent thread off topic... what types of improvements "to please visitors" do you think might have sent false Penguin signals to Google? ....I'm not seeing how steps to increase user engagement within a site are likely to create signals that would have triggered Penguin.

Your subsequent description of what you'd done contained a comment that jumped out at me, but I had to let it go at the time, both because of time pressures and because I really did not want to turn the Exit Rate discussion into an examination of your methodology. I feel, though, that such an examination would be appropriate here. Your comment (in msg#4470250) was...

--I combined some high exit rate pages into single awesome pages. This typically mirrored the common post-Panda SEO practice of combining "thin content" pages into longer pages and 301 redirecting them, because as it happened, most of my high exit rate pages (back then) were rather weak.

What particularly struck me was your comment about using 301s. Inadvertent misuse of these can indeed send signals that look like spam to Google. Couple 301s with your goal of using them to create "single awesome pages", and I think you've highlighted an area that's worth further examination.

If the problem is Penguin (and I'm not sure if you've confirmed that), then possible over-enthusiastic application of 301s, even when redirecting pages within a domain, could well be the source of your trouble.
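For anyone following along, a within-domain consolidation of this kind is usually just a couple of lines in .htaccess (Apache's mod_alias). A minimal sketch with hypothetical paths - these are not the poster's actual URLs:

```apache
# Fold a narrow page into a broader one with a permanent redirect.
# Visitors and (most) link equity follow the 301 to the new target.
Redirect 301 /painting-widgets.html /widget-care.html

# Several thin pages can point at one consolidated target, though
# heavy many-to-one redirecting is exactly the pattern being questioned here.
Redirect 301 /sanding-widgets.html /widget-care.html
```

The mechanics are harmless in themselves; the question raised above is whether doing a lot of them at once looks, to an algorithm, like link sculpting.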

As for Google's fallibility... it's very possible that there were a fair number of false positives on Panda (though I myself haven't seen any), but I don't think I've heard of any false positives on Penguin. Not to say it isn't possible, or that you haven't been the unlucky one, but I don't think it happened very often.

Aha! You're talking about doing true optimization, not just knee-jerk obedience to old school "SEO" tips.

Well, that's definitely what I try to do. I read more marketing theory than SEO theory, and focus on targeting the right visitors and keeping them once they arrive.

If the problem is Penguin (and I'm not sure if you've confirmed that), then possible over-enthusiastic application of 301s, even when redirecting pages within a domain, could well be the source of your trouble.

Thanks for this suggestion. I redirected 10-15 pages this way - most people I've read speculate that this tactic is okay as long as you don't do it "too much" (whatever that is), so I kept it to a minimum. That said, volume might not be the only thing Google looks at with 301s. Maybe they felt some of the pages weren't relevant enough to the one they redirected, for example. I feel they were, and I feel I did the right thing by users, so if that's the issue, I guess I'll just have to ride it out, build other traffic sources, and hope that other improvements eventually draw Google back in.

As for confirming it was Penguin: all I know is, the timing was exact. I noticed a big drop on April 25th, so big it sent me here to see what was wrong. While I had stated I saw a drop on the 19th (date of the Panda update), I've just noticed it really doesn't look like a drop at all when I look at two months of my stats - but the 25th drop still stands out like a sore thumb, even at three months of stats.

I guess false positives are somewhat in the eye of the beholder. Obviously I did something that triggered Penguin. I just know that my *intent* was certainly not to spam or "aggressively SEO" Google or any other engine, which is why I would call this a false positive - even though I don't disagree with the changes in rankings. (Though I must say, some of the sites outranking me are worse or no better, LOL.)

Thanks for that thread, Zivush - I did read it recently, but had already done my 301s before that.

The reason I did the 301s was not for SEO purposes, but because some of those pages got enough traffic that I didn't want visitors getting 404s when their topic of interest was still available on my site.

Which brings up another point I didn't think to mention: I removed quite a few pages in 2011 - close to 40%. All those 404s didn't seem to hurt the site at all - it kept flourishing. Until Penguin anyway. Dunno if it would see 404s as spammy (I can't imagine why, but then I don't know everything spammers get up to).

I have 4/24/2012 noted as the Penguin launch date, so the 25th is in the range. For Panda, I've noted the April dates as 4/19/12 and 4/27/12. We do seem to be talking about Penguin.

I redirected 10-15 pages this way...

Please clarify this, as I can read it several ways.

Simplest way to put it would be (on the pages where you used 301s to consolidate) how many source pages did you redirect to how many target pages? It would also help to know what was your largest number of source page redirects to a single target page, and what the average number of redirects to a target page was. I just want to get a sense of what was going on.

In the other thread, you express lack of awareness of inbound links, btw, yet there's no point in redirecting pages that don't have external inbounds. What checking did you do on backlinks for these pages you redirected? Even though the bad ones may have been naturally come by and thus not your responsibility, it may appear to Google that you laid claim to them when you redirected them.

I just know that my *intent* was certainly not to spam or "aggressively SEO"...

I understand what you're saying, and I'm not arguing. Understand, though, why I picked up on "single awesome pages" as perhaps descriptive of a motivating force behind what you were doing, even though you're clearly also aware not to overdo things and to maintain relevance.

The rate at which many diverse changes happened could have also been a factor. Normally, this would only be a consideration when you also change domain names. Any chance there were too many traditional signals of optimization, done in the name of clarification for the user? I'm trying to get you in a self-critical mode, as I don't think it's useful to expend energy blaming Google for the problem.

In many cases, I tell clients who feel they can be cryptic that Google is only a machine and that it needs help. In your case, Google may have noticed too much perfection... too many factors brought into alignment... which I think is your theory. ;)

It may also be that some of the factors inadvertently involved redirection of iffy links. If so, it doesn't matter what your actual intent was. What matters is what it looked like to Google.

Robert Charlton, re: self-critical mode, I'm in it, and I'm not "blaming" Google, just trying to find the disconnect between my intent and its perception of me as a spammer (IF that's what Penguin necessarily means). I WANT to know if I'm making mistakes, so I can stop making them and make more money instead. :)

Answering your questions re: my 301 redirects.

--I had 10-15 301s altogether, some of which I deleted once I was no longer seeing traffic to the old URL. (More below.)
--The vast majority were one page redirected to another page. There may possibly have been a single case of two pages redirecting to one, I can't remember for sure.

The way it came about was this:

I was deleting hundreds of pages because they either weren't great quality or they weren't so relevant to the site's niche (when I started this site, I didn't really have a plan). I was also rewriting a lot of pages to improve them.

But now and again, I'd realize I had two pages where one nearly duped the other. Say, I had a page on cleaning, painting and sanding widgets, and another page that went into great depth on painting widgets. In hindsight, this struck me as sloppy and potentially irritating to users, so I incorporated the in-depth painting details into the more comprehensive page, and redirected. Because I wasn't thinking about SEO, I did not check inbounds for those pages. I just made sure they were getting enough traffic - typically all from search or social media, going by my stats - to warrant a redirect at all.

And then I deleted the 301s as soon as I saw the old URLs weren't getting more than a handful of hits, and the new ones had "taken" in SEs and social media.
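As an aside, the "check hits before dropping a redirect" step described above is easy to automate. A rough sketch, assuming a common-format Apache access log; the function names and the hit threshold are my own invention, not anything from this thread:

```python
import re
from collections import Counter

def hits_per_path(log_lines):
    """Count requests per URL path from common-format access log lines."""
    counts = Counter()
    for line in log_lines:
        # Matches e.g. '... "GET /old-page.html HTTP/1.1" 301 0'
        m = re.search(r'"(?:GET|POST) (\S+) HTTP', line)
        if m:
            counts[m.group(1)] += 1
    return counts

def redirects_worth_reviewing(log_lines, old_paths, threshold=5):
    """Return redirected paths that got fewer than `threshold` hits
    in the log sample - candidates for removal review."""
    counts = hits_per_path(log_lines)
    return [p for p in old_paths if counts[p] < threshold]
```

Note that raw hit counts can't see the other half of the picture: a 301 that gets no visitor traffic may still be passing link juice from external backlinks, which is exactly the risk with deleting them.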

Now, you ask if I may have been seen as making too many optimizations. You're right, that is my preferred theory, though I'm totally open to others. I can see how some of the actions I took with the intent of improving the site for visitors could bluntly match actions "aggressive SEOs" take.

Another theory of mine, which I don't think I've voiced before, is that the mysterious reason those mediocre pages were ranking so well had to do with some spam-like tactic I was unintentionally doing by copying competitors.

I'm not "blaming" Google. I have long believed that the algo just can't be perfect, so you will always have some crap sites doing well and good sites getting buried, and maybe my site has been both at different times! That's why I chose to focus strictly on building for people, and developing other traffic sources. I just find it ironic that a penalty described as being against spammers can nab a site run by someone who was deliberately ignoring SEO and focusing strictly on visitors.

We can think of Penguin as demoting pages that were mostly ranking because of SEO efforts rather than their inherent value. I think Penguin looked for both positive and negative signals and then mixed them up into a single "Complex Cocktail".

I don't think there is a better summary statement about Penguin anywhere.

Another theory of mine, which I don't think I've voiced before, is that the mysterious reason those mediocre pages were ranking so well had to do with some spam-like tactic I was unintentionally doing by copying competitors.

MarvinH, I meant copying things they were doing, not copying their work. Like, if they were writing titles a certain way and doing well, I might start writing my titles the same way, only to find out later it was an SEO tactic rather than something that attracted visitors.

We can think of Penguin as demoting pages that were mostly ranking because of SEO efforts rather than their inherent value. I think Penguin looked for both positive and negative signals and then mixed them up into a single "Complex Cocktail".

Your second point, which I think is a great one, made me wonder: if a website has good content and is also over-optimized, is the over-optimization weighted more heavily? Would a couple of areas of over-optimization make Google say the content is good, but the site is over-optimized?

And then I deleted the 301s as soon as I saw the old URLs weren't getting more than a handful of hits, and the new ones had "taken" in SEs and social media.

You said you did not check the backlinks to these pages. Did you remove the 301s before your drop? What I am getting at: is it possible the drop happened because of a loss of link juice when you removed the 301s?

I recently changed domain names - everything but the logo and URL stayed the same....

Immediately, I'm wondering whether the above post is asking about the same site we're discussing. Making the kinds of multiple changes you've described and changing your domain name anywhere close to the same time can cause Google to suspect it's being gamed.

This alone could have been the problem, though I have no idea whether this is the kind of thing Google might have waited until Penguin Day to act upon.

Another (remote) possibility is upstream links discounted on Penguin Day, having nothing to do specifically with you except that they were upstream of you, could have caused a drop. This probably would have required your having many eggs in one basket, and I don't know whether that describes your situation. (Having many eggs in one linking basket would likely also be a problem all by itself).

I changed domain names weeks AFTER Penguin. What I've mentioned in this thread is strictly the stuff I did BEFORE Penguin, and I have left nothing out. (I had been wanting to change domain names for branding purposes, but putting it off for fear of losing rankings. After losing the rankings, I figured no time like the present.)

As for aakk9999's excellent question, yes, I would've lost link juice, but I doubt those pages had much to speak of. They were ranking only on very long tail searches, and not in #1 positions even then. I didn't run them through a pagerank checker or anything, however.

Ironically, someone had suggested to me that having a lot of individual page 301s in place looked spammy to Google, LOL, which is why I deleted them. I'll leave in place those I still have in htaccess.

For a while I have been using the keyword | keyword | keyword format in my titles, which historically has always worked. I manage around 25 websites and generally speaking Panda and Penguin haven't caused much of an impact. All sites have good unique content (although most are commercial). The interesting thing is that one site does appear to have been hit, so I de-over-optimized the on-page meta, which has made no difference, whilst another site that was previously at position 5 is now at position 1, still in its over-optimized state.

So I wonder whether the content is being weighed up against over-optimization. The difference between the 2 sites is that even though they are commercial, one is a buying site (de-overoptimized, suspicion of Penguin) and the other is informational (overoptimized, recently promoted).

This to me suggests some weighting in favour of what might appear to be a content based information site vs what is clearly a money site as the information site is unashamedly over-optimized and the other isn't.

As a side note, both sites are built by the same webdevs, sharing the same CMS, server, etc., and are structured in pretty much the same way.

I have two sites that are very similar (different niches) information sites. One got Penguined, the other didn't. On the non-Penguined site, I make significant income from a single affiliate link to a single product that appears on several pages, because I used that product to do what I described in those pages. On the Penguined site, I have carefully selected affiliate links which I believe are of interest and use to visitors (like, if you're going to talk about a book people might like to read, go ahead and put the Amazon aff link to it so they can see more reviews and possibly order it conveniently), but they're more varied and not necessarily representative of products I've used myself. I'd say affiliate links only appear in maybe 2% of all my pages, but most of them are not preceded by language like "I bought the X I used to do Y at [affiliate link]" like they are on the other site.

While both of my sites are info sites, could it be that Google is seeing the info site with random affiliate links as a thinly disguised aff site, and the other site as more genuinely editorial? We know the algo looks at the text around a link - what if textual indications that your affiliate link is genuinely editorial and not "just" sales can earn you a free pass (all else being equal)?

To answer Tedster's question, the only thing I did with the meta was the title, which unfortunately was an annoying global title applying to all pages. I got the webdevs to make it flexible so it pulls in the category titles and applies them. But this is the ONLY change on-site. We also re-balanced the anchor profile, changing existing keyword anchors to brand anchors. Still no recovery yet. Again, our other informational (but commercial) site has not been touched.

Diberry, I think that is an interesting question. Contrary to the theory, I was reading a post by Aaron Wall on his private forums where it was suggested that weight may even be given to linking out to quality sites. If your outbound links are to carefully selected high-quality sites, it would be worrying to find you'd been Penguined because of it - because Google thinks you're trying to be smart.

And herein lies the confusion: we both have observed a situation where Penguin appears to be working against its own concept. I'm not always a skeptic, but I wonder if there is a degree of misdirection in Google's official line, getting us all focused on de-optimizing our sites so that Google doesn't have to work harder to find those that ARE good at SEO. Kind of like handing us a loaded gun and saying we're the enemy.

If your links are carefully selected high quality sites, it would be a worry to understand you have been penguined because of it.

This is actually an interesting point. *I* think I pick high quality sites to link out to, but I try to avoid Google's top ranked sites because my readers don't need my help to find those - I aim for smaller, less well-known sites that are creating stuff I believe my visitors will value. But I look at sites like a human being, not an algo. It's probable some of the sites I link to don't have much of a backlink profile, because they're just little people contributing to the web and they probably have never heard terms like "inbound link." So who knows how the algo might see it?

But OTOH, if I carefully only picked sites that do well in Google and have strong link profiles, then I would be doing aggressive SEO, wouldn't I? ;)

I'm not always a skeptic but I wonder if there is a degree of misdirection in Google's official line, getting us all focused on de-optimizing our sites so that Google doesn't have to work harder to find those that ARE good at SEO.

That's definitely a possibility - after all, they don't owe us the exact truth about what they're doing. They don't even owe us a public statement.

Tallon is talking about competitors that have gotten hit by Penguin, and says:

What they all have in common: Maybe less than 20% of their entire site content could be considered truly fresh or truly unique. It's mainly a regurgitation of what's already on the web.

Now, I'm not that bad, LOL. When I'm writing about a popular subject, I always try to add something unique to it. But looking at some of my pages, I can see Google thinking it's not unique enough. And if too many of my pages were "not unique enough", then it would follow that I would get affected by Penguin (spam) rather than Panda (thin content).

I've been wondering for a while if the point of Panda and Penguin had to do more with unique content than thin or even spammy content. Although, I'm still not sure about Ehow. Does it rank because even though it copies others, way more sites copy it? Maybe.

Since Penguin, I've deleted a lot of these types of pages, but I'm not sure where the uniqueness threshold is set. Rather than delete more, I think I'll have to start creating some more 100% unique content, and that's difficult to do in this niche because there's just so much written on it. Basically, the direction I've been taking since last year was right: looking at user metrics, and trying to improve pages according to visitor response. It's just that there's a lot of work yet to be done, and figuring out WHY visitors dislike a page is tricky and I don't always get it right.

I wonder myself about the theory of unique content - isn't everything we know just a regurgitation of another person's ideas/content?

Well, that's the thing - there's nothing new under the sun.

The thread I linked to was started by someone who has recently started seeing re-written content actually push his or her content up in the SERPs. It's worth a read.

I don't know what Google wants, but then again, my focus is more on what visitors want. I'm currently looking at the top sites for the topics of my pages, seeing what they say, and making sure I add a unique twist. It's like, a recipe is never unique, but the recipe combined with the story of how you cooked it the other night, what almost went wrong, and how your guests liked it, accompanied by original photos you took, is suddenly unique. I'm not sure how to apply that model to other types of information, but that's what I'm working toward. Because I know it's not just Google that's tired of seeing the same info over and over - visitors want something different, too.