Getting computers to do stuff so we don't have to

SEO for Large Dynamic sites – II

Why did my traffic go up? – Keywords

The method Google uses (from what I have seen in my experience) to penalize you for a low-quality website is to reduce the number of pages of your site that it stores. Google saves my pages? Well, yes and no. It does save a cached version so that users can see the page if your site crashes, but that's not important. What is important is that Google saves a searchable version of your site. It finds all the keywords on your page, then compresses and indexes them. This allows for the blindingly fast speed we have come to know and love about Google Search.

Where am I going with this? OK, so the more pages of your site that are indexed by Google, the more keywords Google has that are related to your site. The more keywords, the more users will find your site; the more you are found, the more traffic comes your way. (This is the answer to the question at the end of the last post, SEO for Large Dynamic sites.) This only applies if the pages have different content and different keywords, obviously.

So, knowing this, you can see how having fewer pages indexed is a penalty from Google. To avoid the penalty you have to conform to Google’s guidelines. They are worth the read.

How do I know if Google likes me? – How can I find out how many pages are indexed?

There are two ways I have found to tell whether Google is liking you more or less over a certain period.

First – Page Index

A number which many refer to as the page index (explained here from an earlier post) is, as I explained before, the count of pages Google holds in its search index. When you search for something in Google you get an estimated number of results back. It’s cute, and sometimes useful when you see only a couple of results. It really comes into its own when you start searching for “site:wetware.co.nz” and seeing how many results are estimated. This estimate is the number of pages Google is keeping a record of. It’s how many pages it “estimates” it has in its index for YOUR SITE. You can see directly how many pages are searchable.

It is an estimate, and I have found it is never more than 90% accurate, but it gives you a bloody good indication of what is going on.

Scenario. Your business is paying the team to add content to the website. The number of pages on your site goes up, way up. Users come to claim their content, so users go up. You now see a steady increase in traffic. Great! Now, after a couple of weeks of steadily increasing traffic, it plateaus. For a week or so you continue adding content, but traffic stays the same. Then traffic starts dropping despite the new content and new pages. (BTW, adding more content to your site will get you more traffic, provided the additional content covers more subjects and more keywords.) Traffic is getting lower as you add more pages. This isn’t right.

The onion is making you cry.

That wasn’t some made-up scenario; that is what happened to me. Luckily I had created a little Rails app which was tracking that special little number. I had made it a few months before this all happened but hadn’t checked it in a little while. So I logged in and checked my stats. Quickly I saw exactly the same trend on my page index app. It followed the traffic trend precisely: up, then into the plateau, then down. I had found a measure which perfectly reflected the traffic on the site. I now had a more direct measure of Google’s impression of the site (compared to traffic numbers, which are more susceptible to fluctuations).
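If you want to put a number on "it followed the traffic trend", one option is a plain correlation between the two series. This is only a rough sketch in Python, not the Rails app I mentioned; the `pearson` helper and the weekly figures below are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and the two sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly numbers: rise, plateau, then decline
indexed_pages = [120_000, 150_000, 180_000, 180_000, 160_000, 130_000]
weekly_visits = [40_000, 50_000, 61_000, 60_000, 52_000, 43_000]

r = pearson(indexed_pages, weekly_visits)
print(f"correlation between page index and traffic: {r:.2f}")
```

A value close to 1 means the two series move together. It says nothing about which one causes the other, so treat it as a sanity check on the trend and not as proof.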

With this I quickly jumped onto Webmaster Tools and found all the 404s and duplicate title tags. I fixed them all (it took a lot of effort; 200k pages take a while to get through, even when they are all generated from content). About a week after fixing some of the problems I saw fluctuations in the page index count. A couple more weeks and it was rising again, and so was the traffic! I managed to save it!

What happened?

The Bad

The Rails app had some bad links which led to 404s

The Rails app incorrectly encoded some characters, which created URLs that didn’t exist: more 404s

The content on the site was subject to changes in the URLs, leading to more 404s and, what was worse, external links to internal 404s (harder to fix)

Removed duplicate title tags. For pagination, just add “Page 1”, “Page 2”, etc. to the titles of all the paginated pages.
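As a rough illustration of the title fix, here is a small Python sketch. The helper names `page_title` and `duplicate_titles` are my own for this example, and the " - Page N" separator is just one possible choice.

```python
def page_title(base_title, page):
    """Return a distinct title for one page of a paginated listing."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return f"{base_title} - Page {page}"


def duplicate_titles(titles_by_url):
    """Group URLs by title and return only the titles used more than once."""
    urls_by_title = {}
    for url, title in titles_by_url.items():
        urls_by_title.setdefault(title, []).append(url)
    return {title: urls for title, urls in urls_by_title.items() if len(urls) > 1}


# Hypothetical URLs: before the fix every paginated page shares one title
before = {f"/widgets?page={n}": "Widgets" for n in range(1, 4)}
after = {f"/widgets?page={n}": page_title("Widgets", n) for n in range(1, 4)}

print(duplicate_titles(before))  # the three URLs grouped under "Widgets"
print(duplicate_titles(after))   # empty: every title is now distinct
```

Running a check like `duplicate_titles` over your own list of pages is a cheap way to spot the problem before Webmaster Tools reports it.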

There is still a problem with permalinks changing (not really a permalink then, is it?) because content headers and titles change. I didn’t fix this because it was a much smaller issue and harder to fix. The fix, BTW, would be to remember all changed content and 301 from the old URLs to the new ones.
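I never built that fix, so take the following as a minimal Python sketch of the idea only. The `SlugHistory` class and the example paths are hypothetical; a real site would keep this mapping in the database and issue the 301 from the web framework.

```python
class SlugHistory:
    """Remembers renamed paths so old URLs can be 301-redirected."""

    def __init__(self):
        self._moved = {}  # old path -> path it was renamed to

    def rename(self, old_path, new_path):
        """Record that content at old_path now lives at new_path."""
        if old_path == new_path:
            return
        # new_path is live again, so drop any stale redirect away from it
        self._moved.pop(new_path, None)
        self._moved[old_path] = new_path

    def redirect_target(self, path):
        """Return the current path to 301 to, or None if path never moved."""
        target = path
        # Follow the chain so a redirect goes straight to the latest URL
        while target in self._moved:
            target = self._moved[target]
        return None if target == path else target


history = SlugHistory()
history.rename("/posts/old-title", "/posts/new-title")
history.rename("/posts/new-title", "/posts/newer-title")

print(history.redirect_target("/posts/old-title"))    # /posts/newer-title
print(history.redirect_target("/posts/newer-title"))  # None
```

Following the chain to the end means a twice-renamed page redirects in one hop instead of bouncing through every intermediate URL.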

Second – crawl rate

The crawl rate of Googlebot can give you an early warning of SEO issues which might affect you in the coming days. It takes a while for Google to react to changes and add your pages to, or remove them from, the index. I will go into more depth about this in the next post.