How Accurate is Google Analytics?

Without a doubt, Google Analytics is the most popular web analytics tool in use today. It’s free, easy to set up, and fairly simple to use, so independent bloggers and the owners of small sites use it, and it’s powerful enough that many large websites don’t feel the need to pay for other packages. Millions of individuals and companies rely on GA figures to track the progress of their websites. They make business decisions based on what Google Analytics tells them.

Webmasters can track visitor numbers, popular and unpopular pages, time spent on site or on a particular page, returning visitors, traffic sources, and much, much more. However, it would be a mistake to assume that the numbers GA provides are perfectly accurate. Every professional web analyst knows that they just aren’t. They can’t be, for a number of reasons.

First and foremost, Google Analytics (and many other web analytics packages) relies on JavaScript tags to track visitors. It’s estimated that somewhere around 10% of today’s internet users browse with JavaScript disabled. That means that, on average, one out of every ten visitors is completely invisible to GA.

If you compare the visitor numbers tracked by GA to those tracked by your web server (and take care not to include visits from automated crawlers), there is almost always quite a significant discrepancy. GA underestimates visitor numbers.
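As a rough illustration of that comparison, here is a minimal sketch that turns two visit totals into an undercount rate. The function name and the sample totals are hypothetical, and it assumes you have already removed crawler traffic from the server-log count.

```python
def undercount_rate(server_visits: int, ga_visits: int) -> float:
    """Fraction of server-logged human visits that GA did not record."""
    if server_visits <= 0:
        raise ValueError("server_visits must be positive")
    return (server_visits - ga_visits) / server_visits


# Hypothetical monthly totals (assumed values, crawlers already excluded).
server_visits = 1000
ga_visits = 900

rate = undercount_rate(server_visits, ga_visits)
print(f"GA missed about {rate:.0%} of visits")
```

A positive result means GA reported fewer visits than the server logs did.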

There are also difficulties with the counts of unique and return visitors. For a start, it would be a mistake to assume that a repeat user will always use the same computer. GA has no simple facility for identifying individuals and tracking their logins, so someone who visits from both a home and a work computer is counted twice, and unique visitor counts may be greater than they should be.

Then there are sampling issues. Confronted with a huge dataset, Google Analytics (and, to be fair, some other packages) samples the data rather than examining it exhaustively. In most cases data sampling is statistically sound, but it does introduce further error.
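To see why sampling adds error, consider this minimal sketch of estimating a total from a simple random sample. It is only an illustration of the general idea: the function name is hypothetical, and it makes no claim about how GA itself selects or scales its samples.

```python
import random


def estimate_total_from_sample(values, sample_fraction, seed=0):
    """Estimate sum(values) by scaling up the sum of a simple random sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample_size = max(1, round(len(values) * sample_fraction))
    sample = rng.sample(values, sample_size)
    # Scale the sample sum by the inverse of the sampled share.
    return sum(sample) * len(values) / sample_size


# Hypothetical data: pageviews per session for 10,000 sessions.
data_rng = random.Random(42)
pageviews = [data_rng.randint(1, 10) for _ in range(10_000)]

exact = sum(pageviews)
estimate = estimate_total_from_sample(pageviews, sample_fraction=0.1)
print(f"exact={exact}, estimated from a 10% sample={estimate:.0f}")
```

The estimate will usually land close to the exact total without matching it, which is the kind of small extra error sampling brings.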

So, is Google Analytics 100% accurate? No way. But is it useful? Absolutely. In most cases, what webmasters and managers need to know is not an exact figure. It’s more about the change from one time period to the next. Let’s say you have a website and want to know how it’s performing overall. If GA’s estimate of visitors is 20% greater than it was last month, you can safely say that traffic has probably gone up by about 20%. You may not have an exact figure unless you go looking at the server logs, but you’ll know you’re doing pretty well.

Even if the figures shouldn’t be regarded as precise, GA is still a great way of tracking changes. A fraction of visitors may be invisible or slightly mis-identified, but as long as those fractions stay consistent over time, you’ll get good-quality information about what’s working well on your site and what’s not. You’ll still be able to track which sections are doing well, which landing pages perform best, and where traffic is coming from.
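The arithmetic behind this can be sketched in a few lines. The visitor counts and the 90% visibility figure below are assumed values chosen for illustration; the point is only that a constant undercount cancels out of a percentage change.

```python
def percent_change(previous: float, current: float) -> float:
    """Percentage change from one period to the next."""
    return (current - previous) / previous * 100


# Assumed values: true visitor counts and the share of visitors GA can see.
visible_fraction = 0.9
true_last_month, true_this_month = 10_000, 12_000

# What GA would report if it consistently saw 90% of visitors.
ga_last_month = true_last_month * visible_fraction
ga_this_month = true_this_month * visible_fraction

true_growth = percent_change(true_last_month, true_this_month)
ga_growth = percent_change(ga_last_month, ga_this_month)
print(f"true growth: {true_growth:.1f}%, GA growth: {ga_growth:.1f}%")
```

If the visible fraction drifts from month to month, the two growth figures no longer agree, which is why consistency matters more than the absolute level.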

The best thing to do is think of Google Analytics figures as metrics, not exact numbers. ‘Visits’ should be considered a measure of visitor numbers, not the exact number of visitors, for example. That way you can work without ignoring the inaccuracies in the data, but still get meaningful, useful figures out of Google Analytics.

You make a good point. The exact numbers aren’t as important as having a general idea of how far your reach is and who’s stopping by. I have often become frustrated when I KNOW Google Analytics is not calculating visits accurately. I’ve noticed this especially with the “time spent on site” stat. I’ve learned not to trust it completely, but it’s a great tool to help me see how I’m doing.

If you don’t like using Google Analytics you could always try Piwik, which is a free self-hosted alternative. There is also a Piwik Plugin for WordPress available which lets you view your stats right in your dashboard.

Hi Kimi,
What analytic program are you using? I find GA quite simple and it provides some helpful stuff like Adsense info, so I still prefer it to others.
Btw, I can’t access your blog posts now, it says “Proxy Access denied”.

Thank you for the article. But I’m confused. You mention that GA loads using JavaScript, but GA tracks the number of visitors to your site that have JavaScript enabled. How is it able to do that if those visitors are invisible to it?

I’m certainly not saying that you’re wrong – I’m just wondering if you could clarify. Thanks.

Thank you for providing such an informative article. I’ve been puzzled for weeks by my website’s traffic. Using some plugins on WordPress, my traffic is always much higher than what I see on GA. I initially thought that there was no way GA could be wrong, so maybe my issue was with the plugins. I soon realized that it didn’t matter what plugin I used, the traffic was always much higher. I always excluded the spiders and spam and the numbers were still higher. Thank you for finally helping me get to the bottom of this.

@Noah – If a visitor with Javascript enabled on their browser visits a website with GA on it, it counts as 1 unique visitor. If a person with Javascript disabled on their browser visits the same website, GA will not recognize the visitor (and so this does not appear on your GA stats).

Thanks for the article. Although it would be really cool to know exactly HOW MANY visitors my websites have, not just a general idea. :( The search continues…

The only problem I have is with the avg. time on site. My bounce rate is pretty low (45%) by the standards of my blog. However, the average time on site is very low, and it is surprising that my bounce rate is low even though readers hardly spend a minute or two on my blog. I’ve tried my best to filter out my own visits and automated crawlers. Added to that, I get a tremendous amount of traffic from StumbleUpon, and there the avg. time is again very, very low and yet the bounce rate is low too. I don’t understand this at all. I really hope you can explain this.

Yes, that’s right, Aditya.
Traffic from StumbleUpon always has a low avg. time on site because its users are not targeted. If you use SU, you will see this issue as well. Sometimes you stumble on an irrelevant post and you pass it by in just a few seconds. Therefore, I always consider traffic from SU low quality.

That explains my query to some extent. Thanks :) My blog recently got a PR of 3 and traffic from organic sources like Google has picked up and the avg. time on site from Google is pretty high and the bounce rate too is low. I hope this will be able to cover up for the low quality traffic from SU. Thanks again!

You make some good points, but I often wonder why GA regularly reports rounded-off figures. It seems a little fishy, but where I work it’s the only way of gauging changes in organic search at any level of detail.