Updated: Exclusive: A Look at Google Ad Planner Data Vs. Comscore

When Google Ad Planner came out back in June, I immediately thought of Comscore – and I was not alone. Many in the marketing industry thought that Google's product would be a "Comscore killer," and when I noted as much in my coverage, Gian Fulgoni, Comscore's chair, shot back…

Hi John: Before celebrating the availability of these products from Google, I think it would be prudent for web site operators to compare their site traffic numbers as obtained from their server logs (or Google Analytics for that matter) with the unique visitor numbers that Google is now publishing through Google Trends and Ad Planner. I think they will be astonished at how much lower Google now says their traffic is.

But until now, we’ve not had the data to back Gian’s claim. I asked him if he could provide it, and to his credit, he did. The story it tells is certainly not what one might expect. (Of course, the data is from Comscore, so it must be taken as such, but remember, Comscore is a public company that stakes its reputation and its market value on data, so my gut tells me that Gian is not trying to pull the wool over anyone’s eyes.)

A bit of background: Anyone paying attention has noticed that publishers, by and large, believe Comscore’s panel-based measurement system grossly underestimates traffic and unique visitors. As a publisher myself (FM represents more than 160 mid- to large-sized sites, including this one), I’ve been one of the most visible such complainants. And that list is not short. In fact, Comscore and Nielsen are both working with the Interactive Advertising Bureau (I am a board member) on an audit of their practices to verify their methodologies. (Comscore notes that it believes the issue of cookie deletion can cause significant inflation in unique visitors; for more, see this release.)

Given this, the world expected that Google, with its unparalleled access to web-wide data, would validate publishers’ concerns and show that Comscore’s numbers were significantly under-reporting reality.

Turns out, the reverse is true. Gian provided me data comparing Google Ad Planner and Comscore data in two cases. First, for a large sample of 20,163 sites, his shop compared reporting on monthly uniques between the two services. Second, Gian pulled out 5,398 sites that are part of the Google AdSense ad network, and ran the same comparison.

The results are pasted in these two charts (provided to me by Comscore):

What to make of the numbers? First off, it’s quite interesting to see that Comscore measures, on average, a significantly higher number of uniques across all types of sites. Comscore’s numbers are three to three and a half times higher, according to Comscore.

Secondly, for sites that are using Google’s AdSense network, the undercounting is not as dramatic (that’s the second chart). As Comscore’s charts note, there seems to be a “significant bias in Google Ad Planner data” toward “sites that carry more ad impressions from Google.”

In short: If you were a media planner using Google Ad Planner, and you were looking for larger sites, you would be led to sites that are running Google AdSense, on average, over sites that do not. Net net: This data indicates that Google Ad Planner pushes ad dollars to Google sites over non-Google sites. This makes sense – Google has data on Google users, after all. So that data might naturally bias toward Google-related sites.

But as I said in my coverage: “Such a tool must be neutral and not bias advertisers toward buying on Google properties or those that have Google ads, which of course is going to be a perceived bias in any case. Such is the price of being Very Big.”

So far, not so good on this measure. As Gian and Comscore have long pointed out to me, it takes more than raw data to make for good measurement. Ideally, you weight your data with a lot more knowledge of its context – what kind of machine is creating it (work or home? Man or woman? etc.). While Google once blended Comscore demographic data into its ad network, Comscore confirmed to me that this is no longer the case. And while it is subject to endless criticism, Comscore does have a lot more practice at this game than does Google. At least for now.

This data once again raises the question, long asked, of how Google is measuring in the first place. Most believe Google must be leaning heavily on its Toolbar data (see TC for more here and Danny here), and this data does nothing to counter that argument. The strong bias toward Google network sites is suspicious – one can imagine that folks who might install the Google toolbar are clearly already biased toward visiting Google-related sites, for example.

But Google will not acknowledge any use of the Toolbar. Instead it said in its announcement: “Google Ad Planner combines information from a variety of sources, such as aggregated Google search data, opt-in anonymous Google Analytics data, opt-in external consumer panel data, and other third-party market research.”

As I pointed out earlier, I don’t think such coyness can stand. I’ve pinged folks at Google to get a response on this, and as soon as I do, I’ll update this post.

UPDATE: Google has provided a statement to me:

We take the objectivity of Google Ad Planner very seriously in providing advertisers and publishers with a better understanding into online audiences. While we don’t comment specifically on our data collection methods, Google Ad Planner in no way treats AdSense sites differently than non-AdSense sites.

The odd thing is that the monthly uniques in the Ad Planner Audience main listing are plain false (3 to 5x lower than our real figures, with no filters and all countries included), while the Google Trends for Websites and traffic charts in Ad Planner (the same) are very accurate (2% error margin on our 40K-daily-unique website, compared to Analytics data, verified over a two-year period).

I just don’t get how Google can be so accurate on daily uniques over the past years and be so far off on the monthly uniques and page views (3x error for us).

To be honest Google has long lost the unbiased attitude that they once demonstrated. This is not the first time they are pushing up something. They have pushed Google Checkout Merchants in the search results for a long time now.

A more recent example is pushing knol results on top of results from more established and higher ranking sites. As a reference try this search.

Knol results would be fed to you ahead of allrecipes.com which has a PR 7.

John, we at TMCnet have also noticed Google grossly underreports page views and unique visitors. Like you say, they should be more forthcoming with their measurement methodology and, in addition, website owners need to realize that it makes more sense to trust numbers from companies that do not run an ad network.

I’m not sure this tells us anything with significance. The trending for GG ad network sites is all over the place, with a ratio from .3 to .8 – not enough to say with certainty Google always overrepresents their ad network.

Also, are we assuming Comscore is right? How about throwing Nielsen into the mix as well? Comscore seems to historically trend higher than Nielsen; it’d be interesting to see the comparison with Google vs. Nielsen data.

Counting numbers correctly is a critical issue because it represents the measurement of revenues for online publishers. It boggles my mind why we still have so many different results when we can measure so much, so easily.

We need one good measurement tool on which everybody agrees–why can’t we get that?

Where is the love for Quantcast? Out of all the different 3rd party tools available (Comscore, Nielsen, Google Ad Planner, etc.), they are the only ones that come even close to accurately reflecting our traffic as reported by our internal analytics (Comscore and Nielsen are about 50 percent of actual, Google is about 30 percent of actual).

This makes sense given that Quantcast is the only tool that combines publisher data along with panel data from over 1MM people.

I am not saying that our analytics program and Quantcast exactly match up. They do not. But at least they are a heck of a lot closer (within 10 percent). Furthermore, there is transparency. As a publisher, it is a lot easier to accept their numbers knowing that they are able to accurately track each visitor to our site, then use their tools to deliver a people count (which takes into account the cookie deletion issue).

Publishers should realize that a company like Quantcast that uses publisher along with panel data represents the best opportunity to have a 3rd party accurately reflect true Web site traffic. It is probably why CBS, Time, and many other sites have agreed to become Quantified.

The thing that strikes me about this whole discussion is that outside of the industry coming together to agree to open reporting standards similar to the ones that created the internet itself, #’s will always vary from one reporting service/application to another.

As Darren notes about using 3rd party services, unless a non-partisan body (maybe the IAB, maybe not, I don’t know) comes up with standards that everyone agrees to, online #’s will continue to sprawl wildly all over the place.

Recently (and to the point of this article) I’ve been using Google’s tool, Quantcast’s and in my career have used any number of others (the @plan’s of the world, Omniture, WebTrends) and not once have the #’s lined up between these services. Ever.

And ultimately, outside of some non-biased 3rd party holding everyone to the exact same standard, I don’t think they ever will.

“To be honest Google has long lost the unbiased attitude that they once demonstrated. This is not the first time they are pushing up something. They have pushed Google Checkout Merchants in the search results for a long time now.

A more recent example is pushing knol results on top of results from more established and higher ranking sites.”

Bilal, I have to debunk your assertions. We certainly do not give websites that use Google Checkout any kind of boost (or penalty) in our search results. I’ve already debunked the “knols inherit more trust and authority because they are on google.com” issue in the comments over here: http://sphinn.com/story/61219 . In fact, when I do the specific search that you mention for [buttermilk pancakes] I see allrecipes.com at #1 and #2, and a knol page doesn’t occur until result #6.

As a Google software engineer for 8+ years, I can assure you that all of the claims in your comment are incorrect.

at this moment, searchengineland.com comes up #1 for miserable failure (“google” is still the most frequently occurring text/string on the SERP [well, with 19 occurrences it’s right behind “the”, which has 20 occurrences]):

Nate, if Quantcast data are within 10% of your site’s server log data then I really have to question the accuracy of their data. Cookie deletion alone will cause a site’s server data to over-state actual visitors by far more than 10%.

Stu — this issue of cookie deletion is what ComScore uses to explain discrepancies between a publisher’s server log data and their data. And based on their study, there is no doubt that it exists and has the ability to widely inflate a site’s internal numbers.

But the type of site and demographic will play a huge role in the actual impact of cookie deletion. I doubt there are many serial cookie deleters in the 50+ crowd compared to the 20-29 year old crowd.

As a publisher, it is probably the single most frustrating part in dealing with ComScore and Nielsen. They do not have transparency. Instead, they hide behind the cookie deletion argument.

I think the other point that ComScore highlighted in their study is that panel-based measurement is needed. They are right, because without it you cannot know if that one person is actually being counted 10 times.

But it has to go beyond that. That is why I applaud Quantcast for taking both panel data and publisher data to determine their numbers.

Furthermore, they are completely transparent. As a publisher (or advertiser), you are able to see both the visitors based on cookies and those based on actual people (once the cookie data has been stripped out).

This allows everyone to compare apples to apples with visitors and then see the impact of cookies.

Is it accurate? — With a panel of over 1MM and 10MM directly measured sites, I have a lot more faith in them than a panel-only service. Furthermore, they have actual site data. It’s like what is going to be more accurate – counting heads that come into a stadium or determining that headcount based on cars in the parking lot then applying some mathematical formula.

Plus, if you look at some Quantified sites and see the impact of their cookie deletion, it makes some inherent sense. Sites where you would expect to have higher return visitors (dating sites, news sites) show a lower ratio of visitors to cookies compared to sites that generate more traffic from search. eHarmony has a visitor-to-cookie ratio of .65 while answers.com has a ratio of .87.

Google could do the same thing by having publishers that use Google Analytics opt in to allow this data to be used. At the end of the day, publishers (and advertisers) just want accuracy.

Why not conduct a study comparing the accuracy of ComScore, Nielsen, Quantcast, and Google? ComScore has opened this up by trying to discredit Google’s Ad Planner. And as a start, why doesn’t ComScore provide the same data for Nielsen and Quantcast that it provided for Google Ad Planner?

Of course, none of this would be a big deal, except for the small fact that millions of dollars hinge on the accuracy of this data.

Nate, I think there are two problems with the approach of tagging sites a la Quantcast, the biggest of which is that there is absolutely no way that the majority of sites are ever going to participate. Sites’ concerns about confidentiality guarantee that will be the case. Plus, as in the case of Google not being willing to include their OWN sites’ data in Google Ad Planner (now, that’s a hoot, isn’t it!), a site that’s also a public company will fear that cooperating with Quantcast could be viewed by the SEC as “providing guidance”. So, we’re left with a situation where some sites will be willing to be “quantified” and many will not. Which, in turn, means that Quantcast’s data will be biased toward sites that are cooperating with them. What this all means is that, because they need an unbiased set of data, there isn’t a prayer that ad agencies will use Quantcast. Advertisers want unbiased data used to make decisions as to where to spend their money and will insist on their ad agencies doing so.

While I’m on the topic of Quantcast, allow me to address your comment that they’re transparent. I don’t agree. First, Quantcast acknowledges they use a panel. But they refuse to provide any detail on the source of that panel. Transparent? I think not. Then, they slam panels as not being able to measure smaller sites — while at the same time saying they use panel data to adjust all sites’ server data for cookie deletion. WHAT??

Finally, let me address your point that serial cookie deleters don’t exist in the 50+ crowd like they do in the 20-29 crowd. That may well be the case, but to think that cookie deletion among 50+ people only leads to a 10% inflation in site server logs is, I think, unrealistic. I’m in the 50+ crowd and I regularly delete my cookies. And I know many 50+ people who do the same thing. Some of them even set their browsers to do it every time they log on. We 50+ year olds aren’t as technically illiterate as you would like to believe. And, let me point out that it’s not just the percent of people deleting cookies in a month that inflates a site server’s count of unique visitors. It’s also how often they delete their cookies. The more frequently a cookie is deleted, the more often a single person will be counted as multiple different visitors. If I recall correctly, the ComScore study showed that 30% of all Internet users deleted their cookies in a month and did so on average 4 times. That can create a 150% inflation. Makes it hard to believe that your site’s logs are only inflated by the 10% you claim.
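
The arithmetic behind that kind of estimate can be sketched in a few lines. This is a minimal illustration, not ComScore's methodology: the function name `cookie_inflation` is hypothetical, and so is the assumption that someone who deletes cookies 4 times in a month returns to the site after every deletion and so presents 5 distinct cookies. Under those assumptions, the 30 percent and 4-times figures recalled above yield an overcount of roughly 120 percent, the same order of magnitude as the 150 percent cited; the exact value depends on how often deleters actually come back.

```python
def cookie_inflation(deleter_share: float, cookies_per_deleter: float) -> float:
    """Fractional overcount of unique visitors when cookies are counted as people.

    deleter_share: fraction of real visitors who delete cookies during the month.
    cookies_per_deleter: distinct cookies each deleter presents to the site in
        that month (assumed here: deletions + 1, if they return every time).
    """
    if not 0.0 <= deleter_share <= 1.0:
        raise ValueError("deleter_share must be between 0 and 1")
    if cookies_per_deleter < 1.0:
        raise ValueError("cookies_per_deleter must be at least 1")
    # Non-deleters show up as one cookie each; deleters show up as several.
    cookies_per_person = (1.0 - deleter_share) + deleter_share * cookies_per_deleter
    return cookies_per_person - 1.0


# 30% of visitors delete cookies 4 times in the month -> 5 cookies each (assumed).
inflation = cookie_inflation(0.30, 5)
print(f"Cookie-based uniques overstate people by about {inflation:.0%}")
```

Plugging in a smaller share of deleters or fewer return visits shrinks the overcount quickly, which is why the type of site and its audience matter so much in this argument.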

Azdirici:
There’s a problem with your logic. At the end of the day, all advertisers certainly want accuracy. Unfortunately, however, some publishers want the highest audience numbers they can justify by any means possible.

Re — “So, we’re left with a situation where some sites will be willing to be “quantified” and many will not. Which, in turn, means that Quantcast’s data will be biased toward sites that are cooperating with them.”

This is the issue. If a site is not “quantified”, is there a negative bias toward that site? Or because they use both panel and publisher data, is their entire dataset more accurate? Both are valid arguments for which you or I do not have an answer.

It seems though that based on the data that ComScore was able to provide for this article, either ComScore or Quantcast could provide similar data. If I were Quantcast, this would be one of those pieces of data I would want to publish because it would go a long way to building trust that they are an independent, trusted source of information.

Re site logs only inflated by 10 percent – Maybe a little clarification will help. This 10 percent is when comparing Unique Visitors by Cookies (Quantcast) to Unique Visitors by Cookies (per our Analytics program). Quantcast’s number differs by 10 percent (on the low side). A second Quantcast number is provided, which is People. This number is about 80 percent of Quantcast’s UV number by cookies. In other words, according to Quantcast, about 20 percent of our Unique Visitors are actually the same people. If we compared this People number to our own UV numbers, the actual People (per Quantcast) would be about 75 percent of our UV reported numbers.
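
The chain of ratios in this comment can be checked with a few lines. A rough sketch only: the names below are made up for illustration, and the 0.90 and 0.80 factors are simply the approximate percentages quoted above, so the result is no more precise than those round numbers.

```python
# Approximate ratios quoted in the comment (rounded figures, not measured data).
QUANTCAST_COOKIES_VS_OWN_UV = 0.90  # Quantcast cookie UVs run about 10% below in-house analytics
PEOPLE_VS_QUANTCAST_COOKIES = 0.80  # Quantcast "People" is about 80% of its cookie UVs


def people_share_of_own_uv(cookie_ratio: float, people_ratio: float) -> float:
    """Chain the two ratios to express People as a fraction of in-house UVs."""
    return cookie_ratio * people_ratio


share = people_share_of_own_uv(QUANTCAST_COOKIES_VS_OWN_UV, PEOPLE_VS_QUANTCAST_COOKIES)
print(f"People is roughly {share:.0%} of in-house unique visitors")
```

With these rounded inputs the product comes to about 72 percent, which is in line with the "about 75 percent" figure given in the comment.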

This is all very transparent for the publisher. We can compare at least some of our data to theirs. Can we perform the same or similar process with ComScore or Nielsen data? The answer is no. From the publisher standpoint, this makes it a lot harder to trust the validity of their numbers, especially when there is such a big difference in numbers and because we fall into this new, fast-growing, smaller site category that panels have been criticized in the past for underestimating.

Nate, I think you’ve answered your question yourself. If, as you’ve repeatedly claimed, you really believe that tagging a site significantly increases Quantcast’s people UV estimates, then it follows that a site that doesn’t permit tagging will be significantly undercounted by Quantcast relative to tagged sites. This is a killer for an ad agency because if the agency then uses Quantcast data to make decisions about where to place advertising they will be led WRONGLY to the tagged sites that have relatively higher UVs. The agencies’ clients will then scream blue murder. That’s why we never see any agency bring us Quantcast data. Quantcast might be a player in the Little Leagues but they are nowhere in the majors.

On a related note, if Google Ad Planner is biased towards AdSense, what about their search results? What if I had statistical evidence that Google search results are more likely to run AdSense ads than the ones from live.com? Search is a little more important to our digital lives than Ad Planner and faces much less competition.

Stu — Just to clarify, here is my belief: Quantcast data should be more accurate because they use both panel and direct publisher data in order to arrive at their people UV numbers. By using both direct data from over 10 million publishers and indirect data from their panel of 1+MM, their dataset is larger and more diverse, thereby leading to more accurate numbers.

Your belief as I understand it is: Quantcast data is less accurate than panel-only services because tagging a site significantly increases Quantcast’s people UV estimates; a site that doesn’t permit tagging will be significantly undercounted by Quantcast relative to tagged sites.

The point in my last post is that both of these are valid arguments. And without someone (being Quantcast, ComScore, IAB, etc) providing data to support either of these positions, you will have people on both sides of this argument.

Nate:
Your claim that Quantcast has tagged 10 million publishers is not true. If you go to Quantcast’s own site, they conveniently list the sites that have — or have not — agreed to implement Quantcast’s tagging. There are nowhere near 10 million “quantified publishers” listed. Not even 1 million. Not even 100,000.

Let me summarize some other relevant numbers for you that you can verify from Quantcast’s site. Of the 1000 largest sites in the U.S. that accept advertising, fully 80% DO NOT ACCEPT Quantcast’s tagging. Here are some of the more important sites that DO NOT COOPERATE with Quantcast: Google, Yahoo, MSN, AOL, Myspace, mapquest, amazon, facebook, craigslist, information.com, go.com, blogger, paypal, CNN, all.com, imb, flickr, geocities, yellowpages, whitepages, weather.com, nextag, classmates, nytimes, webmd, shopzilla …. need I continue through the other 800 sites in the largest 1000 that DO NOT cooperate with Quantcast? You should get the point already.

I think it’s particularly telling that the largest site that DOES carry Quantcast tags is wordpress while the fifth largest site that DOES carry Quantcast tags is … wait for it .. YOUPORN.com! That’s one heck of a negative reflection on the quality of Quantcast’s program, don’t you think?

Look, Nate, it’s crystal clear that Quantcast’s tagging program is not endorsed nor used by the vast, vast majority of important sites that carry advertising and who account for virtually all online ad dollars. So, all you get if you use Quantcast is a mishmash of data that is unquestionably biased toward the smaller and less relevant sites on the Internet. Ad agencies are simply not going to use such biased numbers.

My recommendation to you if you are a publisher is to buy comScore or Nielsen data. It will cost you a few bucks more than the free Quantcast numbers you use today. But, at least you’ll get something that advertisers and their agencies are using to place ad dollars today.

Just to rub salt in your wound, Nate, for making such outlandish claims, here’s a link to a page on Quantcast’s own site where they acknowledge they only have 35K publishers using their tags: http://www.quantcast.com/user/signup

John – Interesting post. Being an interactive media planner of 5 years, I have also been disappointed in Google’s Ad Planner. However, my reasoning is a bit different than yours.

None of the tools (ComScore, Nielsen, Quantcast, etc.) are going to be 95-100% accurate on web traffic for each property out there. In fact, you can almost count on any one of them being off at some point.

As a media planner, I’m more interested in being able to slice and dice the data (i.e. defining a target audience) that will get me to the most appropriate sites for my online marketing campaign. Google just doesn’t add up in this area. Unless they make significant improvements, ComScore/@plan will still be the best tools for online media planning.
