When it comes to data, your website log files or JavaScript tag data are not the only source of information. There are a number of resources on the web that are available to us for free, resources that can add to what you already have or, in some cases, fill in holes in your data strategy.

This post is a collection of a few of these resources, with specific examples of reports you can run and actions you can take for your company to gain a competitive advantage (however small or big).

Two important points first:

1) The overall goal of this post is to stretch our minds and hearts beyond traditional sources of data (click-stream and Google Analytics or Omniture or WebTrends or HBX etc) and beyond what we think of in our day to day jobs. There is more to web analytics than that, as we all already agree.

2) None of these data sources is perfect. Each has its own unique limitations that you should take time to understand and internalize. Yet every one is a very useful resource that can add value to your decision making arsenal (and give you data and insight in areas where we probably have nothing).

At the very minimum these are examples of insightful analysis you can / should do even if you don’t use these sources. In this post we will cover:

Demographic Prediction

Search Funnels

Keyword Forecast

Traffic

Keyword “expansion” (create a long tail, arbitrage)

What: Demographic Prediction.

Where: Microsoft adCenter Labs. (Data mined from MSN users.)

Why: Few website owners have any awareness of the demographic nature of their visitors and yet knowing if your visitors are from Mars or Venus could be a major influencing factor in your website design / experience. If you don’t have access to HitWise to mine it for demographic information then the MSN adCenter Lab is the next best place. Here is the view for Occam’s Razor….

Action You Can Take: The more popular your website the higher the likelihood that the above data is accurate, but as with all things web data YMMV. I am both surprised and pleased at the above distribution. I would have expected the Male population to be higher simply because of what I see in the industry and at conferences. I can take this data and, for example, try to improve my writing style and presentation so that they appeal to both genders (and you can do the same with your website).

Another example is that we often underestimate the impact of the “youth” population. Their habits and content consumption preferences are radically different from us “old” people. The more you skew young the more you need to ensure that the right mind-sets are in charge of your Product Marketing, Customer Experience and Customer Retention strategy.

What: Search Funnel.

Where: Microsoft adCenter Labs. (Data mined from MSN users.)

Why: A recent study indicated that 80% of current Internet traffic started browsing at a search engine. The result, besides the $500 that each share of Google stock costs, is that we are endlessly fascinated by keywords driving traffic to our site and conversion rates and all that good stuff. Increasingly though it is becoming harder to gain a competitive advantage with that analysis from data just in our clickstream tools.

It is highly likely that 10 to 15 keywords drive 80% of your traffic (unless you run Amazon or eBay etc, in which case you do have a really long tail). In this case one of the most basic things you can try to understand is what people look for before they search for your top key phrases, and what they look for after, in order to glean the customer intent behind various keywords……
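To check how concentrated your own keyword traffic is, you can rank keywords by visits and count how many it takes to reach 80% of the total. The following is a minimal sketch, assuming you can export a keyword-to-visits table from your analytics tool; the function name `keywords_covering` and the visit counts are hypothetical and only for illustration.

```python
def keywords_covering(visits_by_keyword, share=0.80):
    """Return the top keywords, by visits, needed to reach `share` of all visits."""
    total = sum(visits_by_keyword.values())
    ranked = sorted(visits_by_keyword.items(), key=lambda kv: kv[1], reverse=True)
    covered, head = 0, []
    for keyword, visits in ranked:
        if covered >= share * total:
            break  # the keywords collected so far already reach the target share
        head.append(keyword)
        covered += visits
    return head


# Hypothetical visit counts, not real data.
sample = {
    "peachtree": 500,
    "peachtree accounting": 200,
    "accounting software": 120,
    "peachtree software": 80,
    "small business accounting": 50,
    "payroll software": 30,
    "invoice template": 20,
}
head = keywords_covering(sample)
print(len(head), "of", len(sample), "keywords reach 80% of visits:", head)
```

The short list this returns is the set of key phrases worth running through a search funnel first.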

Action You Can Take: In the Pre-Funnel there are insights about what is on people’s minds before they think of you. It outlines what your competitors are feeding you as well as what non-branded key phrases are most relevant (and a couple of surprises; I have no idea what pella is but Peachtree should look into it).

Ditto for the Post funnel. It makes sense that after people search for Peachtree they want to go back and search for QuickBooks (which is a competitive product to Peachtree). But it is surprising that they go back to look for: peachtree business products, peach tree accounting, peachtree software and peachtree accounting. If the results for peachtree (which is a branded term for Sane Solutions, hence easy to “capture” in terms of SERP listings) are optimized, then their web traffic should be landing on pages that give answers around all four of those keywords. Their traffic should not have to go back and search for that detail again.

As you can see above there are fascinating insights that are truly actionable (and I promise they will make you look and sound as smart as you are, because remember you can do this not only for your own keywords but also for your competitors, talk about kicking butt!). You can use other tools you might have access to in order to construct your search funnels, but it is recommended that you do so at least for your most important keywords.
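If you have your own ordered search data (for example internal site search logs), a rough pre / post funnel can be sketched in a few lines. This is not how adCenter Labs computes its funnels, which the post does not describe. It is only a minimal illustration, assuming each session is a list of queries in the order they were typed; the function name `search_funnel` and the sample sessions are hypothetical.

```python
from collections import Counter


def search_funnel(sessions, target):
    """Count the queries searched immediately before and after `target`."""
    pre, post = Counter(), Counter()
    for queries in sessions:
        for i, query in enumerate(queries):
            if query != target:
                continue
            if i > 0:
                pre[queries[i - 1]] += 1   # what they searched just before
            if i + 1 < len(queries):
                post[queries[i + 1]] += 1  # what they searched just after
    return pre, post


# Hypothetical sessions: each list is one visitor's searches, in order.
sessions = [
    ["quickbooks", "peachtree", "peachtree accounting"],
    ["accounting software", "peachtree", "quickbooks"],
    ["quickbooks", "peachtree", "peachtree software"],
    ["peachtree", "peachtree accounting"],
]
pre, post = search_funnel(sessions, "peachtree")
print("Pre-funnel: ", pre.most_common(3))
print("Post-funnel:", post.most_common(3))
```

Queries that show up often in the post-funnel are the details your landing pages for the target keyword are probably not answering.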

What: Keyword / Key Phrase Forecast.

Where: Microsoft adCenter Labs. (Data mined from MSN users.)

Why: Because it’s good for you! : ) Actually this one is brilliant, and it looks pretty to boot. Our companies usually have internal plans for Search Engine Marketing (Pay Per Click / PPC) campaigns around our product launches or selling seasons. What this wonderful tool allows you to do is get an outsider’s opinion of what the next few months look like for your Top 10 keywords and compare it to your competitors (don’t ya love that?).

Action You Can Take: (The screenshot was taken in early Oct.) It is to be expected that during holiday season sales of digital cameras will pick up. But I wonder if the folks running Nikon and Olympus web strategy know what Canon is doing so much better than them, and is predicted to do much better (slope of the line in the yellow area of the graph) than them during the holiday season. Could they use this data to adapt their strategy to do much better against Canon? Maybe come up with a more robust strategy around non-branded keywords to counter this strong brand keyword trend? Perhaps.

It is also interesting that Canon skews so much heavier in the under 18 and over 50 markets (an opportunity for more robust strategies around affiliate marketing, either to get more entrenched in those two age groups, say MySpace and AARP, or to diversify into age groups where they are not so big). It is perhaps distressing to Olympus that they skew so heavily Male, given that Women make most purchasing decisions (maybe Olympus is trying to differentiate itself from its competitors, or this is the reason for it being at the bottom in the top graph; either way, food for thought).

On a slightly interesting side note I wonder if the CIA / NSA can use this forecasted trend for some helpful purpose.

What: Traffic.

Where: Alexa. (Data mined from Alexa toolbar users.)

Why: I can see howls of protest at this recommendation. Some of them justified. So let us get this out of the way: Alexa collects data from several million users who have installed the Alexa toolbar in their browser. It has a bias towards Windows users who use IE, and Alexa rankings beyond 100,000 are not reliable. Read This to learn more, especially this. In spite of that, if you want to get traffic trends against your competitors while holding all the bias as equal for you and your competitors, then Alexa can be an acceptable resource. Especially if you do trends over time.

Action You Can Take: As outlined in the competitive analysis post (first, second) it is really easy for us to get into our silo with our internal company data. Is traffic on your website going up because you are doing all the right things, or is it going up because the overall web trend is going up? You can get good insight on this if you use Alexa. So ignore the number on the Y axis and look at the trend.

In the above trend the Google Analytics blog has much higher traffic (multiple times) than the Occam’s Razor blog, and you can spot the points when the GA blog has had major posts. It is interesting to see comparative trends between Eric’s and Robbin’s blogs as well. Ignoring the Y axis, and the number it lists, the trends of each website in comparison are insightful (and, if these were businesses, actionable).

You can punch in any URL you want and compare trends (not numbers) and all things being equal (no one is particularly trying to get you) you can find some good insights.
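One simple way to compare trends rather than numbers is to rescale every site's series so that it starts at 100, which throws away the unreliable absolute level and keeps only the shape. The sketch below is an illustration under assumptions: the helper `to_index` and the monthly readings are hypothetical and are not Alexa output.

```python
def to_index(series, base=100.0):
    """Rescale a series so its first value equals `base`, keeping only its shape."""
    first = series[0]
    if first == 0:
        raise ValueError("cannot index a series that starts at zero")
    return [base * value / first for value in series]


# Hypothetical monthly readings from a panel-based tool; the units do not matter.
your_site = [2.0, 2.2, 2.3, 2.4]
competitor = [10.0, 12.5, 16.0, 20.0]

your_trend = to_index(your_site)
competitor_trend = to_index(competitor)

# Growth over the window, in percent, read off the last index value.
your_growth = your_trend[-1] - 100.0
competitor_growth = competitor_trend[-1] - 100.0
print(f"you: {your_growth:+.0f}%  competitor: {competitor_growth:+.0f}%")
```

Indexing this way only makes sense for like-minded sites measured by the same tool over the same period, so that whatever bias the panel has applies to both series.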

Important: Just to repeat one last time: Please know that Alexa data is only reasonable if individual sites are ranked under 100k, the data skews Windows and IE users, ignore the actual number, compare “like-minded websites” to hold the bias neutral, and please only compare longer term (more than two month) trends.

What: Keyword Expansion.

Where: Google AdWords. (Data mined from Google AdWords.)

Why: With so much web traffic starting at a search engine it has become imperative to be very competitive on search. The fastest way many have adopted to compete is to spend money on Search Engine Marketing (SEM), or Pay Per Click (PPC), campaigns. Yet the competition is really hard out there, especially for the keywords that you can think of or get from your web analytics application. How do you expand your keyword list to find words / phrases you don’t know? How do you start to look for opportunities for arbitrage? Worry not, Uncle Google to the rescue with the unassumingly named Keyword Tool.

Action You Can Take: (Thanks to Marshall for pointing me to this tool.) You can go into the tool and type in a set of your most popular keywords. For no apparent reason I used: dell, laptops, dimension, xps. Once you hit Get More Keywords the tool gives me a list of 182 keywords that I could consider bidding on to build a robust portfolio of keywords. Not too shabby for five seconds of work, just try it with your top four or five or fifty keywords and you’ll be surprised (and you will get more praise from executives since you are not just reporting things from your analytics tool). But it gets better.

I think I am the last person who has “discovered” this tool but I found it fascinating that it would tell me, at least in a crude way, how much competitive bidding I can expect on each key phrase and, awesomely, how much “inventory” there is for the key phrase (so we are not all descending on it like piranhas, and of course paying lots). Now you have not only expanded your keyword list but also have some intelligence on where the golden opportunities might be, such as….

You can find opportunities for arbitrage where there is not really a lot of competition (say XPS Motherboard) but lots of searchers. There are lots of these in the list. There are no hard numbers in either column but at least Google has gotten you started on a path with some guidance (rather than you going in blind).
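Once you have the competition and search volume columns in numeric form, the arbitrage hunt reduces to a filter: low competition, high volume. The following is a minimal sketch, assuming both columns have been scaled to a 0 to 1 range. The field names, the thresholds and the sample rows are all hypothetical, not actual Keyword Tool output.

```python
def arbitrage_candidates(rows, max_competition=0.3, min_volume=0.6):
    """Keep keywords with low advertiser competition but high search volume."""
    picks = [
        row for row in rows
        if row["competition"] <= max_competition and row["volume"] >= min_volume
    ]
    # Most-searched first; ties broken by the least competition.
    return sorted(picks, key=lambda r: (-r["volume"], r["competition"]))


# Hypothetical scores on a 0-1 scale, for illustration only.
rows = [
    {"keyword": "dell laptops", "competition": 0.95, "volume": 0.90},
    {"keyword": "xps motherboard", "competition": 0.10, "volume": 0.70},
    {"keyword": "dimension memory", "competition": 0.25, "volume": 0.65},
    {"keyword": "dell xps stickers", "competition": 0.05, "volume": 0.10},
]
for row in arbitrage_candidates(rows):
    print(row["keyword"], row["competition"], row["volume"])
```

The thresholds are judgment calls. Loosen or tighten them depending on how long a candidate list you are willing to review by hand.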

Of course Google benefits from you spending money. They would have gotten your “unintelligent $$$” anyway, but this way they get your “intelligent, hence more, $$$” over time. Plus even if you don’t want to use AdWords you can still use the tool and bid some place else after you have expanded your long tail.

In summary: Web Analytics is bigger and broader than just the data we have for our websites. By expanding our scope to other outside data sources we can understand our existence in this universe better and then react to it. (Two interesting earlier posts on Competitive Intelligence: Why should you and What should you do.)

Quite interesting. I have been spending some time myself in what I call "ecosystem analysis" (or did I pick it up from someone else?). I totally agree that we can not just look at the "node" (i.e. the site itself, which is covered with behavioral and attitudinal analyses, and testing/optimization of all sorts), but at its "environment" as well (i.e. competitive pressure, reputation, blogosphere, SE ranking fluctuations, etc.).

From the analytical point of view, there is still a LOT of work to be done in putting everything together, but I suspect that our field will witness a major paradigmatic leap in the coming 24 months. Our understanding of "what's going on" will be so tremendous that Web Analytics will be synonymous with interactive marketing. Managing a web site without mastering WA just won't make any sense (already doesn't, if you ask me ;-) ).

Google Trends – another tool that also allows you to segregate results by geography (see Michael's comment above).

It's useful to know that when you download the data you show for the Google AdWords tool, the bar graphs are converted to numeric values, which is a great way to quickly find what you did with XPS Motherboard.

Hi Avinash! Do you know the difference between "general distribution" and "predicted distribution" on the demographic prediction tool? I've read the MSN explanation and I didn't understand. Thanks for your help!
Regards
Pablo

Hi Avinash; This post has the only mention of SERP on your blog… interesting! Left your book back at the office… would love to know what you think about SERP – and gee, do you use any particular tool for tracking?

David: I'm afraid predictive analytics and forecasting are a little like trying to find God. The allure is irresistible, and you have to be very, very specific about what you are looking for because the options are numerous.

:)

But on a serious note, it is really that complicated. Both of those areas are vast and depending on what you are trying to do there might be tools available or you might have to build from scratch with a bunch of statisticians and scientists.

Trackbacks

Avinash Kaushik asked me for a list of free tools I like a couple of months back for a blog post on Occam's Razor and I pretty much gave him the list I published at Big Green Blog recently. The post went……

[…] Shamelessly stolen from Avinash Kaushik's post on Five Free Advanced Web Analytics Tools, Microsoft AdCenter Labs has a predictive index for site demographics culled from their userbase. Enter in a URL for a report. […]

[…] If you like web analytics and playing around with data, you'll want to read this post from Avinash Kaushik on Five Free "Advanced" Web Analytic Tools. They are also handy tools for those doing search engine optimization or PPC arbitrage. […]

[…]
It’s also interesting to look at the demographic predictions for particular search terms. This could be quite useful if you are running a pay-per-click ad campaign and want to target your ad copy appropriately.

# Comparative Traffic Analysis: Any web analytics tool will help you get a good feeling for how your traffic is growing. It's easy to get excited when your traffic jumps up. When your traffic goes up 20% you are doing great, right? Not necessarily. What if your competitor's traffic doubled in the same time period? Point is, you want to see if you are growing faster than competition.

Then enter your competitors' sites in the "Compare Sites" boxes under the chart, and look at it again. You can now monitor the performance of your site's traffic against that of your competitors, and can easily see if you are growing faster than they are, or vice versa. In fact, look at the competitor who is growing the fastest and spend your time studying their strategy. It may give you some ideas.

Bear in mind that the Alexa tool does NOT provide a good measure of your traffic, because the sample size of its audience is too small, and its audience is a bit skewed. But using Alexa to compare sites in the same niche makes great sense, because the skewing of the audience will affect all such sites in the same way, and the audience is large enough for that.

# Competitor Search Term Analysis: Hitwise and AdGooroo are two great tools that allow you to see what search terms are driving traffic to your competitor's site(s). Are you a newcomer in a competitive space? It sure would be great to know where the leaders in your space are getting their traffic.

Or maybe you are the leader, but a strong challenger has emerged. What keywords are driving their challenge to your leadership? It would be great to know that too. Of course, these tools cost money, so they are not for everybody. But consider them seriously if you are able to afford the additional expense. The return on that investment is likely to come to you quickly.

[…] However, I still use Alexa as a tool. It was a blog post by Avinash Kaushik that taught me how to still use it as an effective tool. Quite simply, use the Alexa feature that shows comparative traffic levels to compare your site's traffic to that of your competitors. Because your competitors are in the same business as you are, the bias problem is no longer a factor to worry about (because the bias will affect all the compared sites equally). For most businesses this will provide a quick way to compare the relative web site traffic levels in their industry. So the accuracy problems are real, but there is still a way to use the tool to extract useful information. […]

[…]
However, when it’s in the top 10000, it’s safe to say that even if it isn’t as accurate as Google Analytics, the site is still wildly popular. Alexa’s top 1000 list seems accurate, doesn’t it?

This Article pretty much lays it all out:

Important: Just to repeat one last time: Please know that Alexa data is only reasonable if individual sites are ranked under 100k, the data skews Windows and IE users, ignore the actual number, compare “like-minded websites” to hold the bias neutral, and please only compare longer term (more than two month) trends.

I have yet to claim that Alexa is accurate when compared with a more complete rating system like PageRank or Analytics, nor have I claimed that Alexa’s numbers are even to the same scale as you’d find if we could view the real stats. It’s a fact, however, that Alexa doesn’t lie and that Doomsday didn’t pad the numbers so well that Alexa thinks he’s in the top 10,000 when he’s not. Those numbers are real… they just don’t represent the whole of the online community. Just a sampling of them.
[…]