
For media sites, few issues are as important and complex as paywall analysis. Where to put a paywall, how to structure it, and how to limit its impact on advertising, SEO, and customer experience are all critical issues. A few posts back, I wrote about a set of techniques (using plug-ins or eVar counters) to measure advertising risk with a registration or paywall barrier. Today, I wanted to cover some of the techniques you can use to understand what happens when a visitor actually encounters a wall.

At the most basic level, it’s essential that you be able to measure when you presented a “barrier” to the user. Even if the barrier message is a popup or is embedded in an article template, it still needs to be measured explicitly and independently. For most sites, this means setting a “page view” or, at the very least, an “event” when the barrier is encountered.
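If you’re an Omniture shop, the tagging side of this is minimal. Here’s a rough sketch – it assumes the standard s object from your s_code.js is loaded, and the page name and event number are purely illustrative:

```javascript
// Fire a dedicated page view whenever the paywall barrier is shown.
// Assumes the standard SiteCatalyst "s" object is already loaded; the
// page name and event number below are illustrative, not prescribed.
function trackPaywallBarrier() {
  s.pageName = "paywall:barrier"; // the barrier gets its own page name
  s.events = "event5";            // a custom "barrier shown" event
  s.t();                          // send the tracking call
}
```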

But setting a page view or event isn’t going to answer some of the most important questions about what happens when a barrier is encountered. Questions like:

1. What percentage of people register when encountering the wall?
2. Do people register after encountering the wall multiple times?
3. Is some content much better at driving through the wall?
4. What’s the subsequent behavior of people who encounter the wall and turn away?

Of these, only the first question is relatively easy to answer – at least in its most straightforward incarnation. If you know how often you presented the barrier and you know how many registrations you got, you have a de facto conversion rate.

One potential stumbling block is that many sites have a common registration confirmation page or process – you might register without ever seeing a paywall barrier. If that’s the case, then you’ll need to measure only the visitors who come from the paywall.

There is a great variety of techniques for getting at this type of number in a system like Omniture. You can use pathing to get at the direct number, or you can set an eVar or campaign (it’s basically the same thing) and track it to the conversion event to get at an indirect number. You can also use visitor- and/or visit-based segmentation in the Data Warehouse or Discover to get at visit-indirect and visitor-indirect counts.
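For the eVar/campaign version, the key is to mark registrations that originate at the wall. A minimal sketch – the icmp query parameter and eVar number are hypothetical, so use whatever your paywall’s “Register” link actually appends:

```javascript
// On the registration page, credit the paywall if the visitor arrived
// via the barrier's "Register" link. The "icmp" parameter name and the
// eVar number are hypothetical.
function getQueryParam(name) {
  var match = new RegExp("[?&]" + name + "=([^&]*)").exec(location.search);
  return match ? decodeURIComponent(match[1]) : "";
}

var regSource = getQueryParam("icmp"); // e.g. "paywall-barrier"
if (regSource) {
  s.eVar8 = regSource;                 // internal campaign eVar
}
```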

It’s not so easy, however, to answer the second question. If you set a page and eVar for the paywall barrier and an event for registration, you’ll be able to track instance, visit, and visitor counts for both the barrier and the completion. Unfortunately, that doesn’t really tell you anything about success rates based on how many times the visitor has encountered a paywall.

To get at a number like that, you have two choices and both hearken back to my discussion of measuring paywall impact on advertising: you can use an eVar counter or you can use a cookie.

If you set an eVar counter every time the visitor encounters the wall, you can track the state of the counter on registration conversion. This gives you exactly the information you need to answer the second question. In fact, you can even get fancy and use multiple eVar counters with different attribution periods. For most sites, I’d suggest using a visit-based attribution and a time-based attribution that tracks over a seven-day period. With these two counters, you’ll be able to track whether multiple instances of hitting a barrier in a session or multiple instances over a discrete period of time are more effective in driving wall conversions.
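In SiteCatalyst, the tagging for this is trivial; the expiration settings live in the admin console, not in code. A sketch, assuming two counter eVars (which accumulate when passed “+1”) – the eVar numbers are illustrative:

```javascript
// Increment two counter eVars each time the barrier fires. Counter
// eVars accumulate when passed "+1"; their expirations (visit-based
// vs. seven-day) are configured in the admin console, not here.
s.eVar20 = "+1"; // visit-scoped encounter count
s.eVar21 = "+1"; // seven-day encounter count
```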

If you’re not an Omniture shop, you can duplicate an eVar by setting a JavaScript cookie with a counter you increment each time the barrier page loads. In the tag, extract the counter value and pass it as a variable. In fact, you can be extra-clever and pass the counter value appended to a string like “paywall-barrier” as a campaign or internal campaign value.
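Here’s roughly what that looks like – the cookie and variable names are illustrative:

```javascript
// Increment a first-party cookie each time the barrier page loads and
// pass the count, appended to a label, as the campaign value.
function readCookie(name) {
  var m = document.cookie.match(new RegExp("(?:^|; )" + name + "=([^;]*)"));
  return m ? m[1] : null;
}

var pwCount = (parseInt(readCookie("pw_count"), 10) || 0) + 1;
document.cookie =
  "pw_count=" + pwCount + "; path=/; max-age=" + 60 * 60 * 24 * 7;

s.campaign = "paywall-barrier-" + pwCount; // e.g. "paywall-barrier-3"
```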

If you do this, you get a campaign with the latest number set each time the barrier is encountered. If you set the campaign attribution to last, you’ll be able to track the number of times each visitor encountered the paywall before converting. One really nice thing about this method is that you’ll get all the reporting that comes with setting a campaign – usually the richest set of reports for most Web analytics tools.

This brings me to question #3 – how to find the content that is most effective in driving visitors through the wall.

In Omniture, this is a fairly straightforward process. The easiest method is to set one or more eVars with content descriptions (typically you’ll set the actual article/content id and some type of content category or sub-category) when the paywall is loaded. By comparing instance values on the barrier page to the eVar states on registration, you can measure which specific pieces of content and which content categories are most effective in driving visitors through the paywall. If you sub-relate the content eVar to the counter eVar or use an eVar that combines the two, you can also measure the effectiveness of content by number of paywall encounters.
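The tagging sketch, with hypothetical values standing in for whatever your CMS exposes:

```javascript
// When the barrier loads, record what the visitor was trying to read.
// The content values would come from your CMS; eVar numbers are
// illustrative, and pwCount stands in for the encounter count from the
// earlier cookie sketch.
var articleId = "a-1234";           // hypothetical content id
var contentCategory = "sports/nfl"; // hypothetical category
var pwCount = 3;                    // placeholder for the cookie-based count

s.eVar22 = articleId;
s.eVar23 = contentCategory;
s.eVar24 = contentCategory + "|" + pwCount; // combined: content x encounters
```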

#4 (what happens after visitors refuse the wall) is one of the more challenging questions to deal with in a Web analytics tool. It’s always hard to separate out before-and-after sequences in tools that deal almost exclusively in aggregates. You can try using pathing, but it’s insanely complex for most sites and always limited to the analysis of the current session.

Out of the box, Data Warehouse isn’t going to help much either. You can segment visitors who hit the paywall and didn’t register, but you have no way to tell what part of their behavior came before the paywall and what part came after they refused the barrier.

One approach that we’ve used for this type of problem is actually quite generic and can help with all sorts of analysis projects; it works by setting a time-stamp and/or a page depth on every single page event. You can do this with any Web analytics system using a cookie (it can also be done with VISTA rules in Omniture).
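A sketch of the cookie-based version – the prop numbers and cookie name are illustrative:

```javascript
// Stamp every page view with a visit page-depth counter and a Unix
// timestamp so that server-call-level detail can be pulled from the
// warehouse later. A session cookie (no max-age) resets each visit.
function readCookie(name) {
  var m = document.cookie.match(new RegExp("(?:^|; )" + name + "=([^;]*)"));
  return m ? m[1] : null;
}

var pageDepth = (parseInt(readCookie("pg_depth"), 10) || 0) + 1;
document.cookie = "pg_depth=" + pageDepth + "; path=/";

s.prop15 = String(pageDepth);                     // 1, 2, 3, ...
s.prop16 = String(Math.floor(Date.now() / 1000)); // timestamp in seconds
```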

It may seem redundant to add page depth and time-stamp to variables since they are already captured by the Web analytics tool. It isn’t truly redundant because, although they are captured, most Web analytics tools give you very limited access to either variable in reporting or segmentation.

By placing timestamp and/or page depth (e.g. 1, 2, 3, etc.) into variables, you give yourself the ability to pull much more detailed information (essentially server-call-level data) from the data warehouse. This gives you the capability to pull the raw data and then analyze post-barrier behavior in a tool like SQL Server or Access.

That analysis is still going to be a fair amount of work, however. There are some other techniques you can use specific to your paywall that can make analysis quite a bit simpler.

If, for example, you’ve used a cookie to count how often the wall is presented, you can use that cookie to create a prop variable that contains the page name whenever somebody views a page AFTER seeing a paywall. Setting up your system this way is cool for several reasons. If you’re using Omniture, you can turn on pathing in this (prop) variable – allowing you to track the actual paths post-paywall (including pathing in subsequent visits) as well as the aggregate page counts. If the prop is populated on Entry, you know it’s a subsequent session. You can even append the value of the paywall counter to the page name stored in the prop – allowing you to track pathing separately for each paywall encounter. Naturally, you’ll want to reset the cookie value to zero after you get a Registration Thank You event so that you don’t mix logged-in page views into the prop variable, but it’s a good idea to leave the actual Thank You page in the prop and the path so that you always know which paths ended in success.
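Pulling the pieces together, the page-code sketch looks something like this (names, again, are illustrative):

```javascript
// Mirror the page name into a pathing-enabled prop once the visitor
// has seen the wall, tagging it with the encounter count. Reset the
// counter on the Thank You page -- but only AFTER recording it, so
// successful paths keep their endpoint.
function readCookie(name) {
  var m = document.cookie.match(new RegExp("(?:^|; )" + name + "=([^;]*)"));
  return m ? m[1] : null;
}

var encounters = parseInt(readCookie("pw_count"), 10) || 0;

if (encounters > 0) {
  s.prop20 = s.pageName + "|wall-" + encounters; // e.g. "article:foo|wall-2"
}

if (s.pageName === "registration:thankyou") {
  document.cookie = "pw_count=0; path=/"; // stop mixing in logged-in views
}
```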

Put these techniques together, and you'll have a nice picture of what happens when visitors encounter your wall. You'll know how many visitors register and how many don't. You'll know whether or not visitors are more inclined to register after multiple wall encounters, and you'll know what they do when they bail. You'll have a real picture of what impact the wall is having both in driving registration and in changing the user experience.

Not every measurement issue justifies the type of tag tuning that I’ve suggested here, but for a content-based site, a pay or registration wall is a complex and critical site issue; it more than justifies and will amply repay the effort to measure properly.

[I hope everyone enjoyed their turkey and football! A last reminder, too, that I have a webinar sponsored by Webtrends this Wednesday on Database Marketing and Web analytics based on my whitepaper. If you’re interested, you can register here or just drop me a line for the whitepaper.]

This is going to be a shorter than usual blog – not because we are coming up on Thanksgiving, but because I’ve been on the road this weekend. I’ve been visiting Southern California looking at X Change venues and have seen some really lovely properties.

I think everyone enjoyed the resort feel of Monterey, dinner at a venue like the aquarium, and, of course, the feel of being surrounded by the ocean. As special as the Ritz and St. Regis in San Francisco were, the Monterey location was just better.

So I’m looking for a resort type property again – including a possible return to Monterey. Some of the top properties on my list are down in Southern California, however. This weekend I’ve visited Terranea in Palos Verdes (Los Angeles), The Ritz Carlton and the St. Regis in Dana Point, and the Grand Del Mar north of San Diego. Tough work.

I believe all four made the Conde Nast Top 100 Resorts in the United States list and it’s not hard to see why. They are each spectacular properties – and any one of them would make a fine venue. In fact, it’s going to be hard to sort out which one to choose.

Terranea features a relaxed ambience, a new facility, and an incredible setting jutting out on the very tip of the Palos Verdes Peninsula, with water views that provide both sunsets AND sunrises over the ocean.

The Ritz Carlton at Dana Point is perched over the ocean with trails down to the beach and rooms (including meeting rooms) that open up to the Pacific Ocean. The service is all Ritz Carlton but it’s a sunnier, more relaxed property than you’d expect. This weekend there was a surfing contest on and the hotel was lined with hand-painted surfboards from professional surfers being auctioned off for charity.

Not more than a stone’s throw away is the St. Regis. Not quite on the ocean but perched higher on the hill, it has all the elegance and sophistication that I loved in the St. Regis in SF, along with gorgeous outdoor venues, the obligatory jaw-dropping pools, and some of the nicest rooms I’ve yet encountered.

As I write this, I’m fresh from an after-dinner swim in the lush family pool of the Grand Del Mar north of San Diego. It’s colder than usual after a big weekend storm and the pools were literally steaming into the night air. No, it’s not on the ocean but the setting is pristine, the rooms lovely and spacious, the service as polished, professional and friendly as any I have yet encountered.

How to choose? I have no idea. So if you are an X Change veteran and have thoughts or preferences on venue, let me know. I am open to persuasion, but I know that no matter where we end up next year we have the opportunity to have another wonderful event.

I’ll return next week with the second part of my post on Paywall analytics. Until then, Happy Thanksgiving!

The gradual evolution of Web analytics into database marketing has been much on my mind lately. Or perhaps it would be more accurate to say that the evolution of database marketing analytics into Web analytics has been much on my mind.

When I co-founded Semphonic more than a decade ago, I came from the credit card database marketing industry, where I specialized in building behavioral targeting models based on card transaction behavior. I thought it would be more of the same - lots and lots of data, all of it behavioral. That similarity turned out to be, largely, an illusion. The segmentation and targeting techniques we brought over from the credit card industry didn’t work nearly as well with anonymous page-view data from the Web. In retrospect, that hardly seems surprising, but we were as shockingly naïve about the Web as everyone else.

We had to re-invent for ourselves what Web analytics meant, and we came up with some of the same answers as everyone else and some that turned out to be fairly unique and interesting.

In the last few years, however, we’ve seen a gradual re-introduction of database marketing concepts into Web analytics and, perhaps every bit as important, the gradual introduction of online behavioral data into traditional database marketing analysis.

The forces driving this overlap are several:

A deeper understanding of online behavioral data and better access to it from Web analytics data feeds.

The increasing reliance on digital marketing mechanisms that can be driven from that data.

The dramatically increased affordability of data warehousing systems that can handle the volume of online data.

Perhaps most important is the demand to do something REAL with all this data. I don’t despair of classic Web analytics in this regard, but there is little doubt that bringing online behavior into the realm of database marketing is a likelier path to success for many.

Put these together, and you have a powerful tide pushing analysts on both sides of the fence to use Web data for database marketing. I can see that in our business, where we have a fair number of projects going on that cross the traditional boundaries of Web analytics, BI, and database marketing.

I can see it, too, in the new technology coming from vendors in our space. Webtrends, for example, just introduced the latest rev of their Data Mart product, and it’s a significant step forward in the evolution of the product and in advancing the relationship between Web analytics and Database Marketing.

The Visitor Data Mart from Webtrends is designed to provide true database marketing capabilities around Web analytics data. It provides a nice visual segmentation builder that is very much part of the “new look” Webtrends. It’s slick, easy, and a pleasure to drive. The Mart is a lot more than just filter-based segmentation, though. It’s designed to be the center of a true online database marketing system – providing automation of key tasks around online data models, list development, integration with existing marts, testing, and production.

Until now, however, it has lacked some fairly important capabilities for a database marketer. One of the most important – and in some ways most inexplicable – deficiencies was the inability to segment based on any form of visitor scoring.

I say inexplicable because Webtrends has had, for three years, a very nice product designed to do just that. I’ve always been a fan of Score – as you can see in this blog post from the dawn of Web analytics time. But living on its own, it never seems to have gotten much traction.

In their November release, the Webtrends folks made the obvious move and put Segments and Score together in the Visitor Data Mart. Knowing our focus at Semphonic and my fondness for Score, they asked if I could put together a whitepaper on why this particular integration is a powerful solution to some of the challenges around database marketing with online data.

I finished it early this month and the whitepaper is now out – along with the new release of the Visitor Data Mart. You can get the whitepaper here. It’s a pretty good introduction to the topic and I think if you have an interest in database marketing and online, you’ll find it interesting both to learn more about Score and Segments and also for the topic itself.

I’m also going to be doing a webinar on the topic – and it will be very focused on the nature of online database marketing. It’s free, of course, and you can sign up for it now!

I’m hoping it will prove to be a nice little pre-Christmas stocking stuffer for anyone who’s thinking about or trying to move from Web analytics to database marketing or from traditional offline database marketing to online targeting!

One of the most enjoyable and interesting Huddles at X Change was the session I attended on Predictive Analytics. It was a chance to hear and talk with some of the companies that are really pushing the boundaries of Web analytics, often with advanced data warehouses, sophisticated data mining tools and a small army of dedicated analysts. I’m all for that, of course, but it can be frustrating, even discouraging, to listen to if you’re at a company that simply isn’t blessed with that level of resource or commitment. Is that what it takes to do Predictive Analytics?

In some cases, the answer simply is yes. Most advanced analytics techniques will require a new tool set and a data feed from your Web analytics tool. If your online data volumes are large, that necessarily puts you into the realm of advanced warehousing and systems like Quantivo, Netezza or Aster. It is a big commitment.

Not every predictive analytics application is quite so demanding, however. One of the techniques we discussed in that session – the use of predictive modeling to understand how industry and market forces are shaping your site traffic and performance - is well within the reach of any organization regardless of Web analytics tool or size of the organization.

Trends in your industry and in the broader economy will inevitably have a significant impact on your business. Was your traffic up 5% this quarter? That’s great – unless your industry’s traffic was up 10%. Was your traffic down 5% during the Great Recession? Is that good or bad? Is your site conversion performance down because people simply have less disposable income?

Questions like these are vital, particularly in turbulent economic times. To know whether your actual performance was good or bad, you have to understand what your expected performance was. That expected performance is typically a function of past site history, current marketing data, and econometric or external data.

I’m using econometric data as a catch-all for every sort of exogenous data. For one of our clients (a traffic alerts site), weather turned out to be a critical external factor! For many businesses, key external factors include high-level economic indicators, market movements, industry trends and even competitive marketing spend data.

To build this type of model, you need to collect likely econometric and external data, past campaign data, and site history for key metrics like traffic, campaign-sourced visitors, conversion rates and revenue per visitor.

What type of econometric or external data is appropriate? There’s no one good answer.

If you’re in the housing industry, you’re likely to look at measures like housing starts, interest rates, new and existing home sales and median prices, economic leading indicators, consumer confidence measures, stock indices, REIT indices, etc.

For almost any company, it’s worth looking at at least one or two broad econometric variables like consumer confidence, hiring plans, or stock market indices. It’s essential to include your own marketing spend data – since nothing will have as direct or dramatic an impact on your site performance as marketing spend. You’ll typically want to include that spend data either as dollars/day or in terms of the core measurement for the channel (GRPs, impressions, etc.). Either method works. If you can get (or estimate) competitive spend data, all the better.

Getting all this data isn’t easy. The best sources are those with publicly available data updated for any time point. Stock market indices are a good example. You can get stock market prices for any date and for any date range, making it much easier to integrate the data with site metrics. Research numbers tend to be published less frequently – monthly or quarterly. The more often the numbers are published, the better from the standpoint of the analysis.

Many sites only have a couple of years’ worth of reliable data. If you are trying to correlate quarterly data, you may only have 8 data points – not enough to work with.

Which brings up another interesting point about this type of analysis – the incorporation of time is essential. Some econometric or external factors will correlate in real-time with your site data. The case I mentioned above - weather for a traffic site - is a good example.

Many external factors function more as leading indicators of performance. Unless you’re a brokerage (and based on our studies, not even then), you wouldn’t expect stock market performance to correlate in real-time with your key site metrics. Short-term jumps and dips in the market are just noise when it comes to the broader economy. Your time periods have to be long enough to have a reasonable chance of impacting the model.

Which, as you can see, leads to one of those frustrating analytic balancing acts: your time periods need to be long enough to be significant but short enough to generate sufficient data points for analysis. It’s a trade-off for which there is no one right answer.

Incorporating time doesn’t just mean picking the right unit (days/weeks/months/etc.). If the external factors are leading indicators, they may not correlate at all with site metrics from matching time periods. Many basic analysis techniques simply line up data points and look for relationships at a set of fixed periods of time. That won’t work well here. You’ll want to analyze each of the factors as a potential leading indicator across multiple time periods.
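To make that concrete, here’s a bare-bones sketch of the lag analysis – plain Pearson correlation between your site metric and an external series shifted back 0, 1, 2… periods. It assumes two aligned, equal-length series of (say) weekly observations:

```javascript
// Test an external factor as a leading indicator: correlate site visits
// against the external series shifted back by 0..maxLag periods and see
// which lag fits best.
function pearson(x, y) {
  var n = x.length;
  var mx = 0, my = 0;
  for (var i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
  mx /= n; my /= n;
  var num = 0, dx = 0, dy = 0;
  for (var j = 0; j < n; j++) {
    num += (x[j] - mx) * (y[j] - my);
    dx += (x[j] - mx) * (x[j] - mx);
    dy += (y[j] - my) * (y[j] - my);
  }
  return num / Math.sqrt(dx * dy);
}

function laggedCorrelations(external, visits, maxLag) {
  var results = [];
  for (var lag = 0; lag <= maxLag; lag++) {
    // the external value at period t is paired with visits at t + lag
    var x = external.slice(0, external.length - lag);
    var y = visits.slice(lag);
    results.push({ lag: lag, r: pearson(x, y) });
  }
  return results;
}

// e.g. laggedCorrelations(consumerConfidence, weeklyVisits, 8)
```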

This does take a little bit of work, but the good news is that it doesn’t require any huge investment. The real beauty of this analysis is that the number of data points is inherently quite small – even if you’re tracking at the daily level for 5 years, you’re only dealing with a few thousand rows of data. You can pull site metric data into an Excel spreadsheet or small flat file, do the joins to external data on a PC, and conduct the whole analysis on a laptop computer.

About the only thing you’ll need is a good statistical analysis package (you could use Excel but I’d recommend something a little richer). SAS and SPSS are a little bulky but both run on the PC and can easily handle this type of work. You might also consider something like JMP or Statistica.

At Semphonic, we’ve done more of this type of work for clients in the last few years than in the past. That reflects a couple of factors: the growing sophistication and maturity of the market, improvements in thinking about Executive Dashboarding and management reporting, and, of course, a difficult economic climate that often has effects on site performance that are too obvious to ignore.

Of all the types of predictive analytics we discussed at X Change, this type of project is the easiest to tackle. It doesn’t require a data warehouse or a Web analytics data feed. It doesn’t take expensive tools or a large team. It should be of interest to almost any organization, and it can significantly improve the quality of your management reporting and your analysis of site and marketing-spend success.

I had such a busy week that I didn’t get a chance to reply to several interesting comments about the Paywall blog and the Customer Support blog. In this post, I’m going to address the Paywall comments.

The question from Carson was this: “Would a path length report cover most of these paywall reporting issues? In Discover I can add metrics to this report, filter by content type, which gives me a pretty good idea about how PV caps might impact revenue.”

It’s a pretty good question and I’d say that short of the solutions I recommended, path length reporting using Discover is about the best approach you have to Paywall analysis. That being said, it isn’t all that great an approach in many situations.

The main problem with path length reporting is that it’s inherently a visit-based analysis. You can see how many visits exceed X pages of a given type, but you can’t figure out how many visitors exceeded X pages over a given period. So even though you can use path-length to get a good sense of how many pages are at risk in a given session, you can’t really figure out how many pages are at risk over some period of time. If you are considering a registration or pay wall on a visit basis, path length analysis in Discover should work okay. If your registration or pay wall is time-based, however, then path length simply won’t work right.

Tom Betts made this comment: “This is a gap for most web analytics tools and yet another example of where looking at single averages isn't so useful. Your earlier post on using SQL-based solutions for web analytics is particularly relevant here too. If the data is structured correctly it's trivial to count the number of users viewing 1,2,3…pages.”

This is absolutely right. Paywall analysis of this sort is trivial in a data mart – even if the design isn’t all that great. The most common (and worst) method of loading Web analytics data into a data warehouse or mart is to simply load the event-level data pretty much as it’s received from Omniture, Webtrends, Coremetrics, or Unica and leave it at that. That huge event-level table isn’t really a good data model for lots of Web analytics tasks, but it happens to work well enough for this kind of analysis. Provided you can spin through the data pretty quickly, it’s trivial, as Tom points out, to get a good Paywall report.
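For illustration, here’s the aggregation Tom describes sketched in JavaScript rather than SQL – one row per page view, with an assumed { visitorId, contentType } row shape:

```javascript
// Given one row per page view, count how many visitors viewed
// 1, 2, 3... pages of a given content type.
function pagesPerVisitorHistogram(rows, contentType) {
  // pages viewed per visitor, restricted to the content type of interest
  var perVisitor = {};
  rows.forEach(function (row) {
    if (row.contentType === contentType) {
      perVisitor[row.visitorId] = (perVisitor[row.visitorId] || 0) + 1;
    }
  });

  // number of visitors at each page count
  var histogram = {};
  Object.keys(perVisitor).forEach(function (id) {
    var n = perVisitor[id];
    histogram[n] = (histogram[n] || 0) + 1;
  });
  return histogram; // e.g. { 1: 5200, 2: 1800, 3: 640, ... }
}
```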