Kissmetrics Blog


Learning More About That Other Half: The Case for Cohort Analysis and Multi-Touch Attribution Analysis

Anyone who has ever worked in marketing or advertising has heard the quote, “Half the money I spend on advertising is wasted; the trouble is I don’t know which half.” It is from John Wanamaker and dates back to the 19th century.

Fortunately, the industry has come a long way since then, and especially in the last 10 to 20 years, new technologies have made advertising more measurable than ever. However, there’s still a considerable gap between what people could measure and what they actually are measuring, and that leads to significant under-optimization of advertising and marketing dollars.

In B2B SaaS, there are two techniques that I feel are particularly important but not used widely enough – cohort analysis and multi-touch attribution analysis. In this post, I’ll try to provide a brief introduction to both methodologies and explain why I think they are so important.

1. Cohort Analysis

A Quick Primer

If you’re new to the topic, a cohort analysis can be broadly defined as a dissection of the activities of a group of people (such as customers), who share a common characteristic, over time. In SaaS businesses, the most frequently used common characteristic for grouping customers is “join date”; that is, people who signed up or became paying customers in the same period of time (such as a month).

Let’s look at an example, and it will become much clearer:

In this cohort analysis, each row represents all signups that converted to become paying customers in a given month. Each column represents a month in your customer’s life. The cells show the percentage of retained customers of the respective cohort in the respective “lifetime month.”
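To make the structure of such a table concrete, here is a minimal Python sketch of how it could be computed. The input format (a list of `(start_month, churn_month)` pairs with months as integer indexes), the function name `retention_table`, and the sample data are all assumptions made for illustration, not numbers from a real business.

```python
def retention_table(customers, last_month):
    """Build a cohort retention table.

    customers:  list of (start_month, churn_month) pairs, where months are
                integer calendar-month indexes and churn_month is None for
                customers who are still paying (hypothetical input format).
    last_month: index of the last calendar month with data.
    Returns {start_month: [% retained in lifetime month 0, 1, 2, ...]}.
    """
    cohorts = {}
    for start, churn in customers:
        cohorts.setdefault(start, []).append(churn)

    table = {}
    for start in sorted(cohorts):
        churn_months = cohorts[start]
        row = []
        # A cohort can only be observed up to the last month with data,
        # so newer cohorts get shorter rows (the typical triangle shape).
        for lifetime_month in range(last_month - start + 1):
            calendar_month = start + lifetime_month
            retained = sum(
                1 for churn in churn_months
                if churn is None or churn > calendar_month
            )
            row.append(round(100.0 * retained / len(churn_months), 1))
        table[start] = row
    return table


# Made-up sample: four customers converted in month 0, two in month 1, one in month 2.
customers = [(0, None), (0, 2), (0, 1), (0, None), (1, None), (1, 2), (2, None)]
table = retention_table(customers, last_month=2)
for cohort, row in table.items():
    print(f"cohort {cohort}: {row}")
```

Each printed row corresponds to one cohort and each position in the row to one lifetime month, which is the same layout as described above.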

So What?

Why is it so important to do a cohort analysis when looking at usage metrics or retention and churn? The answer is that if you look at only the overall numbers, such as your overall churn in a calendar month, the number will be a blend of the churn rate of older and newer customers, which can lead to erroneous conclusions.

For example, let’s consider a SaaS business with very high churn in the first few lifetime months and much lower churn from older customers, which isn’t unusual in SaaS. If the company starts to grow faster, the blended churn rate will go up, simply because the percentage of newer customers out of all customers will grow. So, if they look at only the blended churn rate, they might start to panic. They would have to do a cohort analysis to see what’s really going on.
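The effect is easy to reproduce with a few lines of arithmetic. The following Python sketch assumes, purely for illustration, that newer customers churn at 10% per month and older customers at 2% per month; the function name and the customer counts are hypothetical.

```python
def blended_churn(new_customers, old_customers, new_rate=0.10, old_rate=0.02):
    """Monthly churn across all customers, given per-segment churn rates.

    The 10% and 2% monthly rates are illustrative assumptions.
    """
    churned = new_customers * new_rate + old_customers * old_rate
    return churned / (new_customers + old_customers)


# Same per-segment churn in both scenarios; only the customer mix changes.
slow_growth = blended_churn(new_customers=100, old_customers=900)
fast_growth = blended_churn(new_customers=400, old_customers=900)
print(f"slow growth: {slow_growth:.1%}, fast growth: {fast_growth:.1%}")
```

Nothing about the product or the customers has become worse in the second scenario. The blended number rises only because a larger share of the customer base is still in its high-churn early months.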

What else can you see in a cohort analysis? Whatever the key metrics are in your particular business, a cohort analysis lets you see how those metrics develop over the customer lifetime as well as over what might be called product lifetime:

If you read the chart above horizontally, you can see how your retention develops over the customer lifetime, presumably something that you can link to the quality of your product, operations, and customer support. Reading it vertically shows you the retention at a given lifetime month for different customer cohorts. This might be called product lifetime, and especially if you look at early lifetime months, it can be linked to the quality of your onboarding experience and the performance of your customer success team.

The Holy Grail of SaaS!

Maybe most importantly, a cohort analysis is the best way to estimate CLT (customer lifetime) and CLTV (customer lifetime value), which informs your decision on how much you can spend to acquire a new customer. As mentioned above, churn usually isn’t distributed evenly over the customer lifetime, so calculating CLT based on the blended churn rate of the last month doesn’t give you the best estimate. A better way is shown in the second tab of this spreadsheet, where I calculated/estimated the CLT of different cohorts.
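One common way to turn a cohort's retention curve into a lifetime estimate is to add up the retained fractions month by month and extrapolate the unobserved tail. The Python sketch below illustrates that idea. It is not necessarily the method used in the spreadsheet, and the function name, the constant-tail-churn assumption, and the sample numbers are all hypothetical.

```python
def estimate_clt(retention):
    """Estimate customer lifetime in months from one cohort's retention curve.

    retention[k] is the fraction of the cohort still paying in lifetime
    month k, with retention[0] == 1.0. Observed months are summed, and the
    unobserved tail is extrapolated by assuming that the most recent
    month-over-month churn rate stays constant (an assumption of this sketch).
    """
    if len(retention) < 2:
        raise ValueError("need at least two lifetime months")
    observed = sum(retention)
    tail_churn = 1.0 - retention[-1] / retention[-2]
    if tail_churn <= 0:
        raise ValueError("cannot extrapolate without churn in the last month")
    # Geometric series: r*(1-c) + r*(1-c)^2 + ... = r * (1 - c) / c
    tail = retention[-1] * (1.0 - tail_churn) / tail_churn
    return observed + tail


# Illustrative cohort: 20% churn in the first month, 10% per month afterwards.
cohort_retention = [1.0, 0.8, 0.72]
cohort_clt = estimate_clt(cohort_retention)

# Naive alternative: treat the first-month churn rate as if it applied forever.
naive_clt = 1.0 / 0.20
print(f"cohort-based: {cohort_clt:.1f} months, naive: {naive_clt:.1f} months")
```

With these made-up numbers, the single-rate shortcut gives 5 months while the cohort-based estimate is about 9 months, which shows how much a churn rate dominated by early lifetime months can understate CLT.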

A cohort analysis is even more essential when it comes to CLTV. Looking at how revenues of customer cohorts develop over time lets you see the impact of churn, downgrades/contractions, and upgrades/expansions:

This chart shows a cohort analysis of MRR (monthly recurring revenue) of a fictional SaaS business. As you can see in the green cells, it’s a happy fictional SaaS business as it has recently started to enjoy negative churn, which many regard as the holy grail in SaaS.
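Negative churn can be read directly off a cohort's MRR series. Here is a small Python sketch with made-up MRR figures; the function name `net_mrr_churn` is hypothetical.

```python
def net_mrr_churn(mrr_by_lifetime_month):
    """Month-over-month net MRR churn for one cohort.

    Positive values mean the cohort's MRR shrank; negative values mean that
    upgrades and expansions outweighed cancellations and downgrades.
    """
    return [
        (previous - current) / previous
        for previous, current in zip(mrr_by_lifetime_month, mrr_by_lifetime_month[1:])
    ]


# Made-up MRR (in dollars) of one cohort over its first four lifetime months.
cohort_mrr = [10000, 9500, 9690, 10000]
rates = net_mrr_churn(cohort_mrr)
print([f"{rate:+.1%}" for rate in rates])
```

In this example the cohort loses 5% of its MRR in the first step and then grows again, so the later values are negative even though some customers may still be cancelling.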

Still not convinced that you need cohort analyses to understand your SaaS business? :-) Let me know in the comments.

2. Multi-touch Attribution Analysis

Giving Some Credit to the “Assist”

Multi-touch attribution, as defined in this good and detailed post, is “the process of understanding and assigning credit to marketing channels that eventually lead to conversions. An attribution model is a set of rules that determine how credit for conversions should be attributed to various touch points in conversion paths.”

It’s easier than it sounds, and since this is the year of the World Cup, let me explain it using a soccer analogy. Multi-touch attribution gives credit for a goal not only to the scorer but also to the players who prepared the goal. Soccer player statistics often calculate scores based on the goals and the assists of the players. That means the statistics are based on what could be called a double-touch analysis that takes into account the last touch and the touch before the last one.

Since the default model in marketing still seems to be “last touch” only, it looks like soccer has overtaken marketing in terms of analytical sophistication. :-)

Time for Marketing to Strike Back!

If you are evaluating the performance of a marketing campaign solely based on the number of conversions, you are missing a large piece of the picture. Like a great midfielder who doesn’t score many goals himself but prepares goals for the strikers, a marketing channel might not be delivering many conversions but could be playing an important role in initiating the conversion process or assisting in the eventual conversion.

This is especially true for B2B SaaS where sales cycles are much longer than in, say, consumer e-commerce. When you’re selling a SaaS solution to a business customer, it’s not unusual for there to be several touch points before a company becomes a qualified lead, and then many more before the lead becomes a paying customer. The process could easily look like this:

A piece of content that you produced comes up as an organic search result and the searcher clicks on it

A few days later, the person who looked at the content piece sees a retargeting ad

A few days later, she sees another retargeting ad, visits your website, and signs up for your newsletter

A week after that, she clicks on a link in your newsletter

A few days later, she receives an invitation to a webinar, signs up for it, and attends the webinar

After the webinar, she signs up for a trial

The next day, one of your customer advocates gives her a call

Close to the end of her trial, your lead does some more research, happens to click on one of your AdWords ads, and signs up for a paid subscription

If you look at this conversion path, it becomes clear that if you attribute the customer to only the first touch point (SEO) or to the last one (PPC), you’ll draw incorrect conclusions. And keep in mind that the example above is still quite simple. In reality, the number of marketing channels and touch points that contribute to a conversion can be much higher.
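The problem with single-touch models shows up immediately when the path above is written down as data. In this Python sketch, the channel labels and the function name `single_touch` are shorthand chosen for illustration.

```python
# Touch points from the example path, in order. The channel labels are
# shorthand chosen for this sketch.
path = ["seo", "retargeting", "retargeting", "newsletter", "webinar", "sales_call", "ppc"]


def single_touch(path, model):
    """Give 100% of the conversion credit to the first or the last touch."""
    if model not in ("first", "last"):
        raise ValueError("model must be 'first' or 'last'")
    credit = {channel: 0.0 for channel in path}
    winner = path[0] if model == "first" else path[-1]
    credit[winner] = 1.0
    return credit


first_touch = single_touch(path, "first")
last_touch = single_touch(path, "last")
ignored = [ch for ch in first_touch if first_touch[ch] == 0 and last_touch[ch] == 0]
print("channels with no credit under either model:", ignored)
```

Retargeting, the newsletter, the webinar, and the call all end up with zero credit regardless of which single-touch model is used.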

Data Integration in a Multi-device World

Maybe you use Google Analytics or Kissmetrics for Web analytics, Salesforce.com for CRM, and Zendesk for customer service. If you want to get a (more or less) complete picture of your user’s journey, you need to get and integrate the data from all of the major tools you’re using and track user interactions.

A big complicating factor here is that we now live in a “multi-device world.” It’s very possible that the person in the example conversion path above used a tablet device, a smartphone, and two different computers to access your content and visit your website. Since tracking cookies are tied to one device, there’s no simple way to know that all of these touch points belong to the same person, at least not until the person registers.
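A simplified Python sketch of the idea: touch points are recorded per cookie, and only cookies that have been seen with a login or registration can be merged into one person's path. The event format, the `logins` mapping, and the function name are assumptions for illustration and do not describe how any particular analytics tool works.

```python
def stitch_paths(events, logins):
    """Group touch points by person where the device is known, else by cookie.

    events: list of (timestamp, cookie_id, channel) tuples.
    logins: {cookie_id: user_id} for devices on which the person has
            registered or logged in (hypothetical input format).
    """
    paths = {}
    for _, cookie_id, channel in sorted(events):
        person = logins.get(cookie_id, "anon:" + cookie_id)
        paths.setdefault(person, []).append(channel)
    return paths


events = [
    (1, "tablet", "seo"),
    (2, "phone", "retargeting"),
    (3, "laptop", "ppc"),
]
# The person has logged in on the tablet and the laptop, but not on the phone.
logins = {"tablet": "user-1", "laptop": "user-1"}
paths = stitch_paths(events, logins)
print(paths)
```

The retargeting touch on the phone stays attached to an anonymous cookie, so it is missing from the person's conversion path until that device can be linked to the same account.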

Going deeper into the data integration and multi-device attribution problem would go beyond the scope of this post, but there’s a lot of valuable information available on the Web. And, please feel free to ask questions or share experiences in the comments section.

Toward a Better Attribution Model

The next question to tackle is how credit should be distributed to touch points in a conversion path. A simple approach is to use one of these rules:

Linear attribution – Each interaction gets equal credit

Time decay – More recent interactions get more credit than older ones

Position based – 40% credit goes to the first interaction, 40% to the last one, and 20% to the ones in the middle
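The three rules can be sketched in a few lines of Python. The function names, the 7-day half-life, the day offsets, and the choice to normalize time-decay weights so that credit sums to 100% are assumptions of this sketch, not a description of any specific tool.

```python
from collections import defaultdict


def linear(path):
    """Each interaction gets equal credit."""
    credit = defaultdict(float)
    for channel in path:
        credit[channel] += 1.0 / len(path)
    return dict(credit)


def time_decay(touches, half_life_days=7.0):
    """More recent interactions get more credit.

    touches is a list of (channel, days_before_conversion). Each weight
    halves every half_life_days, and weights are normalized so that the
    credit sums to 100% (both choices are assumptions of this sketch).
    """
    weights = [(ch, 0.5 ** (days / half_life_days)) for ch, days in touches]
    total = sum(w for _, w in weights)
    credit = defaultdict(float)
    for channel, weight in weights:
        credit[channel] += weight / total
    return dict(credit)


def position_based(path, first=0.4, last=0.4):
    """By default 40% to the first touch, 40% to the last, 20% over the middle."""
    n = len(path)
    if n == 1:
        return {path[0]: 1.0}
    credit = defaultdict(float)
    if n == 2:
        # No middle touches: split all credit between first and last.
        credit[path[0]] += first / (first + last)
        credit[path[1]] += last / (first + last)
        return dict(credit)
    middle_share = (1.0 - first - last) / (n - 2)
    credit[path[0]] += first
    credit[path[-1]] += last
    for channel in path[1:-1]:
        credit[channel] += middle_share
    return dict(credit)


# Example path; the day offsets are invented for illustration.
touches = [("seo", 30), ("retargeting", 26), ("retargeting", 22), ("newsletter", 15),
           ("webinar", 11), ("sales_call", 9), ("ppc", 0)]
path = [channel for channel, _ in touches]

for name, credit in [("linear", linear(path)),
                     ("time decay", time_decay(touches)),
                     ("position based", position_based(path))]:
    print(name, {ch: round(share, 2) for ch, share in credit.items()})
```

Credit is summed per channel, so a channel that appears twice in the path (retargeting here) collects the credit of both touches.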

While using one of these rules is a big improvement over a “first touch only” or “last touch only” model, the problem is that all of the rules are based on assumptions as opposed to real data. If you’re using “linear attribution,” you’re saying “I don’t know how much credit each touch point should get, so let’s give each one equal credit.” If you’re using “time decay” or “position based,” you’re making an assumption that some touch points are more valuable than others, but whether that assumption is true is not certain.

A more sophisticated approach is to use a tool like Convertro, which takes a look at all touch points of all users (including those who didn’t convert!) and then uses a statistical algorithm to distribute attribution credit. The advantage of this approach is that the model gets continuously adjusted based on new incoming data. Explaining exactly how it works, again, would go beyond the scope of this post, but there’s more information available on Convertro’s website, and I assume there are additional tools like this on the market.

Is It Worth It?

Implementing a sophisticated multi-touch attribution model is obviously a large project, and so the next question is whether it’s worth it. The answer depends mainly on these variables:

Product complexity and sales cycle – The more complex your product and the longer the sales cycle, the more likely you are to have several touch points before a conversion happens

Number of simultaneous campaigns and size of marketing budget – The more campaigns you’re running in parallel and the more you’re spending on marketing, the more important it is to account for multi-touch attribution

While cohort analysis is something you should do as soon as you launch your product, I think multi-touch attribution analysis can usually wait until you’re spending larger amounts of money on advertising. Until then, spending too much money or time getting your attribution model right probably is not the best use of your resources. So, as an early-stage SaaS startup, don’t worry too much about it just yet. Just remember to take your single-touch attribution CACs with a grain of salt.

Most people just hope the math works out in their favor at the end of every month. But business is math + psychology, and those of us who would like to focus only on the psychology part (selling and marketing) are lucky to have resources like yours that help us keep our eyes on the other half of the equation, so that our marketing and selling can be as effective as possible.

I’m very much looking forward to multi-touch attribution reaching the point where it is easily implemented, as I know that strategic marketing campaigns are so much better than just doing random acts of marketing.

Both concepts are very important. This was a great article! Thanks for posting it.

As you mention, an algorithmic solution is ideal, but even a rules-based or linear model gives you better information (imho) than last/first click. However, B2B sales typically have longer, more complex sales cycles, which typically result in a lower volume of results with higher price tags (than an e-commerce site).

Would you have any insight on what the minimum size a data set needs to be to generate a reliable algorithmic model? Could a potential solution be to push the “conversion” further up the funnel and attempt to tie a dollar value to it based off of LTV & funnel stage conversion rates?

In the world of marketing automation, Marketo does a decent job of enabling marketers to determine multi-touch attribution. Marketo works backwards starting with named individuals attached to sales opportunities, then looking back to all the programs that touched those individuals prior to the point in time when the opportunity was created.

But even with a simple methodology like this, I see many marketers struggle to get reliable MT analytics for these simple but critical reasons:

1. Not setting systems up right in order to capture MT data properly and continuously
2. Everyone in marketing not following the same definitions for channels and tactics, and/or not giving leads credit for marketing successes some or all of the time
3. Sales not adding contacts to sales opportunities. Without marketing leads to trace back to marketing programs, it is impossible to determine MT – or any marketing attribution
4. Not adding all contacts involved in a deal to an opportunity when there’s more than one
5. Creating new (duplicate) contacts instead of searching for existing leads or contacts that have a long trail of marketing associated with them already – that will never get the credit they deserve when newly created duplicate contacts take their rightful place at the attribution table.

It amazes me how often this happens. Investors and executives demand to know marketing’s contribution, but too many aren’t ready or willing to address the underlying upstream data quality issues that need to be fixed to get to good attribution.

Tom, thanks for these great in-depth insights. I think the reasons you outlined are all very valid and accurate. It’s tough to get the right mix when you are tracking large data sets. We look forward to hearing more from you :)

Thanks – this is a great case for cohort and multi-touch attribution analysis.

To address the B2B case though, you have to move away from a single “conversion” event to a series of multiple “progressions” through the funnel. So you have visitors becoming leads, and then leads becoming qualified, and so on until the sale. There are cohort metrics for each of the step-conversions – and the most sophisticated of the marketers know how to build systems to track and analyze all of this information – and find it a very worthwhile exercise because they begin to understand what’s happening to each prospect at each stage of their journey to become a customer.

The reason I clicked through was that I’m currently reading The Lean Startup, which I highly recommend, and I know, Neil, that you do too. Slightly off topic: one way to make the post better from a user’s point of view would be to add a few anchor text links, as there is some terminology I don’t know. Wikipedia does a good job of this. Sorry it’s a little off topic, but I thought it might be good feedback. Great post overall, and certainly good metrics to keep track of instead of vanity metrics.

Great article! I understand that cohort analysis is a great practice for SaaS; however, would it be applicable to services like Uber and Airbnb, and if so, how? Or would you recommend a different kind of analysis? Would love to hear your thoughts.

Jeroen, thank you for your comment. IMO cohort analysis is essential for any type of business which spends money on acquiring users and monetizes those users over a longer period of time – which is true for almost all online businesses.

I totally agree, SaaS companies (not just B2B) need multi-touch attribution. Bizible.com tracked all the marketing touches for nearly 500,000 leads and found big differences between first and last touch attribution for SaaS companies. One highlight is that 18% of SaaS first touches came from referral (highest of any industry) and only 36% from paid search (lowest of any industry). You can see the highlights of the results in MarketingLand at http://marketingland.com/first-touch-attribution-search-tops-lead-generation-social-shortens-cycles-77622

I notice that for many Time Decay Attribution model examples, the sum of attribution is always 100%. I want to understand why.

Say the half-life is set to 1 day, and there is 1 conversion. For this conversion, there is an exposure 1 day prior to the conversion, so this exposure gets attribution of 50%. But there is also an exposure 2 days prior, so that exposure should get attribution of 25%. This adds up to 75%.

The way I see it, decay of each exposure is independent of one another. There is no guarantee the attribution total will be 100%. The only way I can see it being 100% is if you always attribute the remaining attribution to the direct conversion. So, in my example, direct conversion gets 25%.

So if the time between conversion and exposure is really long, then attribution for that exposure should be small, and therefore direct conversion gets a larger fraction of the total attribution.
