Cohort Analytics

Cohort analysis is a broad topic, and it has many variations. Let's first give some basic definitions:

Cohort - a group of people or events that share a common characteristic within a given period of time.

Cohort analysis - the study of the activity or behavior of a particular cohort, or of a group of cohorts, over time (or over some other sequence of iterations).

In pseudocode, a cohort analysis looks like this: 'take a dataset that contains uniquely identified entities (customers, in this case), define the cohorts by which the entities can be grouped (in this case we will use the year and month of first purchase), and follow the behavior of the cohorts over time (in this case the sum of the OrderValue for each cohort per timeslice)'.
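
The description above can be sketched on a tiny, made-up dataset. This is only an illustration under stated assumptions: the data frame `orders_toy` and its values are hypothetical, the column names CustomerID, OrderDate and OrderValue are assumed to match the real view, and base R is used so the sketch runs without any packages. It is not necessarily the approach taken in the rest of this walkthrough.

```r
# Hypothetical toy data (not from the real SQL Server view)
orders_toy <- data.frame(
  CustomerID = c(1, 1, 2, 2, 3, 3),
  OrderDate  = as.Date(c("2015-01-10", "2015-02-15",
                         "2015-01-20", "2015-03-05",
                         "2015-02-01", "2015-03-12")),
  OrderValue = c(100, 50, 80, 40, 60, 30)
)

# Cohort = year and month of each customer's first purchase.
# ave() returns the per-customer minimum date for every row.
first_purchase <- ave(as.numeric(orders_toy$OrderDate),
                      orders_toy$CustomerID, FUN = min)
orders_toy$Cohort <- format(as.Date(first_purchase, origin = "1970-01-01"),
                            "%Y-%m")

# Timeslice = year and month of the order itself
orders_toy$OrderMonth <- format(orders_toy$OrderDate, "%Y-%m")

# Sum of OrderValue for each cohort per timeslice
cohort_totals <- aggregate(OrderValue ~ Cohort + OrderMonth,
                           data = orders_toy, FUN = sum)
cohort_totals <- cohort_totals[order(cohort_totals$Cohort,
                                     cohort_totals$OrderMonth), ]
print(cohort_totals)
```

Each row of `cohort_totals` is one cohort in one month, so reading down a cohort's rows shows how its spending changes as it moves away from the first purchase month.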

It may seem a bit complex at first sight; however, it is straightforward in practice. In this case we are asking the following questions:

'How do our users behave as they get further away from their first purchase date?'

'Do they keep buying as much as they did in the first month of their activity, or does their spending taper off as time passes?'

'Is there a pattern when we compare the first purchase months of different cohorts?'

Let's load the libraries first:

library(dplyr)

library(reshape2)

library(ggplot2)

library(RODBC)

Then we will connect to SQL Server and load the data from the view we created into memory:

cn

orders

Then we have to convert and format the OrderDate variable:

orders$OrderDate
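
As a rough illustration of this kind of conversion, the sketch below parses date strings with base R's `as.Date` and derives a year-month label with `format`. The vector `raw_dates` and the "%Y-%m-%d" pattern are assumptions made for the example; the actual format of OrderDate depends on what the SQL Server view returns.

```r
# Hypothetical raw values; the real OrderDate format may differ
raw_dates <- c("2015-01-10", "2015-02-15")

# Parse the strings into Date objects (format string is an assumption)
order_dates <- as.Date(raw_dates, format = "%Y-%m-%d")

# Derive a year-month label, useful later for grouping by month
order_month <- format(order_dates, "%Y-%m")

print(order_dates)
print(order_month)
```

If the dates arrive in a different layout, only the `format` argument of `as.Date` would need to change.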

After this, we will create a new dataset in memory. This dataset will have additional features that we’ll need for our graphical representation later on: