Use Cases

Recommendation engine: the holy grail of one-to-one cross-sell and up-sell marketing

Part 1. What and why of recommendation engines?

In layman's terms, a recommendation engine is an engineering solution that models (and recommends) the preferences (or next subscription / purchase) of a customer (or subscriber, visitor etc.) based on various factors (aka dimensions), including product popularity, product attributes and customer preferences, either in isolation or in combination.

With the rise of new-age retail formats / ecommerce platforms (digital marketplaces, travel sites, catalog vendors, online media content providers) and ever-increasing digital adoption, customers have a multitude of choices at their fingertips. Therefore, providing appropriate recommendations (for continuous customer engagement) is no longer a ‘nice to have’ feature in digital customer lifecycle management, but something that has to be offered. Fundamentally, the idea of personalization stems from the same principles as recommendation engines, and technically the same sets of data points / algorithms are required to achieve a personalized experience.

Recommendation / personalization, in addition to achieving the obvious objective of an ‘engaged customer’, also has the fundamental commercial objective(s) of acquiring customers and up-selling and cross-selling products. 35% of Amazon’s sales and 75% of Netflix’s viewership are attributed to their commercial focus on recommendation engines. (Source: McKinsey report)

Therefore, for the past 4 - 5 years recommendation systems have been among the top commercial priorities for data scientists. And with the fast-paced development of AI-driven application areas and the interest of 3rd party data aggregators, this space has seen a lot of research and clearly measurable outcomes.

In this 3 part series dedicated to recommendation engines, I will share my experience as a data scientist (without giving away proprietary information around data and / or algorithms :) ), from the simplest forms of recommendation mechanisms to more sophisticated, AI-driven adaptive systems. I will also share modelling logic (Python, R, SQL) or references (from the available papers). As I continue to devote time to researching this subject area, I will update my findings in due course.

Type 1 : Bestseller recommendation

As the name suggests, this is a blanket recommendation based on the popularity of a particular product (in the searched category), driven purely by (max) sales volume. It is the easiest and computationally cheapest approach to model, but it comes at the price of limited upside potential. Rather than a random recommendation (across categories), it is always better to recommend something popular (aka most subscribed to / sold) within the category. This is a highly product-driven approach that completely disregards the objectives of one-to-one marketing. As an example: the most watched movie appearing on a content provider's platform (regardless of genre, viewer rating, cast and other feature sets).

Some of my practical observations around this approach are -

+ Popularity may not be derived from maximum transactions / sales at all, but from pure product launch hype (e.g. an iPhone launch).

+ For unknown / first-time visitors / newbies (cold start), this is the only available option to (hopefully) catch their interest and on-board them into the downstream sales cycle.

+ Doesn’t need machine learning, just some simple SQL-equivalent logic that picks the product (SKU) with the most transactions within a category.
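The "simple SQL equivalent logic" from the last observation can be sketched in a few lines of Python. This is only an illustration: the rows, the product names and the choice of grouping by category are all assumptions made for the example, not data or logic from a real system.

```python
from collections import Counter, defaultdict

# Hypothetical transaction rows: (category, product, units_sold).
transactions = [
    ("phones", "phone_a", 3),
    ("phones", "phone_b", 5),
    ("phones", "phone_a", 4),
    ("movies", "movie_x", 2),
    ("movies", "movie_y", 1),
]


def bestsellers(rows):
    """Return {category: product with the highest total units sold}."""
    totals = defaultdict(Counter)
    for category, product, units in rows:
        # Accumulate units per product inside each category,
        # the equivalent of SUM(units) ... GROUP BY category, product.
        totals[category][product] += units
    # most_common(1) yields the top (product, total) pair per category.
    return {cat: counts.most_common(1)[0][0] for cat, counts in totals.items()}


top = bestsellers(transactions)
print(top)
```

Ties are resolved by whichever product was seen first, which is an arbitrary choice; a real setup would probably break ties on recency or margin.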

The next type of recommendation, product affinity, is driven by the popularity of combinations of one action with the previous one (or, if time sequencing is not important, by the affinity between actions). It is built on transaction data; the only customer-relatable attribute is that the analysis is grouped by unique purchase id or customer id.

This approach is again a product-driven strategy and treats every individual with a ‘one size fits all’ approach. While computationally it does not involve heavy lifting, practically speaking it often gets distorted by the outlier effect of isolated purchases. Additionally, if the billing systems are flawed (overlapping unique billing ids, SKUs not defined uniformly in a standard format, missing omni-channel attribution etc.), it takes an exorbitant amount of time to pre-process the data for information discovery before architecting the recommendation engine.

I recall that, while working with a big retailer in India about a decade back, we took almost 80 days just to put the data in order and only about a week to come up with the affinity model. Nevertheless, this was a big hit and was predominantly used for planograms across all 80-odd outlets. This was about a decade after Thomas Blischok’s legendary study of diaper-beer affinity at Osco drug stores (some say it was largely fabricated) was making the rounds in disparate literature sources (in the early 90’s).

While the diaper-beer folklore is still spoken about, the fact remains that the analytics community (including myself) continued to work on product affinity experiments, and it is still a widely used recommendation framework (Amazon’s ‘frequently bought together’, for example). It also happens to be a widely used approach to drive product bundling (e.g. discounted games with a console).

From a data science point of view, there is no complex algorithm involved in affinity analysis, just rule-based pattern recognition (patterns of coexisting products or behaviors) in historical transactions. In other words, the objective is to determine the likelihood (in confidence terms) of co-occurrence in a single invoice times the frequency of such invoices. The more frequent the simultaneous occurrences, the better the support for making a recommendation. The terminology of association analysis revolves around Support (frequency of co-occurrence, i.e. the share of invoices containing both items), Confidence (a sort of pairing reliability, i.e. how often the other item appears given that the first one does) and Lift (a measure of the pairing gain against what would be expected if the occurrences / purchases were independent of each other).
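To make Support, Confidence and Lift concrete, here is a minimal Python sketch that computes them for every item pair using the standard textbook definitions. The invoices and item names are made up for illustration, and the function name pair_metrics is hypothetical; a production setup would more likely use a frequent-itemset library over the cleaned billing data.

```python
from collections import Counter
from itertools import combinations

# Hypothetical invoices: each set holds the items bought together.
invoices = [
    {"console", "game"},
    {"console", "game", "controller"},
    {"console"},
    {"game"},
    {"controller", "game"},
]


def pair_metrics(baskets):
    """Return {(a, b): (support, confidence, lift)} for the rule a -> b."""
    n = len(baskets)
    item_counts = Counter(item for basket in baskets for item in basket)
    # Sorting each basket gives every pair one canonical (a, b) order.
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    metrics = {}
    for (a, b), count in pair_counts.items():
        support = count / n                       # share of invoices with both
        confidence = count / item_counts[a]       # P(b in invoice | a in invoice)
        lift = confidence / (item_counts[b] / n)  # gain over b's base rate
        metrics[(a, b)] = (support, confidence, lift)
    return metrics


metrics = pair_metrics(invoices)
for pair, (support, confidence, lift) in sorted(metrics.items()):
    print(pair, round(support, 2), round(confidence, 2), round(lift, 2))
```

Note that confidence is directional: the sketch only scores the rule a -> b in alphabetical order, so the reverse rule would need item_counts[b] as the denominator. A lift above 1 suggests the pair occurs together more often than independent purchases would, while a lift below 1 suggests the opposite.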