What big data should retailers collect?

Principles of data science in retail

The best way to understand what data to collect is to understand the principles of maximizing profit in both the brick-and-mortar and the online context.

In a brick-and-mortar store, you offer the same layout and shopping experience to all customers, so your goal is to implement the layout that maximizes the chance of a sale for the average customer who walks into that store. A breakdown of sales for each store goes a long way here; you do not need to tie each sale to a customer. That said, there are also ways to personalize the experience in a brick-and-mortar context, for instance if you could identify customers the moment they walk into the store.
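As a minimal sketch of the per-store sales breakdown described above, the Python snippet below totals revenue by store and product category. The row layout, store names, categories, and the function name sales_breakdown are all hypothetical and chosen only for illustration; the point is that no customer identifier is needed.

```python
from collections import defaultdict

# Hypothetical point-of-sale rows: (store_id, product_category, sale_amount).
# There is deliberately no customer identifier in this data.
sales = [
    ("store_a", "shoes", 120.0),
    ("store_a", "shirts", 40.0),
    ("store_a", "shoes", 80.0),
    ("store_b", "shirts", 60.0),
    ("store_b", "shirts", 30.0),
]


def sales_breakdown(rows):
    """Return {store_id: {category: total revenue}} for the given rows."""
    totals = defaultdict(lambda: defaultdict(float))
    for store_id, category, amount in rows:
        totals[store_id][category] += amount
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {store: dict(categories) for store, categories in totals.items()}


breakdown = sales_breakdown(sales)
```

A breakdown like this is one possible input when deciding which categories deserve the most prominent space in a given store.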

In the online context, you can personalize the experience for each customer because you know their identity once they log in, so your goal is to understand the customer as well as possible and show them the right message, at the right time, in the right place.

Data collection

Given the principles above, the data used to understand customers falls into a hierarchy of importance, ranked by how strongly each action correlates with purchase intent (strongest first):

Purchases (what did they buy?)

Add to cart (what did they add to their cart?)

Product click (what products did they click on?)

Product view (what products did they view?)
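The ranking above can be sketched in Python as an ordered enumeration, so that a stronger signal always compares greater than a weaker one. The names Signal, events, and strongest_signal, the customer and product IDs, and the numeric values are assumptions made for illustration; only the ordering of the four actions comes from the list above.

```python
from enum import IntEnum


class Signal(IntEnum):
    """Customer actions ordered by assumed strength of purchase intent."""
    VIEW = 1
    CLICK = 2
    ADD_TO_CART = 3
    PURCHASE = 4


# Hypothetical event log: (customer_id, product_id, signal).
events = [
    ("c1", "sku_1", Signal.VIEW),
    ("c1", "sku_1", Signal.ADD_TO_CART),
    ("c1", "sku_2", Signal.CLICK),
    ("c2", "sku_1", Signal.PURCHASE),
    ("c2", "sku_1", Signal.VIEW),
]


def strongest_signal(rows):
    """Keep the strongest signal seen for each (customer, product) pair."""
    best = {}
    for customer, product, signal in rows:
        key = (customer, product)
        if key not in best or signal > best[key]:
            best[key] = signal
    return best


strongest = strongest_signal(events)
```

Here customer c1 ends up with ADD_TO_CART for sku_1 and CLICK for sku_2, which gives a simple way to prioritize which products to feature for that customer. The integer values only encode order; they are not calibrated weights.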

The most important data to collect is purchase information (and, of course, a means of contacting the customer, such as email or SMS). In brick-and-mortar stores, this is done through loyalty programs, because otherwise you do not know who bought what. In online stores, this information is usually collected already, except when you sell through a third party like Amazon.

The next level of data (add-to-carts, clicks, and views) can be tracked online using clickstream tools like Google BigQuery and Adobe Marketing Cloud. In brick-and-mortar stores, beacons and RFID tags are starting to gain popularity for tracking things like the number of times a product is lifted from the rack (roughly equivalent to a product click) and the amount of traffic in a certain area of the store.

Competitor data

There is another type of data, used to determine pricing, sizing, and product design strategy: information about your own and your competitors' products, such as ratings, reviews, colors, and pricing. In the online context, this is used extensively when selling on marketplaces like Amazon. In the brick-and-mortar context, brands rely on research companies to give them such intelligence.