Big Data

Discover Deep Insights with Frequent Pattern Mining (FPM)

By Jason Lobel • February 06, 2014

Today we are introducing the Frequent Pattern Mining (FPM) algorithm for Swift Predictions. Like Apriori, FPM belongs to the family of unsupervised learning algorithms, and both are among the most common algorithms for discovering frequently co-occurring items in large datasets. Typical examples are discovering which products are frequently bought together in a supermarket, or how some products influence the sale of others.

Apriori is known to have computational limitations when the dataset is large. This is one of the reasons our engineers at SwiftIQ have focused their development efforts on a distributed version of the FP-growth algorithm, which is capable of processing millions of records in a matter of minutes. We first extract behavioral data like orders or web actions through our data collection engine in Swift Access. The data is then passed to our Hadoop cluster for processing by our data mining engine, Swift Predictions. Once this is processed, what can you expect to learn? For instance, if customers are buying a certain brand of chip, you may discover how likely they are to also buy salsa, and what kind of salsa. This algorithm provides deep insight into product associations that affect merchandising, promotions, personalization, store layouts and more. In addition to basket analysis, FPM can be applied to other retail behaviors, such as web click paths and search terms, among others.
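To make the limitation of Apriori concrete, here is a minimal Python sketch of its level-wise search. It is an illustration only, not SwiftIQ's implementation and not FP-growth. The function name `apriori`, the `min_support` parameter (the minimum fraction of orders an itemset must appear in) and the six sample orders are all assumptions made for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset that meets min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    # Level 1 candidates: every distinct item on its own.
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    frequent = {}
    k = 1
    while candidates:
        # One full pass over the data per level; this is the step that
        # becomes expensive when the transaction log is large.
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-item candidates, keeping only
        # those whose k-item subsets are all frequent (the Apriori pruning rule).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

# Hypothetical sample data, not taken from a real point-of-sale log.
orders = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"bread", "milk", "beer"},
    {"beer"},
    {"butter"},
]

frequent = apriori(orders, min_support=0.5)
for itemset, share in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(share, 2))
```

Every level of the search rescans all baskets and the number of candidates can grow quickly with the number of items, which is why approaches such as FP-growth, which avoid generating candidates level by level, tend to scale better.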

In the example below, FPM reveals actual shopping behavior around Miller Lite and Coors Light products. It appears that Miller beer was purchased more frequently with barbecue foods like pork tenderloin, hot dogs, and sweet onions, while Coors was often bought with healthier foods like bean salad, fruit, broccoli, cheese, and low-fat milk. That is highly actionable for merchandising and marketing teams deciding how to promote their various products and to what type of consumer.

Discover Product Affinities: A High Level Tutorial

To understand the purchasing habits for milk, bread, butter and beer, a retailer must create a table of every purchase. In the illustration below, bread was purchased in four of the six orders and three patterns were observed. Bread and milk were the items most frequently purchased together, appearing in about 67% of the total orders, or four of six (the "Support", which is the proportion of transactions that contain an itemset). 1 out of every 6 shoppers purchased bread and milk with either butter or beer. Looking at it another way, 25% of the time that bread and milk were purchased, beer or butter were also bought (the "Confidence", which is the proportion of transactions containing X that also contain Y).
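The Support and Confidence figures above can be reproduced in a few lines of Python. This is a minimal sketch: the six orders below are a hypothetical dataset chosen to be consistent with the numbers quoted (the original table may differ), and the helper names `support` and `confidence` are ours.

```python
# Hypothetical orders, chosen to be consistent with the figures in the text.
orders = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"bread", "milk", "beer"},
    {"beer"},
    {"butter"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Proportion of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)

def confidence(x, y, transactions):
    """Proportion of transactions containing X that also contain Y."""
    x_count = count(x, transactions)
    if x_count == 0:
        return 0.0  # X never occurs, so the rule has no evidence
    return count(set(x) | set(y), transactions) / x_count

bread_milk_support = support({"bread", "milk"}, orders)
butter_support = support({"bread", "milk", "butter"}, orders)
butter_confidence = confidence({"bread", "milk"}, {"butter"}, orders)

print(f"Support(bread, milk) = {bread_milk_support:.0%}")
print(f"Support(bread, milk, butter) = {butter_support:.0%}")
print(f"Confidence(bread, milk -> butter) = {butter_confidence:.0%}")
```

Confidence is computed from raw counts rather than by dividing two Support values, which gives the same result while avoiding small floating-point rounding differences.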

In this example, the dataset is minuscule. However, some supermarkets and big box retailers generate thousands of orders per hour across their entire store footprint, with tens of thousands of items sold in a given store. No human (at least that we know of) is capable of efficiently mining this amount of data to identify associations between products at such a granular level. FPM exposes demand-side purchase trends in what people buy and how they buy it. The only data source required to apply FPM is the point-of-sale transaction log, which is mined to extract the patterns. Stay tuned for future blog posts, which will examine some early case studies.

User Interface

The FPM user interface includes a list of the most frequent patterns and a filter (e.g., to search by item, keyword, etc.).

Visualization

Swift Predictions provides an API for the FPM results and a JavaScript-based plot graph application for convenient interpretation. This visualization is interactive and allows a user to hover over the bubbles to access the Support and Confidence metrics that indicate the magnitude of the behavior and strength of the affinity between itemsets.

Recommendations: Customers Who Bought This Also Bought

A primary use case of the FPM algorithm is to deliver product recommendations to customers who have purchased or expressed intent to purchase an item. The challenge in doing this well in real time is building the infrastructure to process all the “bought together” sets instantly from millions of transactions. These suggestions can be delivered dynamically online or on mobile, in-store via clienteling apps or directly at the register, and even post-transaction by email. For example, when a user browses a product page online, Amazon may suggest 100 related items (via 15 pages), improving relevancy and delivering a more convenient experience.
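One common way to serve "also bought" suggestions quickly is to precompute co-purchase counts offline and only do a lookup at request time. The Python sketch below illustrates that idea under stated assumptions: the function names `build_index` and `also_bought` and the sample orders are hypothetical, and this is not a description of how Swift Predictions or Amazon actually implement recommendations.

```python
from collections import Counter, defaultdict

def build_index(transactions):
    """Offline step: count each item and what it was bought together with."""
    item_counts = Counter()
    co_counts = defaultdict(Counter)
    for basket in transactions:
        items = set(basket)
        item_counts.update(items)
        for item in items:
            co_counts[item].update(items - {item})
    return item_counts, co_counts

def also_bought(item, item_counts, co_counts, top_n=3):
    """Request-time step: rank co-purchased items by confidence item -> other."""
    if item not in co_counts:
        return []
    total = item_counts[item]
    # Sort by co-purchase count (descending), then by name for stable output.
    ranked = sorted(co_counts[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [(other, n / total) for other, n in ranked[:top_n]]

# Hypothetical sample data, not taken from a real point-of-sale log.
orders = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"bread", "milk", "beer"},
    {"beer"},
    {"butter"},
]

item_counts, co_counts = build_index(orders)
suggestions = also_bought("bread", item_counts, co_counts)
print(suggestions)
```

This sketch only considers pairs of items. Mined patterns can involve larger itemsets, and in production the index would typically be rebuilt or updated as new transactions arrive.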

Frequent Pattern Mining Case Study in the Wild: Food Genius

Another noteworthy company gaining ground in detecting patterns in big data is Chicago-based Food Genius, the leader in identifying restaurant menu trends and providing insights solutions to the food industry. We invited Eli Rosenberg, a founder of Food Genius, to share how they use big data and high-scale computing to create new intelligence.

Food Genius is an industry leader in deriving insights from patterns across 50 million unique menu items. By using Food Genius, manufacturers and foodservice operators equip their innovation, marketing and culinary teams with the most relevant stats related to menu items, as well as actionable insights on why those stats matter to them. For example, Food Genius allows users to quickly discover the fastest growing proteins mentioned on pizza and how the use of certain descriptors on menus can justify an increase or decrease in menu pricing. With Food Genius the patterns uncovered are deep and nuanced. The entire platform is based on the actual occurrence of some 30,000 unique and hierarchical terms, comparable to Excel columns representing different product SKUs. Each quarter, Food Genius updates its menu data, essentially adding millions and millions of data points 4 times per year. The interactions among these data points create remarkable patterns that can be tracked quarter over quarter or year over year.

Clients of Food Genius and SwiftIQ have an amazing resource for uncovering what dishes are being offered to diners and how the changing landscape of what is offered will affect consumer decisions and buying patterns, both on premise and in the grocery store. Check out Food Genius to get a hands-on demonstration.

Overview of SwiftIQ

SwiftIQ identifies patterns from large, complex data to deliver intelligence without any human intervention. Swift Access aggregates data from unlimited sources into a single, unified backend. For order data, it can organize and store detailed transaction logs or web sessions in real time or nightly. Swift Predictions offers a catalog of data mining and predictive algorithms as APIs, along with an insights interface and visualization tools. Clients can generate and access these insights through a user interface or via APIs. For more information on how you can realize the value of point-of-sale data, download our free eBook. If you would like to schedule a demo to observe SwiftIQ in action, please contact us today.