Algorithmia Blog - Deploying AI at scale

Mining Product Hunt Part II: Building a Recommender

This blog post is Part II of a series exploring the Product Hunt API. In addition to the Product Hunt and Algorithmia APIs, we use Aerobatic to quickly create the front end of our apps.

Our last post discussed how we acquired the Product Hunt dataset and how we designed a vote-ring detection algorithm, and ended with a live demo that analyzes new Product Hunt submissions for vote-rings. In this post, we briefly explain how collaborative recommendation systems work (using the FP-Growth algorithm) and apply that approach to the Product Hunt dataset in a live demo. As a cherry on top, we made a Chrome extension that augments your Product Hunt browsing with a list of recommendations based on our approach. Read on for the results, demo, and code.

Note About Recommendation Engines

There are two broad approaches to recommendation engines: collaborative filtering and content-based filtering. A hybrid approach marrying the two is also very common.

Collaborative Filtering recommenders depend on the way that users interact with the data – the number of overlapping decisions between users A and B (such as liking a post or buying an item) is used as an indicator to predict the likelihood that user A will make the same decision as user B on future topics. This is similar to Amazon’s “Other users also bought…”.
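
To make the "overlapping decisions" idea concrete, here is a minimal sketch. It assumes each user's decisions are available as a set of item ids and uses Jaccard similarity as the overlap measure. The function names and the sample ids are ours, made up for illustration, and this is only one of several reasonable ways to score overlap.

```python
def overlap_score(votes_a, votes_b):
    """Jaccard similarity: shared decisions divided by all decisions made by either user."""
    union = votes_a | votes_b
    if not union:
        return 0.0  # neither user has voted, so there is no evidence of overlap
    return len(votes_a & votes_b) / len(union)


def candidate_items(target_votes, neighbour_votes):
    """Items the neighbour chose that the target user has not chosen yet."""
    return sorted(neighbour_votes - target_votes)


# Hypothetical upvote sets for two users.
user_a = {"p1", "p2", "p3"}
user_b = {"p2", "p3", "p4"}

print(overlap_score(user_a, user_b))    # 2 shared items out of 4 distinct items
print(candidate_items(user_a, user_b))  # what B liked that A has not seen
```

The higher the overlap score, the more weight user B's remaining choices would get as suggestions for user A.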

Content-based recommenders depend on the inherent attributes of an item: if user A likes item X, then the description, keywords, or color of item X are used to predict the next item to recommend. This analysis can be extended to build a profile of what user A likes and dislikes. This is similar to Pandora’s Music Genome Project.
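
A content-based profile can be sketched in a few lines. The example below assumes every item carries a set of keywords. It builds a keyword profile from the items a user liked and ranks the remaining items by how strongly their keywords match that profile. The catalog, keywords, and function names are hypothetical, and real systems typically use much richer features and weighting.

```python
from collections import Counter


def build_profile(liked_items, catalog):
    """Count how often each keyword appears across the items the user liked."""
    profile = Counter()
    for item in liked_items:
        profile.update(catalog[item])
    return profile


def rank_by_profile(profile, catalog, exclude):
    """Rank unseen items by the summed profile weight of their keywords."""
    scores = {
        item: sum(profile[keyword] for keyword in keywords)
        for item, keywords in catalog.items()
        if item not in exclude
    }
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical catalog: item id -> keywords describing the item.
catalog = {
    "X": {"book", "startup"},
    "Y": {"book", "startup", "management"},
    "Z": {"game", "mobile"},
}

profile = build_profile(["X"], catalog)
print(rank_by_profile(profile, catalog, exclude={"X"}))
```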

Here we are going to build a Collaborative Filtering recommendation engine that is specific to Product Hunt. There are many approaches to achieve this task and we will use the most straightforward one: Affinity Analysis (a variant of Association Rule Learning).

Understanding Association Rules

Businesses realized the power of Association Rules a long time ago, especially for up-selling and cross-selling. For example, a grocery shop might notice that people who buy diapers are also very likely to buy beer (i.e., a strong association), and the manager might therefore decide to place the diapers close to the beer fridge to fuel impulse buying.
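
The strength of an association is usually expressed with two standard measures: support (how often the items appear together) and confidence (how often the second item appears given the first). The sketch below computes both for the diapers-and-beer example. The baskets are toy data we invented for illustration, not real sales figures.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the baskets containing `antecedent`, the fraction that also contain `consequent`."""
    with_antecedent = [basket for basket in transactions if antecedent <= basket]
    if not with_antecedent:
        return 0.0
    both = sum(1 for basket in with_antecedent if consequent <= basket)
    return both / len(with_antecedent)


# Invented shopping baskets, one set per customer visit.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"beer", "chips"},
    {"milk", "bread"},
]

print(support({"diapers", "beer"}, baskets))       # 2 of the 5 baskets
print(confidence({"diapers"}, {"beer"}, baskets))  # 2 of the 3 diaper baskets
```

A rule such as "diapers, therefore beer" is considered strong when both numbers clear thresholds chosen for the dataset at hand.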

We looked into Algorithmia’s catalogue and found two algorithms providing exactly this function: a Weka-based implementation by Aluxian and a direct implementation by Paranoia. Both implement the Frequent Pattern Growth (FP-Growth) algorithm, a popular method for building association rules through a divide-and-conquer strategy. We ran them side by side and decided to go with Paranoia’s implementation.

The algorithm takes a set of transactions as input and produces weighted associations as output. Each transaction is represented as a single line of comma-delimited items. Instead of customers and groceries, our transaction set was the upvotes that Product Hunt users cast across all 16,000+ posts. Each user was represented as a line, and each line contained the ids of the posts that user upvoted.
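
The transformation from raw upvotes to that input format can be sketched as follows. We assume the upvotes are available as (user id, post id) pairs. The function name and the sample data are hypothetical, and the exact input conventions of the FP-Growth implementation may differ in details such as ordering or whitespace.

```python
from collections import defaultdict


def build_transactions(upvotes):
    """Group (user_id, post_id) pairs into one comma-delimited line per user."""
    posts_by_user = defaultdict(set)  # a set ignores duplicate upvotes
    for user_id, post_id in upvotes:
        posts_by_user[user_id].add(post_id)
    # Sort users and post ids so the output is deterministic.
    return [
        ",".join(str(post_id) for post_id in sorted(posts))
        for _, posts in sorted(posts_by_user.items())
    ]


# Hypothetical upvote records: (user, post id).
upvotes = [
    ("alice", 101),
    ("bob", 102),
    ("alice", 102),
    ("bob", 101),
    ("carol", 103),
    ("alice", 101),  # duplicate record, collapsed by the set
]

for line in build_transactions(upvotes):
    print(line)
```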

We Have a Demo

We created a Product Hunt-specific wrapper around the FP-Growth algorithm, which we called Product Hunt Recommender. This algorithm keeps a copy of the Product Hunt dataset and updates it every 24 hours (so it works better on posts that are more than a few days old). It takes a single input (a post id) and returns up to five recommended posts.
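
The shape of that interface (one post id in, up to five post ids out) can be illustrated with a deliberately simplified stand-in. The sketch below does not run FP-Growth and is not the Product Hunt Recommender's actual code. It only counts how often other posts were upvoted by the same users as the input post, which captures pairwise associations but not larger patterns. The function name and the sample data are ours.

```python
from collections import Counter


def recommend(post_id, transactions, limit=5):
    """Return up to `limit` posts most often upvoted together with `post_id`."""
    co_votes = Counter()
    for posts in transactions:
        if post_id in posts:
            for other in posts:
                if other != post_id:
                    co_votes[other] += 1
    # Most frequently co-upvoted first; ties are broken by post id.
    ranked = sorted(co_votes, key=lambda p: (-co_votes[p], p))
    return ranked[:limit]


# Hypothetical transactions: the set of post ids each user upvoted.
transactions = [{1, 2, 3}, {1, 2}, {2, 3}, {1, 4}]

print(recommend(1, transactions))
print(recommend(99, transactions))  # an unknown post yields no recommendations
```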

If you already have an Algorithmia account, you should be able to experiment directly with the algorithm through the web console (only visible if you’re signed in).

It was extremely satisfying to see how well the algorithm worked. For example, applying it to the post ‘The Hard Thing About Hard Things’ (a book about entrepreneurship) returns four other books, all about entrepreneurship as well.

In a real-world scenario, a developer would run the FP-Growth algorithm on their dataset every so often and save the association rules somewhere permanent in the backend. Whenever an item is requested, the app would also look up its strong associations and present them as recommendations. Keep in mind that there are other routes towards a recommendation engine, such as content-based, clustering, or even hybrid solutions.
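
One way to sketch that "mine periodically, look up on demand" pattern is with SQLite from Python's standard library. The table layout, the function names, and the rule tuples below are assumptions made for illustration. The tuples stand in for whatever weighted associations the FP-Growth run produces.

```python
import sqlite3


def save_rules(conn, rules):
    """Replace the stored rules with a fresh batch of (antecedent, consequent, weight)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rules ("
        "antecedent TEXT, consequent TEXT, weight REAL, "
        "PRIMARY KEY (antecedent, consequent))"
    )
    conn.execute("DELETE FROM rules")  # drop the previous batch
    conn.executemany("INSERT INTO rules VALUES (?, ?, ?)", rules)
    conn.commit()


def lookup(conn, item, limit=5):
    """Return the strongest stored associations for `item`, best first."""
    rows = conn.execute(
        "SELECT consequent FROM rules WHERE antecedent = ? "
        "ORDER BY weight DESC, consequent LIMIT ?",
        (item, limit),
    ).fetchall()
    return [row[0] for row in rows]


# An in-memory database keeps the example self-contained;
# a real backend would use a file path or a database server.
conn = sqlite3.connect(":memory:")

# Hypothetical rules produced by a periodic mining job.
save_rules(conn, [("101", "102", 0.9), ("101", "103", 0.4), ("102", "101", 0.8)])

print(lookup(conn, "101"))
```

Separating the mining job from the lookup keeps request handling cheap, since the expensive pattern mining never runs while a user is waiting.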

Product Genius, the Chrome Extension

We thought it would be awesome to turn this into a Chrome extension that adds a new section to the sidebar of Product Hunt posts, which we titled “Other People Also Liked”. Product Hunt already has a built-in “Similar Products” section, and we can’t definitively determine what method they used to implement it. One thing’s for sure: in our limited testing, we found numerous posts where Product Genius returned better results than the built-in version.

We also found instances where Product Hunt’s internal recommender performed better: BitBound.

What Else Can We/You Do?

You can access the code for the Chrome extension, web demo, and the dataset itself from here. You can easily experiment with other approaches using the hundreds of algorithms within the Algorithmia catalogue, such as using Keyword-Set Similarity instead of FP-Growth or a hybrid approach using the two. Let us know if you have any ideas for other algorithms you want us to demonstrate on the dataset by tweeting us at @algorithmia.

Want to create your own analysis? Sign up and get 100,000 credits on us, a special Product Hunt promotion.