Redis as a recommendation engine

"You might be interested in red toilet paper, because you bought blue
and green toilet paper, and people who also bought blue and green
toilet paper tend to also buy red toilet paper".

Recommendation engines are now everywhere, from e-commerce to social networks.

Graph databases like Neo4j are probably the best way to tackle the
problem. But as an alternative, let's try building a recommendation
engine on Redis.

Who bought what, or who follows whom, can be stored in Redis
sorted sets.

Let's take an example: a user buys A, and also B. We increment the
score of member B in the sorted set stored at key A (ZINCRBY). We may
also do it the other way round, incrementing the score of A in the
sorted set stored at key B.
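
As a rough illustration of the bookkeeping described above, here is a
minimal sketch in plain Python that models each sorted set as a
dictionary of member -> score. The names `related` and
`record_purchase` are made up for this example; on a real server each
increment would be a ZINCRBY call.

```python
from collections import defaultdict
from itertools import permutations

# related[item] models the Redis sorted set stored at key `item`:
# a mapping of member -> score (hypothetical in-memory stand-in).
related = defaultdict(lambda: defaultdict(float))

def record_purchase(basket):
    """Bump the co-purchase score for every ordered pair in a basket.

    Each increment corresponds to the Redis command: ZINCRBY a 1 b
    Using ordered pairs updates both directions (A -> B and B -> A).
    """
    for a, b in permutations(set(basket), 2):
        related[a][b] += 1

record_purchase(["blue", "green"])
record_purchase(["blue", "green", "red"])
```

After these two purchases, the set for "blue" holds "green" with a
score of 2 and "red" with a score of 1.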

This is a simple way to keep track of which items have been bought,
which items are related to them, and how often each pair was bought
together.

On a social network, the score associated with a relationship may
reflect the level of trust.

Let's suppose we have the following relationships:

B -> C
D -> C
E -> F
A -> B
A -> D
A -> E

C is referenced by B and D. B and D are referenced by A.
But C isn't referenced by A yet, so C is presumably worth suggesting
to A.

F might also be a good candidate as a suggestion for A. However, since
there's only a single path from A to F (through E), it's probably less
relevant than C.
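
The reasoning above amounts to counting two-hop paths from A. Here is
a small sketch in plain Python, with the relationships from the
example stored as dictionaries and every score assumed to be 1. The
names `follows` and `two_hop_counts` are hypothetical.

```python
from collections import Counter

# The example relationships, one "sorted set" per source node.
# Every score is assumed to be 1 for this illustration.
follows = {
    "A": {"B": 1, "D": 1, "E": 1},
    "B": {"C": 1},
    "D": {"C": 1},
    "E": {"F": 1},
}

def two_hop_counts(graph, source):
    """Sum the scores of everything referenced by the source's neighbors."""
    counts = Counter()
    for neighbor in graph.get(source, {}):
        for candidate, score in graph.get(neighbor, {}).items():
            counts[candidate] += score
    return counts

paths = two_hop_counts(follows, "A")
# paths["C"] == 2 (via B and D), paths["F"] == 1 (via E)
```

With unit scores, the summed score of a candidate is simply the number
of paths leading to it, which is why C ranks above F.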

In order to limit the number of irrelevant results, we may want to
add a threshold: nodes that have fewer possible paths than a cutoff
value shouldn't be suggested.

How can we do that with Redis, and only Redis?

Here's one possible way:

Use ZUNIONSTORE in order to build a temporary aggregate of all
related items (scores are summed up), and remove the reference item
from the set (cyclic links may have put it there).

Remove everything from the aggregate that is already directly
related to the reference item.
One way to achieve this on a sorted set is to use ZUNIONSTORE with
AGGREGATE MIN and a negative or zero weight for the set we want to
delete, so that its members end up with a score of zero or less
(assuming the original scores are positive). Then use
ZREMRANGEBYSCORE to actually remove every member of the reference
set along with everything below the cutoff score.
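
To make the two steps concrete, here is a sketch that models them in
plain Python rather than against a live server. The helpers `zunion`
and `suggest` are hypothetical stand-ins, and the Redis commands they
imitate are named in the comments. The graph is the example from
above, plus two extra links (B -> A and D -> B) that are assumptions
added only to exercise the cyclic-link and already-related cases.

```python
def zunion(sets, weights, aggregate):
    """Plain-Python model of ZUNIONSTORE over member -> score dicts.

    Each score is multiplied by its set's weight, then scores of the
    same member are combined with `aggregate` (sum or min here).
    """
    out = {}
    for zset, weight in zip(sets, weights):
        for member, score in zset.items():
            weighted = score * weight
            out[member] = aggregate(out[member], weighted) if member in out else weighted
    return out

def suggest(graph, source, cutoff):
    """Return candidates for `source` with a score >= cutoff (cutoff > 0)."""
    direct = graph.get(source, {})
    # Step 1: ZUNIONSTORE tmp <n> <neighbor keys...>  (default AGGREGATE SUM)
    neighbor_sets = [graph.get(n, {}) for n in direct]
    tmp = zunion(neighbor_sets, [1] * len(neighbor_sets), lambda a, b: a + b)
    # ZREM tmp <source>  (drop cyclic links back to the reference item)
    tmp.pop(source, None)
    # Step 2: ZUNIONSTORE tmp 2 tmp <source> WEIGHTS 1 0 AGGREGATE MIN
    # Members of the reference set are forced down to a score of 0.
    tmp = zunion([tmp, direct], [1, 0], min)
    # ZREMRANGEBYSCORE tmp -inf (<cutoff>  (exclusive upper bound)
    return {m: s for m, s in tmp.items() if s >= cutoff}

follows = {
    "A": {"B": 1, "D": 1, "E": 1},
    "B": {"C": 1, "A": 1},  # B -> A is an assumed cyclic link
    "D": {"C": 1, "B": 1},  # D -> B is an assumed link to a direct relation
    "E": {"F": 1},
}

strict = suggest(follows, "A", cutoff=2)  # only C survives
loose = suggest(follows, "A", cutoff=1)   # C and F survive
```

Note that the trick relies on the cutoff being strictly positive:
members of the reference set end up with a score of 0, so any
positive cutoff sweeps them away together with the weak candidates.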