For Us

In the era of Netflix, there’s a wide array of films available to watch online. One of the features that made Netflix so successful was its ability to recommend new movies. This ability is equivalent to answering the following questions:

If a person likes a particular movie, what are some similar movies? If a person likes a given set of movies (and rates them accordingly) what is a good estimate of their rating of another movie?

One method for answering these questions is called K-Nearest-Neighbors, or KNN. Basically, we use a function that calculates the ‘distance’ between two movies. Distance is in quotes because it’s not entirely clear how to do this, and in fact there are multiple methods. Our method was to compare how similar the ratings were for those two movies over all users, using the Euclidean distance between their ratings. So if movie1 was rated 3 by user1 and 4 by user2, and movie2 was rated 2 by user1 and 4 by user2, then the distance between movie1 and movie2 would be sqrt((3 - 2)^2 + (4 - 4)^2) = 1.
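To make the distance calculation concrete, here is a minimal Python sketch of it. This is not our actual homework code: the function name movie_distance and the dictionary layout (user name mapped to rating) are made up for illustration, and it assumes we only compare users who rated both movies, which is one of several reasonable choices.

```python
import math


def movie_distance(ratings_a, ratings_b):
    """Euclidean distance between two movies' ratings.

    Each argument maps a user to that user's rating of the movie.
    Assumption: only users who rated BOTH movies are compared.
    """
    shared_users = ratings_a.keys() & ratings_b.keys()
    squared_diffs = ((ratings_a[u] - ratings_b[u]) ** 2 for u in shared_users)
    return math.sqrt(sum(squared_diffs))


# The two movies from the example in the text.
movie1 = {"user1": 3, "user2": 4}
movie2 = {"user1": 2, "user2": 4}

print(movie_distance(movie1, movie2))  # 1.0
```

Skipping users who rated only one of the two movies keeps the sketch simple, but it also means two movies with no raters in common get a distance of 0, so a real version would need to handle that case separately.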

Once we have the distances from a given movie to every other movie, we just return the K movies with the lowest distances as our recommendations!
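The recommendation step could look something like the sketch below. Again, this is an illustration under assumptions rather than the program we turned in: the names distance and recommend are hypothetical, the ratings for movie3 and movie4 are invented test data, and the distance only looks at users who rated both movies.

```python
import heapq
import math


def distance(ratings_a, ratings_b):
    """Euclidean distance over users who rated both movies (an assumption)."""
    shared_users = ratings_a.keys() & ratings_b.keys()
    return math.sqrt(sum((ratings_a[u] - ratings_b[u]) ** 2 for u in shared_users))


def recommend(target, ratings, k=2):
    """Return the k movies closest to `target`, nearest first."""
    candidates = (title for title in ratings if title != target)
    return heapq.nsmallest(
        k, candidates, key=lambda title: distance(ratings[target], ratings[title])
    )


# movie1 and movie2 come from the example in the text; movie3 and movie4
# are made-up ratings so there is something to rank.
ratings = {
    "movie1": {"user1": 3, "user2": 4},
    "movie2": {"user1": 2, "user2": 4},
    "movie3": {"user1": 5, "user2": 1},
    "movie4": {"user1": 4, "user2": 2},
}

print(recommend("movie1", ratings, k=2))  # ['movie2', 'movie4']
```

Here movie2 is at distance 1, movie4 at about 2.24, and movie3 at about 3.61 from movie1, so the two nearest neighbors are movie2 and movie4.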

For our homework assignment yesterday, we had to write a program that performed this sort of analysis. Here’s an example of the output: