CS2514 Lab 6

Exercise

movies.txt
This file contains 10 movies, each with a unique id (from 0) and a title.

customers.txt
This file contains 10 customers, each with a unique id (from 0) and a name.

ratings.txt
Each line of this file has 3 numbers separated by | characters: a customer id, a movie id and a star-rating (from 1 to 5).
For example, the first line of the file, 0|0|5, shows that customer 0 has given
5 stars to movie 0; the last line of the file, 9|8|4, shows that customer 9 has
given 4 stars to movie 8.
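
A line in this format can be split on the | character. The sketch below is one possible way to do that; the class name Rating, its fields and the parse method are illustrative assumptions, not something the lab prescribes.

```java
/** One line of ratings.txt. Class and field names are illustrative, not required by the lab. */
final class Rating {
    final int customerId;
    final int movieId;
    final int stars;

    Rating(int customerId, int movieId, int stars) {
        this.customerId = customerId;
        this.movieId = movieId;
        this.stars = stars;
    }

    /** Parses a line such as "0|0|5" into its three numbers. */
    static Rating parse(String line) {
        // split takes a regular expression, and '|' means "or" in a regex, so it is escaped.
        String[] parts = line.trim().split("\\|");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected customer|movie|stars but got: " + line);
        }
        return new Rating(
                Integer.parseInt(parts[0].trim()),
                Integer.parseInt(parts[1].trim()),
                Integer.parseInt(parts[2].trim()));
    }
}
```

For example, Rating.parse("9|8|4") gives a Rating whose customerId is 9, movieId is 8 and stars is 4.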

To recommend a movie to a customer, you score all the movies they have not already
rated, and recommend the one with the highest score. (If more than one movie shares the
highest score, then pick the first.)
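
The selection step can be sketched as a single pass over the scores. This is only an illustration: the class BestPick, the method firstBest and the use of two arrays indexed by movie id are assumptions, and "the first" is taken here to mean the lowest movie id.

```java
final class BestPick {
    /**
     * Returns the index of the highest score among movies the customer has not rated,
     * or -1 if every movie has already been rated.
     */
    static int firstBest(double[] scores, boolean[] alreadyRated) {
        int best = -1;
        for (int m = 0; m < scores.length; m++) {
            if (alreadyRated[m]) {
                continue; // never recommend a movie the customer has already rated
            }
            // Strict '>' keeps the earlier movie when two scores tie.
            if (best == -1 || scores[m] > scores[best]) {
                best = m;
            }
        }
        return best;
    }
}
```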

How do you score the movies? Consider the people who have rated the movie. Take a
weighted average of their star ratings for that movie. That's the score. Here it is in
symbols. Suppose we want to recommend a movie to customer c. We score each movie
m that c has not rated using the following:
$$score(c, m) = \frac{\sum_{x \in S} w(c, x) \times r(x, m)}{\sum_{x \in S} w(c, x)} $$
where $S$ is the set of people who have rated movie $m$ and $r(x, m)$ is the number of stars
$x$ has given to $m$. (If $\sum_{x \in S} w(c, x)$ is zero, then the score is zero.)
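
As a rough illustration of the score formula, the sketch below computes the weighted average from weights and star ratings that have already been worked out. The class WeightedAverage and the parallel-array layout are assumptions made for the example only.

```java
final class WeightedAverage {
    /** weights[i] is w(c, x) and stars[i] is r(x, m) for the i-th customer x in S. */
    static double score(double[] weights, int[] stars) {
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        for (int i = 0; i < weights.length; i++) {
            weightedSum += weights[i] * stars[i];
            weightTotal += weights[i];
        }
        // The score is defined as zero when the weights sum to zero,
        // which also covers the case where nobody has rated the movie.
        return weightTotal == 0.0 ? 0.0 : weightedSum / weightTotal;
    }
}
```

With weights 1.0 and 0.5 and star ratings 5 and 2, the score is (1.0 × 5 + 0.5 × 2) / (1.0 + 0.5) = 4.0.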

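What do we use for the weights, $w(c, x)$? We use the inverse of one plus the Euclidean distance between the two customers' ratings, so the weight is 1 when their ratings agree exactly and shrinks towards 0 as they diverge: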
$$w(c, x) = \frac{1}{1 + \sqrt{\sum_{m \in M} (r(c, m) - r(x, m))^2}} $$
where $M$ is the set of movies that both $c$ and $x$ have rated, i.e. the
movies they have in common.
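
The weight formula might be coded along the following lines. This is a sketch under assumptions: the class Similarity is hypothetical, and each customer's ratings are assumed to be held in a Map from movie id to stars, which is only one of several possible representations.

```java
import java.util.Map;

final class Similarity {
    /** Each map goes from movie id to the stars one customer gave that movie. */
    static double weight(Map<Integer, Integer> ratingsOfC, Map<Integer, Integer> ratingsOfX) {
        double sumOfSquares = 0.0;
        for (Map.Entry<Integer, Integer> entry : ratingsOfC.entrySet()) {
            Integer starsFromX = ratingsOfX.get(entry.getKey());
            if (starsFromX != null) { // the movie is in M: both customers rated it
                double diff = entry.getValue() - starsFromX;
                sumOfSquares += diff * diff;
            }
        }
        return 1.0 / (1.0 + Math.sqrt(sumOfSquares));
    }
}
```

Note that, as the formula is written, two customers with no movies in common have a distance of zero and therefore a weight of 1.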

Your program must contain a main method in a class called
MovieRecommenderTester, but otherwise you decide what other classes you want.

If you wish, you can assume that there will be 10 customers (0-9) and 10 movies (0-9),
allowing you to use arrays of size 10. However, you can gain more credit by being more
general and allowing arbitrary numbers of customers and movies, which makes arrays less
suitable. (Note though that this is quite a bit more difficult.)

The number of ratings per customer can vary and so storing each customer's ratings in arrays
will not be the best solution. (In real recommenders, there are many customers and movies
but relatively few ratings per customer, making arrays even less suitable.) Consider
using, e.g., lists for these instead.

The files are text files. To read these in, you will need a FileReader and,
optionally, a BufferedReader.
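
A minimal sketch of reading a text file line by line is shown below. The class LineLoader and the method readLines are hypothetical names; wrapping the FileReader in a BufferedReader is what makes readLine available.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class LineLoader {
    /** Reads every non-blank line of the named file into a list. */
    static List<String> readLines(String fileName) throws IOException {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if an exception is thrown.
        try (BufferedReader in = new BufferedReader(new FileReader(fileName))) {
            String line;
            while ((line = in.readLine()) != null) { // readLine returns null at end of file
                if (!line.trim().isEmpty()) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }
}
```

A call such as LineLoader.readLines("ratings.txt") would then return one String per rating, ready to be split into its fields.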

When reading in the data from the files, you'll need to convert Strings to
ints. This is done using Integer.parseInt, which is a class (static)
method. It throws a NumberFormatException if the String is not a valid integer,
so put the call inside a try with a catch for that exception.
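
The conversion might look like the sketch below. The class SafeParse and the method tryParseInt are hypothetical, and returning an OptionalInt is just one way of signalling a bad value; printing a message and skipping the line would also be reasonable.

```java
import java.util.OptionalInt;

final class SafeParse {
    /** Returns the parsed int, or an empty OptionalInt if the text is not a valid integer. */
    static OptionalInt tryParseInt(String text) {
        try {
            return OptionalInt.of(Integer.parseInt(text.trim()));
        } catch (NumberFormatException e) {
            // Reached for input such as "4x" or an empty string.
            return OptionalInt.empty();
        }
    }
}
```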