Question/Answers @ RecSys Doctoral Symposium 2008

I came across an interesting blog post by @HDrachsler, who I started following on Twitter after this year’s RecSys conference. The post contains a recording of the question/answer time at the RecSys doctoral symposium (which I unfortunately did not attend). The clearest voice in the recording is Prof. Joseph Konstan, who (obviously, I know) has some very interesting things to say about collaborative filtering, recommender system research, and the state of the field. Here are some notes that I jotted down while I was listening:

As before, what is a recommender system? A personalised system for people who have an excess of choice, or more alternatives than can be manually reviewed. In terms of research, there are overlaps with HCI, psychology, user modeling, AI, data mining, web services, knowledge management: in fact, there is more overlap than true centre in the field. Prof Konstan recommends the book “Nudge,” which looks at how to design the choices that somebody sees so that they make choices they are happier with; this may capture the spirit of what recommender systems aim to achieve.

The name “recommender system” was first proposed at a workshop on collaborative filtering; this is why there is an obvious bias toward collaborative filtering in the literature. Prof Konstan also makes an interesting point related to the structure of research institutions: the risk and time investment required to do something new and different makes it easy to keep pushing forward what has already been done (i.e. twisting an algorithm to be 0.001 better than the last one, rather than exploring other means of recommendation).

The initial view of collaborative filtering was that well-defined user models were not needed: statistics and good machine learning methods would take care of the implicit relations in the data. We don’t need to know about our users, just about their preferences. However, it is not clear that statistics can and will outperform user modeling in all situations, especially given that we will never have perfect/infinite information about all users in the system.
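To make the "preferences only" view concrete, here is a minimal sketch of user-based collaborative filtering. It is my own illustration, not something presented in the recording: the toy ratings, the names `RATINGS`, `cosine_similarity` and `predict`, and the choice of cosine similarity are all assumptions. The point is only that the prediction uses nothing but ratings; no user model appears anywhere.

```python
from math import sqrt

# Hypothetical toy data (not from the talk): user -> {item: star rating}.
RATINGS = {
    "ann": {"alien": 5, "brazil": 3, "casablanca": 4},
    "bob": {"alien": 4, "brazil": 2, "casablanca": 5, "dune": 4},
    "cat": {"alien": 1, "brazil": 5, "dune": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def predict(ratings, user, item):
    """Similarity-weighted average of the other users' ratings for `item`.

    Returns None when nobody comparable has rated the item.
    """
    weighted, total = 0.0, 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        sim = cosine_similarity(ratings[user], theirs)
        weighted += sim * theirs[item]
        total += sim
    return weighted / total if total else None


# Ann has not rated "dune"; the estimate leans toward Bob, whose tastes
# match hers more closely than Cat's do.
prediction = predict(RATINGS, "ann", "dune")
```

With so few ratings the estimate rests on two neighbours, which is exactly the situation where the "statistics will take care of it" assumption looks weakest.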

Evaluating recommender systems: the question is “what measure should we be optimising toward?” This was discussed in a previous post; here the candidates mentioned include confidence that the user is seeing the best alternatives, optimising for diversity, and optimising for user experience. Prof Konstan cites some very interesting research he was involved in, where they measured the difference between prediction error and user experience. In their work, a difference of 0.3 stars between predicted and actual ratings was invisible to the user. We need to move away from improving prediction accuracy and fine-tuning our algorithms to suit particular environments and toward more interesting problems: dealing with less data, measuring quality difference, diversity, serendipity, confidence. Someone asks how they can show that their new method will make companies more money. Short answer: you can’t, but you can show how previous advances in this research have directly translated to revenue increases in companies.
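As a rough illustration of how these measures differ, the sketch below computes mean absolute error, the share of predictions that fall within a tolerance of the true rating, and a simple intra-list diversity score. Everything here is an assumption on my part: the function names, the toy numbers, the use of 0.3 stars as a per-prediction tolerance (borrowing the figure quoted above), and Jaccard distance over genre tags as the diversity measure. None of it is taken from the research Prof Konstan describes.

```python
from itertools import combinations


def mean_absolute_error(predicted, actual):
    """Average absolute gap, in stars, between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def share_within(predicted, actual, tolerance=0.3):
    """Fraction of predictions no further than `tolerance` stars from the truth."""
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(actual)


def intra_list_diversity(tag_sets):
    """Mean pairwise Jaccard distance between the tag sets of recommended items."""
    pairs = list(combinations(tag_sets, 2))
    if not pairs:
        return 0.0

    def distance(a, b):
        union = a | b
        return 1.0 - len(a & b) / len(union) if union else 0.0

    return sum(distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical ratings and predictions from two imaginary algorithms.
actual = [4, 3, 5, 2]
algo_a = [3.5, 3.5, 4.5, 2.5]
algo_b = [3.75, 3.25, 4.75, 2.25]

mae_a = mean_absolute_error(algo_a, actual)
mae_b = mean_absolute_error(algo_b, actual)
within_a = share_within(algo_a, actual)
within_b = share_within(algo_b, actual)

# Hypothetical genre tags for a three-item recommendation list.
diversity = intra_list_diversity(
    [{"scifi", "horror"}, {"scifi"}, {"romance", "drama"}]
)
```

The two accuracy numbers answer "how close are the predictions?", while the diversity score says something about the list the user actually sees; an algorithm can improve one without moving the other.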

The best way to learn about the above is to study real applications and systems. Can we study non-movie ratings? This way we will be able to find out where generalisations of previous work are wrong and revisit old approaches in new contexts. The lack of data is the main obstacle to this; perhaps this reinforces the strong relationship RecSys wants to keep building with both industrial and startup companies working in this area.

One example of a different domain is recommendations that have a time-dependency, both from the perspective of time-dependent items (news) and development-dependent items (such as recommending wine to users as they learn about it). These systems need to be able to predict changes in preferences, and react to patterns of change. I found this interesting as it reflects the implicit feedback loop in recommender systems (and more generic decision support systems) that is often not considered.

The “small recommender” problem. In the cases where very little data is available, processing it via algorithmic methods may not be the most suitable approach. Why not visualise it somehow, or get more data from other sources? Prof Konstan discusses buying datasets, releasing recommender systems after they have been calibrated with initial data, and getting knowledge of how the field and users work.