Wednesday, January 8, 2014

Netflix and Its People Problem

Felix Salmon recently wrote an excellent piece detailing
some current problems with the Netflix model. Studios have all the bargaining
power: anytime they see big profit numbers generated by streaming providers,
they can simply demand higher prices for their content. This is why you see
Netflix and HBO racing to create their own content, trying to escape
this intensifying bidding war.

As a result of these bidding sprees, Netflix has begun
to lose out on content quality. To compensate, Netflix’s
recommendation algorithms have had to get more sophisticated, trying to
determine preference patterns across a landscape devoid of quality. Without
high-quality content, Netflix now has to grope around a dark room of content,
using touch and feel in lieu of more accurate vision, leading to bumps,
bruises, and the constant recommendation of Iron Man 2.

This approach runs into two related problems.
First, as my previous post alluded to, individuals do not have innate preferences
for many goods and experiences. Let’s say someone described everything there
was to know about ice cream, from the sweet sensation of the cream as it melts
on your tongue to the molecular structure of cream, sugar, and ice particles.
Would you then be able to predict whether you would like it or dislike it? Well,
I like cold things, like snow, but ice cream isn’t really snow. I like milk
a lot, but what about that sugar and those flavorings? Would your enjoyment of
the individual parts of ice cream guarantee that you were going to love the
whole? This is what Netflix’s new recommendation system is betting on: that it
can tease apart the differential aspects of movies and triangulate stable preferences.
Unfortunately, very subtle things can change the experience of an event.
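
Stripped down, that bet looks something like a weighted sum over movie attributes. The sketch below is purely hypothetical: the attribute names, weights, and scores are invented for illustration, and Netflix's real system is far more elaborate (and proprietary).

```python
# Hypothetical attribute-decomposition model: score a movie as a
# weighted sum of its parts and hope the whole follows.

def predict_rating(movie_features, user_weights, baseline=3.0):
    """Predict a rating from per-attribute weights (invented numbers)."""
    return baseline + sum(
        user_weights.get(attr, 0.0) * value
        for attr, value in movie_features.items()
    )

# A viewer who, per the model, likes action and dislikes sequels.
weights = {"action": 0.8, "romance": -0.2, "sequel": -0.5}

iron_man_2 = {"action": 1.0, "romance": 0.2, "sequel": 1.0}
say_anything = {"action": 0.0, "romance": 1.0, "sequel": 0.0}

print(predict_rating(iron_man_2, weights))
print(predict_rating(say_anything, weights))
```

The model happily ranks Iron Man 2 above Say Anything for this viewer, which is exactly the ice-cream problem: enjoyment of the parts is assumed to add up to enjoyment of the whole.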

A now-famous study by Dan Ariely highlights the malleability
of experience.
He opened a class with a brief reading of a Walt Whitman poem and told
students he would be doing a few short poetry readings one evening. The class
was then split into two groups. The first group was told the show would cost
$10, and asked whether they would accept that and what they would be
willing to pay to see the show. The second group was told that they would be
paid $10 to see the show, and then asked the same set of questions. The first
group said they would be willing to pay $1 to $5, while the second group
said they would go only if paid $1 to $4. Note that
the group that had been asked to pay $10 could instead have demanded to be paid
to attend, but they did not. Preferences for goods, especially
experiential goods, are highly context dependent.

I’m not denying that certain dispositional tastes exist. I
habitually watch sci-fi movies. I have seen more outer-space prison-break
movies than movies made before 1960. But that is a whole different ballgame
from presuming a taste as specific as "Foreign Satanic Stories from the 1980s." I like many romantic comedies from the late 80s and early 90s, but to say this links Say Anything and Pretty Woman seems a bit of a stretch.

This strategy runs into
a second problem: overfitting. When sampling from complex,
feedback-driven systems, the model used must be very robust to future
deviations. Gerd Gigerenzer gave a great talk on the robustness of simple
models. The two graphs below illustrate the problem.

The first
graph shows two different models fitted to yearly temperature data. One model
is a 12th-degree polynomial; the other is a 3rd-degree polynomial. As
can be seen, the 12th-degree polynomial has lower error: it fits the
temperature data better. However, as the second graph shows, the predictive
ability of the 12th-degree polynomial is much worse. The more one attempts to boil complex systems
down to single, sweeping algorithms, the more problems one runs into out of sample.
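
The same phenomenon is easy to reproduce. The sketch below uses invented data (a smooth seasonal curve plus noise, not the dataset from Gigerenzer's talk) and fits the two polynomial degrees mentioned above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for yearly temperature data: a smooth seasonal
# curve plus noise. This is made-up data, not Gigerenzer's.
x = np.linspace(-1, 1, 24)
signal = 10 + 8 * np.sin(np.pi * x)
y = signal + rng.normal(0, 1.5, size=x.shape)

# Fit a very flexible model and a simple one to the same noisy points.
complex_fit = np.polyfit(x, y, deg=12)
simple_fit = np.polyfit(x, y, deg=3)

def mse(coeffs, xs, ys):
    """Mean squared error of a fitted polynomial on the given points."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))

# In sample, the 12th-degree polynomial always wins: it bends to
# chase the noise in the observed points.
print(mse(complex_fit, x, y), mse(simple_fit, x, y))

# Out of sample (fresh draws from the same process), those extra
# wiggles typically hurt, and the simple model tends to predict better.
x_new = np.linspace(-0.95, 0.95, 24)
y_new = 10 + 8 * np.sin(np.pi * x_new) + rng.normal(0, 1.5, size=x_new.shape)
print(mse(complex_fit, x_new, y_new), mse(simple_fit, x_new, y_new))
```

The in-sample win for the flexible model is guaranteed (a 12th-degree fit can do everything a 3rd-degree fit can and more); the out-of-sample loss is the price paid for memorizing noise.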

Now it might seem I have painted myself into a corner. I
have stated that simple models predict complex systems well, while
simultaneously acknowledging that simple stimulus-response systems such as
Netflix’s are insufficient in the face of this complexity. What I am trying to show is that there is no
one algorithm to rule them all. How can simple models be used in the human
realm? By utilizing our already existing pattern-recognition devices: other
people’s brains. When we supplement these algorithms with expert human judgment, we
see increased success. Salmon writes another article on exactly this subject:

“Nate Silver himself has written thoughtfully about examples
of this in his book, The Signal and the Noise. He cites baseball, which in the post-Moneyball era adopted a “fusion approach” that
leans on both statistics and scouting. Silver credits it with delivering the
Boston Red Sox’s first World Series title in 86 years. Or consider weather
forecasting: The National Weather Service employs meteorologists who,
understanding the dynamics of weather systems, can improve forecasts by as much
as 25 percent compared with computers alone. A similar synthesis holds in eco­nomic
forecasting: Adding human judgment to statistical methods makes results roughly
15 percent more accurate. And it’s even true in chess: While the best computers
can now easily beat the best humans, they can in turn be beaten by humans aided
by computers.”

He gives short shrift to where I believe this might be most
ground-breaking: the experiential-good market. There is already a model for
supplementing algorithmic results with human judgment, and that is “A Better Queue,” a very simple website that links Rotten Tomatoes and Netflix. The key here is
that a person can pre-select the categories and rating criteria that pass
through their filter. We have not only human reviewers as a filtering scheme,
but also the person who will be experiencing the good itself. I think this is
why online dating sites are so successful, as well. For all the hype around the
algorithms used to generate potential matches, these systems have an ultimate
backstop: people generally have to talk to one another before any
real commitment is made. So while an algorithm gets people to the door, it is a
person who is tasked with figuring out whether this is the right one for them. These
are the types of systems I believe will be key to these markets developing in the
future. Ignoring how human judgment and relationships help to create a good
fit for individuals will mean these systems continue to be sub-optimal
recommendation schemes.
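
The filtering scheme described above can be sketched in a few lines. This is a hypothetical reconstruction (the titles, genres, and scores are made up, and A Better Queue's actual implementation is unknown to me): critics' scores do the quality screening, and the viewer's own category choices do the rest.

```python
# Hypothetical "A Better Queue"-style filter: human reviewers set the
# quality bar, the viewer pre-selects the categories that pass through.
catalog = [
    {"title": "Say Anything", "genres": {"romance", "comedy"}, "critic_score": 98},
    {"title": "Iron Man 2", "genres": {"action", "sci-fi"}, "critic_score": 72},
    {"title": "Moon", "genres": {"sci-fi", "drama"}, "critic_score": 90},
]

def better_queue(catalog, wanted_genres, min_score):
    """Keep titles that match any chosen genre and clear the critics' bar."""
    return [
        movie["title"]
        for movie in catalog
        if movie["genres"] & wanted_genres and movie["critic_score"] >= min_score
    ]

print(better_queue(catalog, {"sci-fi"}, 85))
```

Note that the algorithm here is trivial; the intelligence lives in the two human layers, the reviewers who scored the titles and the viewer who chose the criteria.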