Autonomy, technology and prediction I: some conceptual remarks

“How would you feel if a computer could predict what you would buy, how you would vote and what kinds of music, literature and food you would prefer with an accuracy that was greater than that of your partner?”

Versions of this question have been thrown at me in different fora over the last couple of months. It contains much to be unpacked, and turns out to be a really interesting entry into a philosophical analysis of autonomy. Here are a few initial thoughts.

We don’t want to be predictable. There is something negative about that quality that is curious to me. While we sometimes praise predictability, we then call it reliability, not predictability. Reliability is a relational concept – we feel we can rely on someone, but predictability is something that has nothing to do with relationships, I think. If you are predictable, you are in some sense a thing, a machine, a simple system. Predictable people lose some of their humanity. Take an example from popular culture – the hosts in Westworld. They are caught in loops that make them easy to predict, and in a key scene Dr Ford expresses his dislike for humanity by saying that the same applies to humans: we are also caught in our loops.

The flip side of that, of course, is that no one would want to be completely unpredictable. Someone who at any point may throw themselves out the window, start screaming, steal a car or disappear into the wilderness to write poetry would also be seen as less than human. Humanity is a concept associated with a mix of predictability and unpredictability. To be human is to occasionally surprise others, but also to be relied upon for some things.

To be predictable is often associated with being easy to manipulate. The connection between the two is not entirely clear cut, since it does not automatically follow from someone being predictable that they can be manipulated.

One way to think about this is to think about the role of predictability in game theory. There are two perspectives here. The first is that in order to make credible threats, you need to be predictable in the sense that you will enforce those threats under the circumstances you have defined. There are even techniques for this – you can create punishments for yourself, like the man who reputedly gave his friend 10 000 USD to donate to the US national socialist party (a party the man hated) if his friend ever saw him smoking. Commitment to a cause is nothing other than predictability. The second, following Schelling, is that a certain unpredictable quality is also helpful in a game, when the rational thing to do is what favors the enemy. One apocryphal anecdote about Herman Kahn, who advocated thermonuclear war as a possibility, is that he was paid to do this so as to keep the Soviets guessing whether the US really could be crazy enough to entertain the idea of such a complete war. In games it is the shift between predictability and unpredictability – the bluff! – that matters.

But let’s return to the question. How would we feel? Would it matter how much data the computer needed to make its predictions? Would we feel worse or better if it was easier to predict us? Assume it took only 200 likes from a social network to make these predictions – would that be horrifying or calming to you? The first reaction here may be that we would feel bad if it was in some sense easy to predict us. But let’s consider that: if it took only 200 likes to predict us, the predictions would be thin, the prediction horizon would be short, and we could change easily. Let’s pause and examine these two concepts, thinness and horizon, as I think they are important.

A prediction horizon is the length of time for which I can predict something. In predicting the weather, one question is for how long we can predict it – for a day? For a few days? For a year? Anyone able to do that – predict the weather accurately for a year – would have accomplished something quite amazing. But predicting the weather tomorrow? You can do that with 50% accuracy by saying that tomorrow will be like today. Inertia helps. The same phenomenon applies to the likes. If you are asked to predict what someone will do tomorrow, looking at what they did today is going to give you a pretty good idea. But it is not going to be a very powerful prediction, and it is not one that in any real sense threatens our autonomy.
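The inertia point can be made concrete with a toy sketch in Python. Everything in it is an assumption for illustration: the function name `persistence_accuracy`, the `horizon` parameter and the ten-day weather list are made up, and the numbers say nothing about real weather. It only shows the shape of the argument: "the future will be like today" does reasonably well one day ahead and worse as the horizon grows.

```python
def persistence_accuracy(observations, horizon=1):
    """Share of cases where 'in `horizon` days it will be like today' was right."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1 day")
    if len(observations) <= horizon:
        raise ValueError("need more observations than the horizon")
    # Pair each day with the day `horizon` steps later.
    pairs = list(zip(observations, observations[horizon:]))
    hits = sum(1 for today, later in pairs if today == later)
    return hits / len(pairs)


# Hypothetical ten-day record, invented for this example.
weather = ["sun", "sun", "rain", "rain", "rain",
           "sun", "sun", "cloud", "cloud", "sun"]

one_day = persistence_accuracy(weather, horizon=1)
three_days = persistence_accuracy(weather, horizon=3)
print(f"1-day horizon: {one_day:.2f}, 3-day horizon: {three_days:.2f}")
```

On this invented record the one-day guess is right more often than the three-day guess, which is all the sketch is meant to show.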

A prediction is thin if it concentrates on a few aspects of a predicted system. An example is predicted taste in books or music. Predicting what you will like in a new book or a new piece of music is something that can be done fairly well, but the prediction is thin and does not extend beyond its domain. It tells you nothing about who you will marry or if you will ever run for public office. A thick prediction cuts across domains and would enable the predictor to answer a broad set of questions about you, covering the majority of your actions over the prediction horizon.

There is another concept that we need as well: prediction resolution. The resolution of a prediction is about the granularity of the prediction. There is a difference between predicting that you will like Depeche Mode and predicting that you will like their third album more than the fourth, or that your favorite song will be “Love in itself”. As resolution goes down, prediction becomes easier and easier. The extreme case is the Keynesian quip: in the long run we are all dead.

So, let’s go back to the question about the data set. It obviously would be different if a small data set allowed for a thick, granular prediction across a long horizon or if that same data set just allowed for a short horizon, thin prediction with low resolution. When someone says that they can predict you, you need to think about which one it is – and then the next question becomes whether it is better if it takes a large data set to do the same.

Here is a possibility: maybe we can be relaxed about thin predictions over short horizons with low resolution based on small data sets (let’s call these a-predictions), because these will not affect autonomy in any way. But thick predictions over long horizons with high resolution, based on very large data sets, are more worrying (let’s call these b-predictions).
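One way to keep the four dimensions apart is to write them down as a small data structure. The Python sketch below is only an illustration: the `Prediction` class, the `classify` function and every threshold in it (one domain, seven days, a thousand data points) are assumptions made up here, not anything the distinction itself fixes. It also makes visible that many predictions are neither a nor b, but a mix.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    domains: int        # how many areas of life it covers (thin = few)
    horizon_days: int   # how far ahead it claims to reach
    resolution: str     # "low" or "high" granularity
    data_points: int    # size of the data set it is based on


def classify(p, max_thin_domains=1, short_horizon_days=7, small_data=1000):
    """Label a prediction 'a', 'b' or 'mixed'. Thresholds are arbitrary."""
    flags = (
        p.domains <= max_thin_domains,         # thin
        p.horizon_days <= short_horizon_days,  # short horizon
        p.resolution == "low",                 # low resolution
        p.data_points <= small_data,           # small data set
    )
    if all(flags):
        return "a"
    if not any(flags):
        return "b"
    return "mixed"


# Hypothetical examples, invented for this sketch.
music_taste = Prediction(domains=1, horizon_days=1,
                         resolution="low", data_points=200)
life_course = Prediction(domains=10, horizon_days=3650,
                         resolution="high", data_points=10_000_000)
long_but_thin = Prediction(domains=1, horizon_days=3650,
                           resolution="low", data_points=200)

for p in (music_taste, life_course, long_but_thin):
    print(classify(p))
```

The "mixed" label is where the interesting questions sit, for example whether a thin prediction with a long horizon should worry us at all.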

Here are a few possible hypotheses about these two classes of predictions.

The possibility of a-predictions does not imply the possibility of b-predictions.

Autonomy is not threatened by a-predictions, but by b-predictions.

The cost of b-predictions is greater than the cost of a-predictions.

Aggregated a-predictions do not become b-predictions.

a-predictions are necessary in a market economy for aggregated classes of customers.

a-predictions are a social good.

a-predictions shared with the predicted actor change the probability of the a-predictions.

There are many more possible hypotheses worth examining and thinking about here, but this suffices for a first exploration.

b-predictions will become the basis of “nudging” in advertising, education, healthcare, and (hmmmm) politics. This is where we will need algorithmic auditing and transparency and a competitive market in order to be able to make our own decisions about which direction we want to be nudged. My smartwatch already knows when I need to get up and do some exercising.

I guess a fear I would have is that the a-predictions would be used to wall me off from other experiences that could potentially serve to broaden my horizons. Music and books might be, as you point out, a restricted domain that does not extend over others, but it could still be an integral part of my life that I would rather have broadened than just iteratively deepened. On the other hand, that kind of deepening might just deter me from using the recommendation service, as I know Ben Ratliff has pointed out with regard to Spotify.

Also, I’m curious about what the theoretical limits for b-predictions would be. I’m at the moment reading Philip Tetlock’s “Superforecasting”, and it is clear that the more specific information you have about the event whose possibility you’re trying to forecast, the better. Can broad prediction really be done with high resolution, or does it necessitate lower resolution as the “domain of prediction” grows wider?
