What’s the Matter with Data?

There may be some merit to this comparison. Both are ‘raw’ materials which are used to produce outcomes, be they plastics, fuel or targeted advertising, to name a few.

But this comparison fails because oil is finite, while data are not.

Firstly, where oil is consumed in the production process, data can be endlessly reused (arguably, the only way they degrade is by becoming out of date).

Secondly, because we are constantly generating new information, and collecting more of it in turn, the only bounds on our access to data are our imaginations and the laws we put in place.

Data clearly are valuable, with the five biggest corporations in the world (by market capitalisation) all working in the areas of tech, data, and computer science. However, data are valuable in an unusual way. Firstly, if data are limitless, then we might suppose the value of data should be zero.

Another reason why data are not like oil is that data are less fungible (substitutable) than oil or, to take another example, money.

Money is an incredibly fungible asset because the value of money is always derived from the same guarantor. Thus, two £10 notes can be substituted for one another without causing any issue. The same is true of oil: provided the hydrocarbon molecules are the same, we usually don’t care which of the identical molecules we use.

Data, however, are not as fungible because they are usually about specific people, places and times.

For example, Facebook will probably derive more value from my data when it uses them to advertise a product to me, rather than to someone else. That’s not to say my data aren’t valuable in my absence, only that the value of data (ignoring other factors such as access) is a function of the similarity between the data’s use and the subject whom the data are about.

This is all a long-winded way of saying data are not comparable to money or oil because, unlike oil, data are not finite, and unlike money, data are not (as) fungible. The outstanding question now is: does the digital economy treat data in accordance with these principles?

The Transactional Approach to Data

I consider ideas about how data are used in the digital economy to be political theories of data.

At present, a so-called barter or neoliberal approach to data is pervasive. Legal scholar Karen Yeung and academic Jose van Dijck use the word barter to describe an arrangement where – according to van Dijck – “users provide personal information to companies and receive services in return.”

In my opinion, the concept of barter is flawed because it implies a negotiation between tech companies and users which is often absent. Equally, I worry the use of the word ‘neoliberal’ unnecessarily conflates this discussion with other debates.

Because of these concerns, I will instead use the word ‘transactional’ to describe what’s going on.

So, for clarity, the dominant political theory of data today is that users exchange their data with entities which in turn provide users with free and additional services. Classic examples include Google and Facebook, both of which provide services for ‘free’ but also capture enormous quantities of data.

This transactional concept has several problems.

For instance, if the ‘price’ of these ‘free’ services were merely access to data, the utilisation of data to produce targeted advertising demonstrates an inequity between parties (this is to say nothing of platforms like Uber or AirBnB which facilitate transactions while acquiring data, or Netflix or FitBit who have financial costs associated with their services).

It may be wise, therefore, to adjust the transactional concept to reflect the role of advertising. Yet, Facebook and Google could still advertise to users without collecting their data. If the exchange were actually advertising access for free services, data shouldn’t come into it.

If users really transact their data for free services, we might expect that users could stop using a service and receive their data back so that they may ‘shop around’ for a new service. This is the notion of interoperability – users should be able to take their data from one service to another without hindrance. Extending this rationale implies users should also be able to sell their data at a market rate instead of receiving services, and also purchase data at the same rate.

But this isn’t really what happens in the digital economy.

On the one hand, almost no one knows exactly what the likes of Facebook, YouTube or Google do with our data, including what datasets they produce. As such, the idea that we can simply take our data to another platform seems dubious. On the other hand, if we want to use a new platform, generally we just sign up and hand over new copies of our data (or rather, grant additional entities access to our information, from which data are produced). There is nowhere for us to ‘sell’ our data, and the notion of buying our data seems silly.

In the real digital economy, while at times we may appear to be engaged in a transaction, this concept quickly falls apart.

Finally, as Karen Yeung has argued in her work on ‘collective privacy,’ data are rarely wholly individual, and a lot of supposedly individual data can quickly reveal things about others who have not necessarily agreed to this transaction.

For instance, the American retailer Target famously predicted customers’ pregnancies from their shopping habits, in one widely reported case before the customer’s family was aware. Another example: in 2013, Facebook tracked non-Facebook users via third-party applications, compiling these data into user dossiers. While the company claimed all of this was the result of a bug which is now fixed, the incident demonstrates how data can be used to infer things about those who have not necessarily elected for inferences to be made. Yeung argues that because data are not solely about individuals, to characterise the digital economy as one of individuals transacting data in exchange for free services is false.

I think a good summary of all of this is provided by historian Lizzie O’Shea in Future Histories:

Companies collect data, rather than – as is often claimed – we give it away willingly. Both constructions of the process are technically true, in the sense that the collection of our data is impossible without our formal consent, but that provides only a very limited picture of the phenomenon. Consent is in no way meaningful when online spaces are designed around the expectation that it will be given and rarely offer users an active choice in how their data will be used or managed. It is as though obtaining consent were a mere formality, secondary to another purpose.

The Laissez Faire Model of Data Ownership

There are presently two outstanding issues to address.

Firstly, how can data be valuable if they are, essentially, infinite?

Secondly, if the transactional political theory of data doesn’t really mesh with the real digital economy, what alternative political theories might work better?

The first issue can be addressed by consulting history. A model of market emergence which legal scholar and data ethicist Frank Pasquale has used in discussing data and platforms argues that markets generally emerge with the closing off of public commons, or the privatisation of necessary or desired resources. While data are theoretically infinite, we still require some entity to observe actions, to collect information, and to make choices about that information.

Because there are costs to doing this (say, the cost of coding Facebook, or sending cars to map the world’s streets), any data that are collected tend to be held privately, in effect granting those who collect data a monopoly over both the use of and access to data. By charging third parties for these services (e.g. an advertising agency’s use of the data to target ads, or a third-party application’s access to data to develop its own services), data collectors can receive value.

Data become valuable because access to data becomes restricted.

I want to note two things for posterity.

Firstly, under this model it is wholly inaccurate to say platforms sell data. Platforms do not sell data; they sell access to data.

Secondly, this model is congruent with academic Nick Srnicek’s concept of platform capitalism – where platforms act as intermediaries within data flows.

Anyway, I call this model of data ownership laissez faire, in the sense that anyone can collect data, and anyone can sell access to data, within the constraints of the law (e.g. GDPR) and economics (e.g. the cost of collection).

When we consider the sale of data access by the data collector (say, a platform like Facebook transacting with a brand), the transactional political theory makes a lot of sense. As academic Shoshana Zuboff argues, in the early days of advertising on Google, the platform essentially began to act a bit like a marketing firm, selling its digital prowess.

This is potentially the side of the coin that gives the transactional political theory its legs; but as above, from a user perspective, the idea of data being transacted doesn’t really stack up.

Furthermore, if – for example – users decide there is an inequity between what they receive for their data (a free service) and what platforms provide them with (a free service, and targeted advertising) and demand more, this would in turn demand a change in how value flows between users, platforms and third parties.

In other words, if we change the political theory of data, we may also have to depart from the laissez faire model of data ownership. In my next blog post for this series, then, I will consider alternative political theories of data and turn my attention to two emerging models of data ownership: data trusts, and data commons.