You Don't Even Need to Be on Social Media: The Companies Still Have Data on You

PETER DOCKRILL

26 JAN 2019

We've all thought about it. Maybe it's time to flick Facebook. Terminate Twitter. Silence social for good, and just be a person again.

Sadly, if this dream of going off the grid is about reclaiming your lost privacy, that might not actually be possible, according to new research.

A new study by researchers at the University of Vermont shows that social media posts by people you're connected with can actually be used to predict your own future posts – and even more accurately than if your own previous posts were being mined for insights.

"As few as 8-9 of an individual's contacts are sufficient to obtain predictability compared with that of the individual alone," the authors write in their paper.

"Our results have distinct privacy implications: information is so strongly embedded in a social network that, in principle, one can profile an individual from their available social ties even when the individual forgoes the platform completely."

To analyse the hypothetical predictability of the social echo chamber, the team culled over 30 million public tweets from 13,905 Twitter users.

This massive dataset was fed through computer systems called information-theoretic estimators – a form of machine learning that sifted through language data in the posts while accounting for the temporal order of user activities.

Using the data, the researchers identified 927 'ego networks', each representing one user (the ego) and their 15 most frequently mentioned Twitter contacts (the alters).
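For readers who want to see what building an ego network might involve, here is a minimal sketch in Python. It assumes mentions appear as @handles in the tweet text and that the alters are simply the handles the ego mentions most often; the function name `ego_network` and the sample tweets are hypothetical and are not taken from the study.

```python
import re
from collections import Counter

# Assumption: a mention is an "@" followed by word characters.
MENTION = re.compile(r"@(\w+)")

def ego_network(ego_tweets, k=15):
    """Return the k handles mentioned most often in the ego's tweets."""
    counts = Counter()
    for text in ego_tweets:
        # Lower-case handles so "@Ben" and "@ben" count as one contact.
        counts.update(handle.lower() for handle in MENTION.findall(text))
    return [handle for handle, _ in counts.most_common(k)]

# Made-up tweets, for illustration only.
tweets = [
    "coffee with @ana and @ben",
    "@ana did you see this?",
    "thanks @Ben and @cy",
    "@ana again!",
]
alters = ego_network(tweets, k=2)
print(alters)
```

With the study's setting of k=15, each ego would end up with a list of up to 15 alters ranked by how often they are mentioned.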

Within these ego networks, the estimators – once trained up on all the public tweets previously posted by the ego and the alters – were able to predict what the ego would write next about 60 percent of the time.
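The article does not say which estimator the team used, so the following is only a rough sketch of how a predictability figure like this could be computed. It assumes a Lempel-Ziv style entropy-rate estimate over a sequence of words, converted to an upper bound on prediction accuracy with Fano's inequality. The function names `match_length`, `entropy_rate` and `predictability` are hypothetical, and the researchers' actual pipeline may differ.

```python
import math

def match_length(seq, i):
    """Length of the shortest run starting at i that is not found in seq[:i]."""
    n = len(seq)
    longest = 0
    for j in range(i):
        k = 0
        # Assumption: matches are taken only from the history before i.
        while i + k < n and j + k < i and seq[j + k] == seq[i + k]:
            k += 1
        longest = max(longest, k)
    return longest + 1

def entropy_rate(seq):
    """Lempel-Ziv style entropy-rate estimate, in bits per word."""
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two words")
    total = sum(match_length(seq, i) for i in range(n))
    return n * math.log2(n) / total

def predictability(h, vocab_size):
    """Solve Fano's inequality h = H(p) + (1 - p) * log2(N - 1) for p."""
    def fano(p):
        if p in (0.0, 1.0):
            binary = 0.0
        else:
            binary = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
        return binary + (1 - p) * math.log2(vocab_size - 1)

    # fano() falls from log2(N) at p = 1/N to 0 at p = 1, so bisect on p.
    lo, hi = 1.0 / vocab_size, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if fano(mid) > h:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Made-up word sequence, for illustration only.
words = "good morning all good morning again good morning all".split()
h = entropy_rate(words)
print(h, predictability(h, len(set(words))))
```

The more a sequence repeats itself, the longer the matches, the lower the entropy estimate and the higher the predictability bound. Estimating predictability from the alters alone would mean searching for matches in the alters' earlier text rather than the ego's own, which this sketch does not attempt.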

That figure might not sound too scary, but it's far better than chance, reflecting the huge amount of personal information people unintentionally reveal on social networks.

The team says 60 percent is the current prediction limit of their method. But amazingly, once you take the ego's previous tweets out of the dataset – and only look at what the alters are saying – the predictability drops only to around 57 percent.

In other words, even if you're not on a particular social media platform at all, machines can still piece together your so-called 'shadow profile', fuelled by the chatter and buzz of your nearest and dearest.

According to computer scientist David Garcia from the Medical University of Vienna (who studies these kinds of things himself but was not involved in this particular study), these awe-inducing predictive abilities change the game entirely when it comes to protecting privacy.

"While we know that the information of an individual can be used to predict personal attributes and can be at odds with privacy regulations, the possibility of shadow profiles and social inference at scale points to a much larger problem," Garcia writes in a commentary on the new study.

"We need to stop thinking about individual privacy control and switch to a paradigm of networked privacy that takes into account that the decision to keep information private is affected by the decisions of others."