Wednesday, January 4, 2017

Collection, surveillance, analysis, prediction: there are reasons why the battle between freedom and slavery will take place on the Internet. Only in the past decade did big data enter the headlines, because the necessary hardware and storage capacity became affordable for corporations. In addition, governmental and corporate data crunching capability improved to enable what panelists at Financier Worldwide call "curation ... of enormous data sets" and "the ability to predict when a certain business-contextual event is about to happen, and then to adjust accordingly in an automated fashion."

Few people read the fine print when they sign up for social media accounts, so they do not understand how others now own their personal identities and seek to decide their fates. Nor do they understand how the Internet of Things forms a network of physical objects around them to glean and mobilize information. From Radio New Zealand:

"I was on Facebook recently and I realised they were showing me a photo that wasn't already on my newsfeed and that I wasn't even tagged in, that had come from my camera roll."

"Today virtually everything we do is monitored in some way. The collection, analysis and utilization of digital information about our clicks, swipes, likes, purchases, movements, behaviors and interests have become part of everyday life. While individuals become increasingly transparent, companies take control of the recorded data."

Mozilla, developers of the Firefox browser, developed Lightbeam so you can see who is tracking you while you browse. Privacy Lab has made available online a 2016 book by Wolfie Christl and Sarah Spiekermann: Networks of Control: A Report on Corporate Surveillance, Digital Tracking, Big Data and Privacy (Hat tip and thanks: Janine Römer). The book explains how social control through big data actually works, and it is far more evil, insidious and Darwinian than one would imagine, because algorithms target individuals' socio-economic performance in life to create new kinds of discrimination. When you state what you are doing or thinking on Facebook or Twitter, when you surf the Web, when you buy things, travel, or read certain news stories, you are letting the world know how successful you are or are not, by other people's mechanized standards:

"Today, a vast landscape of partially interlinked databases has emerged which serve to characterize each one of us. Whenever we use our smartphone, a laptop, an ATM or credit card, or our ‘smart’ TV sets detailed information is transmitted about our behaviors and movements to servers, which might be located at the other end of the world. A rapidly growing number of our interactions is monitored, analyzed and assessed by a
network of machines and software algorithms that are operated by companies we have rarely ever heard of. Without our knowledge and hardly with our effectively informed consent, our individual strengths and weaknesses, interests, preferences, miseries, fortunes, illnesses, successes, secrets and – most importantly – purchasing power are surveyed. If we don’t score well, we are not treated as equal to our better peers. We are categorized, excluded and sometimes invisibly observed by an obscure network of machines for potential misconduct and without having any control over such practices.

While the media and special interest groups are aware of these developments for a while now, we believe that the full degree and scale of personal data collection, use and – in particular – abuse has not been scrutinized closely enough. This is the gap we want to close with the study presented in this book."

Thus, the debate around big data focuses on post-2013, post-Snowden ideas: privacy or anonymity; predictive marketing; social control; totalitarianism. Yet Utopia or Dystopia recognizes that big data are so superhuman in quantity that they blur reality:

"Big Data; does it actually provide us with a useful map of reality, or instead drown us in mostly useless information? ... [D]oes Big Data actually make us safer? ... [H]ow is the truth to survive in a world where seemingly any organization or person can create their own version of reality. Doesn’t the lack of transparency by corporations or the government give rise to all sorts of conspiracy theories in such an atmosphere, and isn’t it ultimately futile ... for corporations and governments to try to shape all these newly enabled voices to its liking through spin and propaganda?"

Instead of big data driving fears of exploitation and totalitarianism, this concern revives far older contests between rationality and the unknowable.

Bodies of big data are so big that they become a kind of big mind, a combined collective consciousness and collective unconscious. To account for virtual reality by known means is impossible. Academic history as we knew it, 15 years ago, cannot now be written according to traditional methods and new methods must be developed. The body of data is: (a) too vast to be processed by a human; (b) unfixed: potentially subject to infinite alteration; and (c) stored in languages and on devices which rapidly become obsolete.

The same goes for the social sciences. Try to analyze the online kekkism in the recent American election and be prepared to confront something akin to magic which will defy current theories. The great modern experiment to rationalize the world breaks down in the face of anti-rationality, hacking, and Underground cryptics, whether by anonymity and encryption, or by mysterious forms of communication, behaviour and awareness, which will surpass knowledge and understanding. Big data erode reality, and this is why the ISIS publicity bureau and magazine can promote an apocalyptic eschatology unironically in this day and age. When you are operating in an environment where X zillion bits of data are being created every second, an apocalypse seems appropriate to some, and makes more sense.

"When it comes to human activities, algorithms are expected to be models of objectivity, owing to their basis in mathematical formulae and reliance on enormous quantities of measured facts about a given general population, whether students or teachers, job applicants or criminal defendants. Cathy O’Neil makes the case that real-world mathematical models are anything but objective. ... [S]he asserts that big data WMDs are opaque, unaccountable and destructive and that they essentially act as unwritten and unpublished secret laws."

"[The Panama Papers] should ... serve as a stark reminder of the hidden value sitting locked in large amounts of unstructured data, such as notes, documents and emails.

In recent years, we’ve seen businesses in many industries solve the puzzle of big data and begin to extract the insights that can accelerate innovation and grow revenue. Healthcare, finance and retail are three that immediately come to mind that are at the forefront of using big data. But that is only the beginning.

Consider this: 90 percent of the world’s data only came into existence in the last two years. With more of our lives moving online and into the cloud, this remarkable growth of data will only accelerate, offering enormous possibilities to the businesses that can navigate these massive data collections.

The Panama Papers are a roadmap. It is now possible to collect and analyze data faster than ever before through the use of unparalleled computing power and machine learning methods, such as deep learning. Unstructured data, such as the text in the posts and messages of social media that most of the world uses, emails that were leaked or subpoenaed, laboratory notes or technical documentation, represent a massive opportunity for businesses that can harness it. ...

Andy Grove, retired CEO of Intel Corp., calls this moment in potential growth a 'strategic inflection point' — the point at which two major pathways temporarily coincide — between doing business as usual, or embracing and adapting to the new."

Caption for the above image: "The volume, variety and velocity of data coming into your organization continue to reach unprecedented levels. This phenomenal growth of data requires that you gain valuable insights from your big data, regardless of where it is stored. It is becoming critical for organizations to seamlessly report, analyze, and monitor data which is unstructured in various forms like text, audio, video from multiple sources such as relational data warehouses, datamarts, multi-dimensional databases, web services, internet, social media and Salesforce.com and get the best-in-class business insights with the highest performance. We provide strategic advice that includes organizational assessment, defining big data strategy, requirements analysis, platform and tool selection, architecture design, application design and development, testing and deployment."

I suggest that Andy Grove did not fully understand that big data's strategic inflection point, the convergence between old and new ways of doing things, could mean something different from what he intended.

Initially, sure: a big data promoter would say that big data no longer comprise a huge body of information which exploits old privacy boundaries. Big data create potentials in new societies and in globalized, programmable economies. In this view, potentials will transcend our understanding of conventional society, economics, health, and politics, to create exciting new modes of existence.

In other words, when one discusses big data, one sees conflicting opinions. Some see totalitarianism. Some see Darwinian social media misrule. Some see surveillance and intrusion. Some see profit and opportunity. All agree that the way we live is transforming beyond recognition. I argue that there are moderating considerations and historical continuities at work.

There are moderating considerations which emerge from new ways of being, living and working in relation to big information. First, there is an idea of collectivism, of being a tiny part of something enormous, of being a droplet in the cloud. Information is one of several mechanisms or structures used to build frameworks of control around the individual and around society. Nevertheless, those mechanisms depend on larger, natural cultural expressions and practices. A society will always be bigger than its instruments of control, as will the ways people accept their positions in society. Destroy individuals' incentives to conform by mechanizing those incentives through big data, and societies will evolve beyond big data.

Another moderating consideration is whether we might fall out of love with technology, or begin to treat it differently. We will only enable big data insofar as we indulge the 18th century obsession with rationalizing the world. It is possible to reevaluate that Enlightenment value system. That is not necessarily an anti-rational or anti-tech argument. We can reexamine rationalism and stop automatically accepting its underlying moral and emotional dictates; we can stop automatically assuming that rationalism equates to science and technology. These are blind spots.

We can uphold rationalism with greater logical consistency and integrity. For example, apps which support big data collection ironically depend on technophiles' technological ignorance. Teach people programming, and they might be more mindful when they use technology. I have previously written on the way Apple dispensed with technological knowledge as a prerequisite for tech use. Steve Jobs intentionally turned gadgets into intimate prostheses and semi-magical sex toys, which fed his creed of egotism: the iPad, iPhone, iPod, etc. Me, me, me, me and my technology. Technology did not and does not have to be about egotism, self-indulgence, élitism, and mesmerizing, pacifying self-deception and self-enslavement, a real world lost in virtual mirrors. Alternatives to this vision were developed in the past, and could be attempted again. See my related post: Farewell and Hello, Commodore.

In addition, the evolutionary time frame of the big data ecosystem is longer than expected. To analyze big data, one might partly ignore the narratives of technology and newness. A working group at the Max Planck Institute for the History of Science, Historicizing Big Data, including Elena Aronova, Christine von Oertzen, and David Sepkoski, maintains that the love of the 'bigness' of information is not new. It runs back two centuries:

"Since the late 20th century, huge databases have become a ubiquitous feature of science, and Big Data has become a buzzword for describing an ostensibly new and distinctive mode of knowledge production. Some observers have even suggested that Big Data has introduced a new epistemology of science: one in which data-gathering and knowledge production phases are more explicitly separate than they have been in the past. It is vitally important not only to reconstruct a history of 'data' in the longue durée (extending from the early modern period to the present), but also to critically examine historical claims about the distinctiveness of modern data practices and epistemologies. ...

We take for granted, for example, that a history of data depends on ... the practices and technologies that support it: not only are epistemologies of data embodied in tools and machines, but in a concrete sense data itself cannot exist apart from them. This precise relationship between technologies, practices, and epistemologies is complex. Big Data is often, for example, associated with the era of computer databases, but this association potentially overlooks important continuities with data practices stretching back to the 18th century and earlier. The very notion of size—of 'bigness'—is also contingent on historical factors that need to be contextualized and problematized. We are therefore interested in exploring the material cultures and practices of data in a broad historical context, including the development of information processing technologies (whether paper-based or mechanical), and also in historicizing the relationships between collections of physical objects and collections of data. ...

The term 'Big Data' invokes the consequences of increasing economies of scale on many different levels. It ostensibly refers to the enormous amount of information collected, stored, and processed in fields as varied as genomics, climate science, paleontology, anthropology, and economics. But it also implicates a Cold War political economy, given that many of the precursors to 21st century data sciences began as national security or military projects in the Big Science era of the 1950s and 1960s. These political and cultural ramifications of data cannot be separated from the broader historical consideration of data-driven science.

Historicizing Big Data provides comparative breadth and historical depth to the on-going discussion of the revolutionary potential of data-intensive modes of knowledge production and the challenges the current 'data deluge' poses to society."

Another moderating thread is a theme of control, in which industry experts observe information being exponentially generated, and they can almost taste the potentials. But they cannot invent analytical systems quickly enough to capitalize on those potentials and indulge the temptation to exploit the information. Bodies of data grow faster than the mechanisms designed to profit from them. In short, there is a point at which the big data capitalist, big data spy, or big data manipulator, must compromise. From this gap around potentials and desires arises talk of models, estimators, and trade-offs, as mentioned by Alekh Agarwal at UC Berkeley in 2012:

"The past decade has seen the emergence of datasets of an unprecedented scale, with both large sample sizes and dimensionality. Massive data sets arise in various domains, among them computer vision, natural language processing, computational biology, social networks analysis and recommendation systems, to name a few. In many such problems, the bottleneck is not just the number of data samples, but also the computational resources available to process the data. Thus, a fundamental goal in these problems is to characterize how estimation error behaves as a function of the sample size, number of parameters, and the computational budget available."

These themes betray an underlying continuity. If big data are not a product of technological newness but the latest manifestation of a historical process of modern societal evolution, and if big data concern how that evolution is subject to changing modes of control, then big data are part of an ongoing story of how cultural change relates to political control.

Big data's strategic inflection point is the point at which ways of being in relation to big information respond to control, and then slip just outside that control's crushing, mechanized frameworks and conformist boundaries. It is a moment of irony, the point at which our computing action transcends our computing power. And that is the point where potentials are achieved beyond the impulse and ability to grasp what they mean.

Clip from BBC's One Life (2011). Reproduced under Fair Use. Video Source: YouTube.

Big data's strategic inflection point reminded me of one of my favourite recent nature documentaries, BBC's One Life (2011). The best moment in that documentary, in my opinion, was the bit about the pebble toad (Oreophrynella nigra). The pebble toad has evolved to spend all its time climbing up mountains with little suction cup feet, except when it is about to be eaten by a giant tarantula, the local predator. When in danger, the pebble toad lets go, curls up in a ball, and rolls all the way back down the mountain. This resembles the elusiveness of human life inside the matrix, and reminds us that no matter how technologically defined we are, we are still natural beings. There will always be something about human beings which will fall away, just as the algorithms are about to clamp shut. As if in innate recognition of this, artists have begun to make pieces about big data and surveillance (below).

Caption for the above image: "This installation is one of several in the collection Instagram Cities by Damon Crockett, where photographs that were posted on Instagram, are divided into 16X16 pixel parts and organized by average hue and brightness. The collection gives an insight into cultural life in the individual cities, such as which metropolis has a high frequency of nighttime photography."
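The procedure the caption describes, cutting photographs into 16x16 pixel parts and ordering them by average hue and brightness, might be sketched roughly as follows. This is an illustration only and not Crockett's actual code. It assumes an image is already available as a grid of (r, g, b) tuples with values from 0 to 255, and the function names are hypothetical.

```python
import colorsys


def split_into_tiles(pixels, tile=16):
    """Cut a 2-D grid of (r, g, b) pixels into tile x tile blocks."""
    height, width = len(pixels), len(pixels[0])
    tiles = []
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            block = [pixels[y][x]
                     for y in range(top, top + tile)
                     for x in range(left, left + tile)]
            tiles.append(block)
    return tiles


def average_hue_and_brightness(block):
    """Return (hue, brightness) of a block's mean colour, each in [0, 1]."""
    n = len(block)
    r = sum(p[0] for p in block) / n / 255
    g = sum(p[1] for p in block) / n / 255
    b = sum(p[2] for p in block) / n / 255
    hue, _saturation, value = colorsys.rgb_to_hsv(r, g, b)
    return hue, value


def sort_tiles(tiles):
    """Order tiles by average hue first, then by brightness."""
    return sorted(tiles, key=average_hue_and_brightness)
```

Sorting on the pair (hue, brightness) is one possible reading of "organized by average hue and brightness". The installation may well lay the two properties out along separate axes instead.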

The Atlantic (8 April 2016): How Big Data Harms Poor Communities: Surveillance and public-benefits programs gather large amounts of information on low-income people, feeding opaque algorithms that can trap them in poverty

IB Times (26 April 2016): How China uses mass surveillance and big data snooping to curb social unrest

Financier Worldwide (April 2016): FORUM: Use of Big Data and data analytics as part of a risk management strategy

BBC (18 August 2016): US ready to 'hand over' the internet's naming system

The Economist (30 September 2016): Why is America giving up control of ICANN?

Independent (22 October 2016): China wants to give all of its citizens a score – and their rating could affect every area of their lives: The Communist Party wants to encourage good behaviour by marking all its people using online data. Those who fall short will be denied basic freedoms like loans or travel

techdirt (27 October 2016): Alibaba's Boss Says Chinese Government Should Use Big Data Techniques On Its 'Citizen Scores' Surveillance Store

Independent (31 December 2016): Investigatory Powers Act goes into force, putting UK citizens under intense new spying regime: "From the end of 2016, every British citizen is living under spying powers that have been deemed 'world-leading – but only as a beacon for despots everywhere'"

QUEEN'S UNIVERSITY SEMINAR SERIES, Surveillance Studies Centre, Queen's University, Ontario, Canada (October 2016 - March 2017): A lunchtime seminar series showing how Big Data is used, and debated, on campus. Related: "In the world of big data surveillance, huge amounts of data are sucked into systems that store, combine and analyze them, to create patterns and reveal trends that can be used for marketing, and, as we know from former National Security Agency (NSA) contractor Edward Snowden’s revelations, for policing and security as well. This project views neither big data nor surveillance as ‘good’ or ‘bad,’ but nor are they neutral. Big data surveillance has consequences, opening opportunities and shutting them down. The once-limited leaks of personal data have rapidly become a torrent and opting-out is less and less possible. Big data promises to further transform the ways that information and power are intertwined. Today, vast datasets of personal information are assembled and analyzed in unprecedented ways and novel domains. They prompt fresh queries about privacy, social sorting and civil liberties. Enthusiasm for big data techniques and practices has opened the door to mass surveillance as the main means of monitoring and tracking populations in order to manage and influence them."