K/V5: Dark Data — Absence and Intervention

Concept note for the fifth workshop

The fifth meeting of the Knowledge / Value series was held at the University of Exeter (UK), on 15–16 December 2014, and explored the intersections between knowledge, value and dark data. Current discourse around data, and particularly scientific data and ‘big data’, is infused with the importance of the available, the pre-existing, the present. Data are givens, things that are and thus can be used as evidence; they are also tangible goods, the result of investments and labour, which need to be spread and used to improve human life and understanding. As one delves into actual ongoing attempts to handle, visualise, disseminate and interpret data, however, one realises that absence is at least as conspicuous as presence, and that it comes in different forms. Data are often missing, incomplete, unreliable, unobtainable, ignored or untagged. They can be hard to capture, store, perceive and disseminate, depending on their format, the available technology and the degree of commitment and capital underlying these efforts. And sometimes, rather than providing evidence for what is there, they provide evidence for what is not.

Within this conference, we focused on the absences of data. The dark side of evidence — that which is not there, not readily available, not usable to prove claims or foster discoveries — forces us to confront questions concerning what does not constitute formalized knowledge: what is tacit, ignored, denied, forbidden, private, inaccessible, unknown and/or unexplored. While several scholars have broached aspects of this issue in relation to biotechnology, biomedicine and the bioweapons industry, ranging from studies of ignorance (McGoey 2012, Proctor and Schiebinger 2008) to studies of what is impossible to know (Gross 2012, Wynne 1992) or is inaccessible or secret (Rappert 2009, Balmer 2012), there had been no attempt to theorize the dark side of data, with their political, scientific, psychological, institutional and ethical origins and implications, in a way that takes account of all these different aspects and brings them into systematic dialogue with one another.

We also hoped to explore the various resonances acquired by the notion of ‘dark data’ in contemporary public discourse. For instance, as the data imprints left by individuals and institutions grow, how firms, corporations, governments, public health practitioners and others can mobilise and capitalise on data already held but not accessible is a matter of significant importance. In this sense, to bring data to light is to realise their value, raising questions about the constitution of the infrastructures, institutions and social practices that govern consent, valuation and distribution. Similarly, the threats and opportunities of the so-called ‘dark internet’ signal the ways in which data are used, disseminated and valued within spaces whose institutional and social legitimacy is in question. By addressing the dark spaces of data, the conference attempted to conceptualise knowledge / value within a variety of scientific and technological contexts, as well as to explore methodologies for intervening in these contexts of absence: how are the blank spots in emerging data landscapes understood and evoked, and what are the implications of making overlooked data and invisible practices more legible?

Aims and outline

The main aims of the conference, titled “Knowledge/Value and Dark Data: Absences, Interventions and Digital Worlds”, were to:

Conceptualize the meaning and place of 'dark data' in science and, through comparative exploration of dark data and of what is purposely left unknown or left out in the production and use of data, distinguish between dark data, data absences, silences, negatives and the other dark matters of knowledge production;

Theorize ‘dark data’ as a general problematic of interest to the study of science and technology, especially in contexts of capitalization and securitization, and the production of more open and engaged relations between science and society;

Explore the potential to mobilize non-knowledge or the absence of knowledge in relation to action, exploring how non-knowledge can be acknowledged and attended to in ethnographic methods, experimental encounters, and the archiving of research practices and data in the arts, sciences and social sciences;

Reflect on alternatives to calls for social scientists to join the rush to mine data, fill data gaps or capitalise on dark data, and ask how we might rethink the implied relation between knowledge, value and absence.

The meeting comprised written papers and presentations open to all, with discussants leading discussion of the conceptual and methodological challenges of this topic of study. The workshop also featured the work of artist Neal White, to assist us in visualizing and staging relations, positions and concepts; as the very idea of ‘non-knowledge’ suggests, the issues at stake can be hard to capture in positive, linguistic terms. The conference was followed by a workshop for paper contributors, discussants and organisers exploring the links between this topic and the larger Knowledge / Value network.

Participants

The conference built on work at the University of Exeter, such as that by Sabina Leonelli, Brian Rappert, Steve Hinchliffe and Gail Davies, focusing on current policy and practitioner discussions about scientific data, ‘big data’ and questions of absence and the unknown. It included contributions from international collaborators. The participants included:

Rachel Ankeny (History and Politics, University of Adelaide)

Elena Aronova (History of Science, Max Planck Institute for the History of Science)

Brian Balmer (Science and Technology Studies, University College London)

Workshop Programme

Monday 15th December

9.30 – 10.00 Introductions

Gail Davies

Sabina Leonelli

Brian Rappert

Kaushik Sunder Rajan

10.00 – 11.00 Paper session 1

Alison Wylie: Old Data Made New Again: How Archaeological Evidence Bites Back

Archaeological data are notorious for their gaps and absences, but they are not unique in this. The challenges of working with trace evidence are characteristic of the historical sciences generally, and they arise not just from the fragmentary, enigmatic nature of the data themselves, but from the paucity and instability of the inferential scaffolding on which we rely to interpret them as evidence. I propose to shift attention from the limitations of trace-based evidential reasoning – the focus of debate between epistemic optimists and pessimists – to an analysis of how it is that archaeologists routinely arrive at striking new insights by reanalyzing legacy data. I focus on three strategies. Most straightforwardly (philosophically, at least), innovative technical tools make it possible to extract new information from surviving traces; the scaffolding and the database are both enhanced. In addition, existing data may be reanalyzed and integrated in ways that generate “conjunctive” evidence (from Taylor 1948); a contemporary example is the use of a “pragmatic Bayesian” approach to refine radiocarbon-based chronologies. Third, and more controversially, archaeologists increasingly build and manipulate models that simulate the possibilities for and constraints on various kinds of activity within specified social-material environments. The central question here, recently raised in provocative terms by Adrian Currie (2014), is whether or in what sense these strategies fill gaps in the surviving trace record and generate new evidence.

Jennifer Cuffe: The Time of Adverse Drug Reaction Databases in Canada

As technology changes, so do the everyday routines of working with computerized databases. New work routines may bring to light certain data in a collection, while eclipsing others. This paper explores the notion of ‘dark data’ in the context of everyday work routines in mandated science (broadly understood), by tracing how public servants have reasoned with one cumulative collection of data since the late 1960s.

Since 1965, Canadian government officials have maintained a collection of reports that document suspected adverse reactions to medications. This collection has been variously computerized since 1969, and, over the years, officials have designed an increasing number of automated searches and outputs for public health protection. Each iteration of automation has reshaped the routines and rhythms of work for the public servants reasoning with the data.

While public servants have sought increasing automation through this period, they have also created collective and idiosyncratic paper-based collections of the data to supplement their use of the computerized database. I describe three such paper-based collections of data, and the public servants’ rationales for maintaining them, as a window onto their changing perspective on how some data, when computerized, became inaccessible or ‘dark’ for the purposes of reasoning activities. I argue that in each of these cases, the paper-based collection was maintained to anchor the data in an embodied present. As such, the paper explores the role of time, sequence, and pace in the creation of knowledge linked to reasoning through databases in changing technological systems.

11.00 – 11.30 Coffee/Tea break

11.30 – 13.00 Interlocutor responses and general discussion

Jim Griesemer

Sharon Traweek

13.00 – 14.00 Lunch

14.00 – 15.00 Paper session 2

Joe Dumit

Neal White

15.00 – 15.30 Coffee/Tea break

15.30 – 17.00 Interlocutor responses and general discussion

Michael Fischer

Ann Kelly

17.00 – 17.30 General Discussion

18.45 Speakers' dinner

Tuesday 16th December

9.00 – 10.30 Paper session 3

Elena Aronova: The Political Economy of Data in Geophysics during the Cold War

In this paper I will use the history of the World Data Centers – the “data archives” organized to serve the International Geophysical Year (IGY, 1957-8) – to discuss the role of secrecy in the practices of data use in the physical environmental sciences in the 1950s and early 1960s. I will argue that secrecy surrounding critical geophysical data characterizing the adversary’s territory propelled international cooperation in geophysics, while at the same time turning the IGY data into an “exchange currency” held by the two “keeper countries” of planetary geophysical data: the US and the USSR. After establishing this background, I will discuss the ways in which the boundary “gray zones” of sensitive but not secret data – zones that were constantly negotiated and renegotiated – reinforced the distinction between “data,” “information,” and “knowledge.”

Linsey McGoey: The Missing Surplus: Tracking Elusive Wealth

Writing in 1979, J.K. Galbraith claimed that “of all the classes, the wealthy are the most noticed and least studied.” Galbraith was bemoaning a long-standing problem: the fact that studying wealth and its distribution was, for many decades, neglected in the social sciences in comparison to studies of resource scarcity and poverty. On the one hand, wealth and its production have been an overarching, if not the defining, preoccupation of the social sciences since the 19th century, from Marx and Weber through theorists such as Veblen and Bourdieu. On the other hand, the wealthy themselves have typically remained inscrutable. Writing in the Annual Review of Sociology in 2000, Lisa Keister and Stephanie Moller stated the problem baldly: “Wealth has largely been ignored in studies of inequality” (2000: 64). Why did sociologists and economists turn their gaze from distributional questions during much of the 20th century? In this paper, I explore that question through a focus on the difficulty of accessing national and global data on wealth. I suggest that measurement difficulties are one reason (though perhaps not the primary reason) why, prior to Thomas Piketty and other notable exceptions, mainstream economists largely ignored the problem of wealth concentration from the mid-20th century onwards. I suggest this neglect underpins a larger political and economic shift: declining attention to the question of how to govern and distribute economic surpluses in affluent societies. To ‘rediscover’ the surplus, we need to scrutinize the elusiveness of wealth data.

Emilia Sanabria: Health education and the problem of (non)knowledge – Uncertainty, complexity and ignorance in the French obesity “epidemic”

In France, three public health initiatives were launched to tackle non-communicable diseases associated with high body mass indexes. These often center on informing eaters. In this model, people are expected to understand the information they are given and to change their behavior accordingly. This overlooks the dynamic aspects of learning, and the myriad factors that intercede between knowledge and its absence. The focus on knowing latent in health education is based on a presumption of ignorance, which overshadows the variegated forms of not-knowing that are at play in the field of public health nutrition. As the factors associated with dietary health outcomes proliferate – from the metabolic, neuroendocrine and affective to the environmental, urban, or cognitive (all native categories) – complex models are adopted to link phenomena across time. The focus on knowing also obscures the contested and disputed nature of the knowledge that is transferred, and the ways in which uncertainty may not simply be inherent to complexity but may also be produced strategically. Analyzing the ideas circulating in expert documents and French nutrition conferences and drawing on interviews with public health nutritionists, I examine the political value of knowledge and ignorance in this context. Recent work in social theory has turned attention to the study of unknowing, revealing how gaps in knowing, concealment, ignorance or doubt-mongering may be the product of active forces. My aim is to trace the relation between attributions of ignorance to individuals, on the one hand, and calls to attend to the systematic and planned impairment of the decision-making capabilities of eaters by the marketing strategies of Big Food, on the other. By bringing these variegated forms of nonknowledge together, I explore the deflections of responsibility at play between ignorant individuals and a food industry highly active in downplaying its role in the obesity “epidemic.”

Workshop materials

Proceedings and recordings from the conference will be posted on this website (sponsored by the University of Chicago), as well as on the Exeter Data Studies website (sponsored by the European Research Council and the University of Exeter). Selected conference contributions also form the basis of an upcoming journal special issue.

The organisers would like to thank the University of Exeter’s HASS strategy for funding, as well as the joint ESRC/Dstl/AHRC award (ES/K011308/1) titled 'The Formulation and Non-formulation of Security Concerns' held by Brian Rappert, Brian Balmer and Sam Evans. The organisers would furthermore like to thank the Leverhulme Trust for the award (RPG-2013-153) titled 'Beyond the Digital Divide' held by Brian Rappert, Louise Bezuidenhout, Ann Kelly and Sabina Leonelli.
