Artificial Intelligence (AI) in the form of different machine learning models is applied to Big Data as a way to turn data into
valuable knowledge. The rhetoric is that ensuing predictions work well—with a high degree of autonomy and automation. We argue that we need to analyze the process of applying machine learning in depth and highlight at what point
human knowledge production takes place in seemingly autonomous work. This article reintroduces classification theory
as an important framework for understanding such seemingly invisible knowledge production in the machine learning
development and design processes. We suggest a framework for studying such classification closely tied to different steps
in the work process and exemplify the framework on two experiments with machine learning applied to Facebook data
from one of our labs. By doing so we demonstrate ways in which classification and potential discrimination take place in
even seemingly unsupervised and autonomous models. Moving away from concepts of non-supervision and autonomy
enables us to understand the underlying classificatory dispositifs in the work process, and we argue that this form of analysis
constitutes a first step towards governance of artificial intelligence.
Keywords: Artificial intelligence | machine learning | classification | social media | Facebook | discrimination | bias

The collection and circulation of data is now a central element of a growing number of sectors of contemporary capitalism.
This article analyses data as a form of capital that is distinct from, but has its roots in, economic capital. Data collection is
driven by the perpetual cycle of capital accumulation, which in turn drives capital to construct and rely upon a universe in
which everything is made of data. The imperative to capture all data, from all sources, by any means possible influences
many key decisions about business models, political governance, and technological development. This article argues that
many common practices of data accumulation should actually be understood in terms of data extraction, wherein data is
taken with little regard for consent and compensation. By understanding data as a form of capital, we can better analyse the
meaning, practices, and implications of datafication as a political economic regime.
Keywords: Big Data | digital capitalism | value | political economy | Marx | Bourdieu

Data activism, promoting new forms of civic and political engagement, has emerged as a response to problematic aspects
of datafication that include tensions between data openness and data ownership, and asymmetries in terms of data usage
and distribution. In this article, we discuss MyData, a data activism initiative originating in Finland, which aims to shape a
more sustainable citizen-centric data economy by means of increasing individuals’ control of their personal data. Using
data gathered during long-term participant-observation in collaborative projects with data activists, we explore the
internal tensions of data activism by first outlining two different social imaginaries – technological and socio-critical –
within MyData, and then merging them to open practical and analytical space for engaging with the socio-technical
futures currently in the making. While the technological imaginary favours data infrastructures as corrective measures,
the socio-critical imaginary questions the effectiveness of technological correction. Unpacking them clarifies the kinds of
political and social alternatives that different social imaginaries ascribe to the notions underlying data activism, and
highlights the need to consider the social structures in play. The more far-reaching goal of our exercise is to provide
practical and analytical resources for critical engagement in the context of data activism. By merging technological and
socio-critical imaginaries in the work of reimagining governing structures and knowledge practices alongside infrastructural arrangements, scholars can depart from the most obvious forms of critique, influence data activism practice, and
formulate data ethics and data futures.
Keywords: Datafication | social imaginary | data activism | MyData | data ethics | socio-technical futures

This paper aims to contribute to the development of tools to support an analysis of Big Data as manifestations of social
processes and human behaviour. Such a task demands both an understanding of the epistemological challenge posed by
the Big Data phenomenon and a critical assessment of the offers and promises coming from the area of Big Data analytics.
This paper draws upon the critical social and data scientists’ view on Big Data as an epistemological challenge that stems
not only from the sheer volume of digital data but, predominantly, from the proliferation of the narrow-technological and
the positivist views on data. Adoption of the social-scientific epistemological stance presupposes that digital data are
conceptualised as manifestations of the social. In order to answer the epistemological challenge, social scientists need to
extend the repertoire of social scientific theories and conceptual frameworks that may inform the analysis of the social in
the age of Big Data. However, an ‘epistemological revolution’ discourse on Big Data may hinder the integration of
social scientific knowledge into Big Data analytics.
Keywords: Social and cultural Big Data analytics | social science | computational science | epistemological challenge | social media

Medical research data is sensitive personal data that needs to be protected from unauthorized access and unintentional
disclosure. In a research setting, sharing of (big) data within the scientific community is necessary in order to make
progress and maximize scientific benefits derived from valuable and costly data. At the same time, convincingly protecting
the privacy of people (patients) participating in medical research is a prerequisite for maintaining trust and willingness to
share. In this commentary, we will address this issue and the pitfalls involved in the context of the PEP project that
provides the infrastructure for the Personalized Parkinson’s Project, a large cohort study on Parkinson’s disease from
Radboud University Medical Center (Radboudumc), in cooperation with Verily Life Sciences, an Alphabet subsidiary.
Keywords: Big Data | GDPR compliance | informed consent | medical cohort study | polymorphic encryption | privacy by design

Making publics visible through digital traces has recently generated interest among practitioners of public engagement and scholars
within the field of digital methods. This paper presents an experiment in moving such methods into critical proximity with
political practice and discusses how digital visualizations of topical debates become appropriated by actors and hardwired into
existing ecologies of publics and politics. Through an experiment in rendering a specific data-public visible, it shows how the
interplay between diverse conceptions of the public as well as the specific platforms and data invoked resulted in a situated
affordance-space that allowed specific renderings to take shape, while disadvantaging others. Furthermore, it argues that several
accepted tropes in the literatures of digital methods ended up being problematic guidelines in this space. Among these is the
prescription to show heterogeneity by pushing back at established media logics.
Keywords: Digital methods | public engagement | pragmatism | controversy-mapping | critical proximity | multiplicity

Critical algorithm scholarship has demonstrated the difficulties of attributing accountability for the actions and effects of
algorithmic systems. In this commentary, we argue that we cannot stop at denouncing the lack of accountability for
algorithms and their effects but must engage the broader systems and distributed agencies that algorithmic systems exist
within, including standards, regulations, technologies, and social relations. To this end, we explore accountability in “the
Generated Detective,” an algorithmically generated comic. Taking up the mantle of detectives ourselves, we investigate
accountability in relation to this piece of experimental fiction. We problematize efforts to effect accountability through
transparency by undertaking a simple operation: asking for permission to re-publish a set of the algorithmically selected
and modified words and images which make the frames of the comic. Recounting this process, we demonstrate slippage
between the “complication” of the algorithm and the obscurity of the legal and institutional structures in which it exists.
Keywords: Algorithms | normativity | accountability | responsibility | mystery | detective

Ever since Big Data became a mot du jour across social fields, optical metaphors such as the microscope have surfaced
in popular discourse to describe and qualify its epistemological impact. While the persistence of optics seems to be at
odds with the datafication of vision, this article suggests that the optical metaphor offers an opportunity to reflect on
the material consequences of the modes of seeing and knowing that currently shape datafied worlds. Drawing on feminist
new materialism, the article investigates the optical metaphor as a material-discursive practice that actively constitutes
the world, as metaphors imply modes of thinking, knowing and doing that have material enactions. Expanding visual
culture theories, the notion of ‘optical unconscious’ is taken up to discuss the tensions between displacement and
persistence of optics within datafied worlds, that is, how optical vision is displaced but also mobilised and repurposed by
data-driven knowledge. In dialogue with feminist science and technology studies and speculative ethics, I suggest that the
datafication of vision offers a chance to reconceptualize the sense of sight towards a sensorial engagement with Big Data
premised on responsibility, care, and an ethics of unknowability. Within this framework, vision may be conceived
differently, perhaps not only as enhancement and control, but as generator of new possibilities. Ultimately, the article
proposes that the visual theories after which Big Data is being imagined matter not only for our understanding of Big
Data’s epistemic potential, but also for the possibility of shaping emerging data worlds.
Keywords: Optical unconscious | datafication of vision | speculative ethics | care | feminist materialism | metaphors

This article addresses the role of application programming interfaces (APIs) for integrating data sources in the context of
smart cities and communities. On top of the built infrastructures in cities, application programming interfaces make it possible to
weave new kinds of seams from static and dynamic data sources into the urban fabric. Contributing to debates about
“urban informatics” and the governance of urban information infrastructures, this article provides a technically informed
and critically grounded approach to evaluating APIs as crucial but often overlooked elements within these infrastructures.
The conceptualization of what we term City APIs is informed by three perspectives: In the first part, we review
established criticisms of proprietary social media APIs and their crucial function in current web architectures. In the
second part, we discuss how the design process of APIs defines conventions of data exchanges that also reflect negotiations between API producers and API consumers about affordances and mental models of the underlying computer
systems involved. In the third part, we present recent urban data innovation initiatives, especially CitySDK and
OrganiCity, to underline the centrality of API design and governance for new kinds of civic and commercial services
developed within and for cities. By bridging the fields of criticism, design, and implementation, we argue that City APIs as
elements of infrastructures reveal how urban renewal processes become crucial sites of socio-political contestation
between data science, technological development, urban management, and civic participation.
Keywords: Application Programming Interface (API) | infrastructure | Internet of Things (IoT) | interface design | social urban data | smart city