How can citation practices be used as a strategy to decolonize tech research? Data & Society Events Production Assistant Rigoberto Lara Guzmán and Director of Research Sareeta Amrute take up the question in their new zine.

“So, make it a habit to do a ‘badass feminist tech scholar of color’ scan on everything you write, every speech you are about to give, and all those emails you are about to answer. Ask yourself, for each topic you present, each yes or no you give to a request, where are the women of color? Who can I suggest who would be a better person than me to be the expert here? Who do I want to be in community with?”

In this article, Research Lead Madeleine Clare Elish investigates who bears the responsibility when an automated system fails.

“Just as the crumple zone in a car is designed to absorb the force of impact in a crash, the human in a highly complex and automated system may become simply a component—accidentally or intentionally—that bears the brunt of the moral and legal responsibilities when the overall system malfunctions.”

“However, there is an urgent need to understand the interdependencies between technology, infrastructure and health, and how these relationships affect Americans’ ability to live the healthiest lives possible. How can we support design, decision-making, and governance of our infrastructures in order to ensure more equitable health outcomes for all Americans?”

In this reading list, Data & Society Researcher Alexandra Mateescu and Postdoctoral Scholar Julia Ticona provide a pathway for deeper investigations into themes such as gender inequality and algorithmic visibility in the gig economy.

“This list is meant for readers of Beyond Disruption who want to dig more deeply into some of the key areas explored in its pages. It isn’t meant to be exhaustive, but rather give readers a jumping off point for their own investigations.”

Data & Society Research Analyst Melanie Penagos summarizes three blogposts that came as a result of Data & Society’s AI & Human Rights Workshop in April 2018.

“Following Data & Society’s AI & Human Rights Workshop in April, several participants continued to reflect on the convening and comment on the key issues that were discussed. The following is a summary of articles written by workshop attendees Bendert Zevenbergen, Elizabeth Eagen, and Aubra Anthony.”

How will the introduction of AI into the field of medicine affect the doctor-patient relationship? Data & Society Fellow Claudia Haupt identifies some legal questions we should be asking.

“I contend that AI will not entirely replace human doctors (for now) due to unresolved issues in transposing diagnostics to a non-human context, including both limits on the technical capability of existing AI and open questions regarding legal frameworks such as professional duty and informed consent.”

On April 26-27, Data & Society hosted a multidisciplinary workshop on AI and Human Rights. In this Points piece, Data + Human Rights Research Lead Mark Latonero and Research Analyst Melanie Penagos summarize discussions from the day.

After the Cambridge Analytica scandal, can internet data be used ethically for research? Data & Society Postdoctoral Scholar Kadija Ferryman and Elaine O. Nsoesie, PhD, of the Institute for Health Metrics and Evaluation, recommend “proceeding with caution” when it comes to internet data and precision medicine.

“Despite the public attention and backlash stemming from the Cambridge Analytica scandal — which began with an academic inquiry and resulted in at least 87 million Facebook profiles being disclosed — researchers argue that Facebook and other social media data can be used to advance knowledge, as long as these data are accessed and used in a responsible way. We argue that data from internet-based applications can be a relevant resource for precision medicine studies, provided that these data are accessed and used with care and caution.”

“How did the choices made by only 270,000 Facebook users affect millions of people? How is it possible that the estimate of those affected changed from 50 million to 87 million so quickly? As a professor of data policy, I am interested in how information flows within organizations. In the case of Facebook and Cambridge Analytica, I was curious why this number was so inexact.”

“I get that many progressive communities are panicked about conservative media, but we live in a polarized society and I worry about how people judge those they don’t understand or respect. It also seems to me that the narrow version of media literacy that I hear as the “solution” is supposed to magically solve our political divide. It won’t. More importantly, as I’m watching social media and news media get weaponized, I’m deeply concerned that the well-intended interventions I hear people propose will backfire, because I’m fairly certain that the crass versions of critical thinking already have.”

“So why should we be worried about rules that require caregivers to provide an electronic verification of the labor provided to clients? Because without careful controls and ethical design thinking, surveillance of caregiver labor is also functionally surveillance of care recipients, especially when family members are employed as caregivers.”

“In that moment, I realized that this community of Evangelical Christians were engaged in media literacy, but used a set of reading practices secular thinkers might be unfamiliar with. I’ve seen hundreds of Conservative Evangelicals apply the same critique they use for the Bible, arguably a postmodern method of unpacking a text, to mainstream media — favoring their own research on topics rather than trusting media authorities.”

As data becomes more prevalent in the health world, Data & Society Postdoctoral Scholar Kadija Ferryman urges us to consider how we will regulate its collection and usage.

“As precision medicine rushes on in the US, how can we understand where there might be tensions between fast-paced technological advancement and regulation and oversight? What regulatory problems might emerge? Are our policies and institutions ready to meet these challenges?”

“As genetic risk and other health data become more widely available, insights from research and early clinical adoption will expand the growing and data-centric field of precision medicine. However, just like previous forms of medical intervention, precision medicine aims to enhance life, decrease risk of disease, improve treatment, and though data plays a big role, the success of the field depends heavily upon clinician and patient interactions.”

Artificial intelligence is increasingly being used across multiple sectors and people often refer to its function as “magic.” In this blogpost, D&S researcher Madeleine Clare Elish points out how there’s nothing magical about AI and reminds us that the human labor involved in making AI systems work is often rendered invisible.

“From one perspective, this makes sense: Working like magic implies impressive and seamless functionality and the means by which the effect was achieved is hidden from view or even irrelevant. Yet, from another perspective, implying something works like magic focuses attention on the end result, denying an accounting of the means by which that end result was reached.”

D&S founder and president danah boyd sings the praises of Virginia Eubanks’s new book Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor.

“This book should be mandatory for anyone who works in social services, government, or the technology sector because it forces you to really think about what algorithmic decision-making tools are doing to our public sector, and the costs that this has on the people that are supposedly being served. It’s also essential reading for taxpayers and voters who need to understand why technology is not the panacea that it’s often purported to be. Or rather, how capitalizing on the benefits of technology will require serious investment and a deep commitment to improving the quality of social services, rather than a tax cut.”

This is a transcript of Data & Society founder and president danah boyd’s recent lightning talk at The People’s Disruption: Platform Co-Ops for Global Challenges.

“But as many of you know, power corrupts. And the same geek masculinities that were once rejuvenating have spiraled out of control. Today, we’re watching as diversity becomes a wedge issue that can be used to radicalize disaffected young men in tech. The gendered nature of tech is getting ugly.”

Data & Society Researcher Alexandra Mateescu maps out the inequalities and power dynamics within the gig economy.

“As on-demand companies like Handy and online marketplaces like Care.com enter the space of domestic work, a range of questions emerge: what are the risks and challenges of signing up for platform-based work as an immigrant? As a non-native English speaker? How are experiences of work different for individuals with strong professional identities as caregivers or housekeepers, versus more casual workers who may also be finding other kinds of work via Postmates or Uber?”

D&S researcher Claire Fontaine’s debut on Points, “Doing Screen Time,” resourcefully unwinds the apparent contradiction between anxiety around screen time at home and support for screen time at school: “Each produces and enables the other.” Looking into this dynamic is an occasion for asking, collectively, how we want to live.

D&S founder danah boyd begins to sketch out how hacking culture evolved from playful efforts to game the media ecosystem to more complex and politicized projects of social engineering, propaganda, and activism in “Hacking the attention economy”.

In “Why America is Self-Segregating,” D&S founder danah boyd looks back at the unraveling of two historical institutions through which social, racial, and class-based diversification of social networks was achieved — the US military and higher education — and asks how trends towards content personalization on social media continue to fragment Americans along ideological lines.

In “Are There Limits to Online Free Speech?” D&S fellow Alice Marwick argues against simplistic binaries pitting free speech against censorship, looking at how the tech industry’s historic commitment to freedom of speech falls short in the face of organized harassment.

In “Did Media Literacy Backfire?” D&S founder danah boyd argues that the thorny problems of fake news and the spread of conspiracy theories have, in part, origins in efforts to educate people against misinformation. At the heart of the problem are deeper cultural divides that we must learn how to confront.

In “What’s Propaganda Got To Do With It?” Caroline Jack notes a resurgence in the popularity of “propaganda” as a judgment-laden label for a vast array of media ranging from fringe conspiracy theories to establishment news institutions. What work is this concept doing in efforts to conceptually navigate the contemporary media environment?

27% of all American internet users self-censor what they post online out of fear of online harassment. Young people are especially prone to self-censorship. This is deeply disturbing. We often worry about free speech online, but we don’t consider the social factors that prompt people to censor themselves or the ways in which this impacts some groups more than others.

The more that is known about the workers and the work of the on-demand economy, the stronger the call for platform builders to make systems for sustainable work: systems that acknowledge the lived conditions and external factors that affect workers.

What we need to move forward is a triangulation of methods — using a combination of qualitative and quantitative analysis and supporting a publishing process that elevates transparency and open critique.

Contingent work has always been prevalent in communities where workers have been historically excluded from secure jobs, from union membership, and even from wider public forms of social welfare through systemic forms of discrimination. For these workers, there was no “golden era” of plentiful stable work and a strong social safety net. Despite these long-standing trends, emerging forms of on-demand labor, and the data-driven technologies that workers interact with, can deepen the vulnerabilities of certain populations of workers.

D&S founder danah boyd questions the usefulness of election polls and calls for a moratorium on reporting them.

It’s now time for the media to put a moratorium on reporting on election polls and fancy visualizations of statistical data. And for data scientists and pollsters to stop feeding the media hype cycle with statistics that they know have flaws or will be misinterpreted as fact.

Algorithms not only model, they also create. For researchers grappling with the ethics of data analytics, these feedback loops are the most challenging to our familiar tools for science and technology ethics.

Complex models with high stakes require rigorous periodic taste tests. Unfortunately most organizations using big data analytics have no mechanism for feedback because the models are used in secrecy.

Producing predictions, like making sausage, is currently an obscure practice. If botulism spreads, someone should be able to identify the supply chain that produced it. Since math is the factory that produces the sausage that is data science, some form of reasoning should be leveraged to communicate the logic behind predictions.

O’Neil’s analysis doesn’t just apply to mathematical models; it applies to societal models. Most of the WMDs that Cathy O’Neil describes are inextricably linked to unjust social structures.

We all, data scientists included, need to act with some humility and reflect on the nature of our social ills. As O’Neil writes, “Sometimes the job of a data scientist is to know when you don’t know enough” (216). Those familiar with Greek moral philosophy know that this type of Socratic wisdom can be very fruitful.

It’s not just the dark side of Big Data she shows us, but shady business practices and unjust social regimes. We will never disarm the WMDs without addressing the social injustice they mask and perpetuate. O’Neil deserves credit for shining a bright light on this fact.

There are a few minor mischaracterizations and omissions in this chapter of Weapons of Math Destruction that I would have liked O’Neil to address. CompStat is not, as she suggests, a program like PredPol’s. This is a common misconception; CompStat is a set of organizational and management practices, some of which use data and software. In the section on stop-and-frisk, the book implies that a frisk always accompanies a stop, which is not the case; in New York, only about 60% of stops included a frisk. Moreover, the notion of “probable cause” is conflated with “reasonable suspicion,” which are two distinct legal standards. In the section on recidivism, O’Neil asks of prisoners,

“is it possible that their time in prison has an effect on their behavior once they step out? […] prison systems, which are awash in data, do not carry out this highly important research.”

Although prison systems may not conduct this research, there have been numerous academic studies that generally indicate a criminogenic effect of harsh incarceration conditions. Still, “Civilian Casualties” is a thought-provoking exploration of modern policing, courts, and incarceration. By highlighting the scale and opacity of WMDs in this context, as well as their vast potential for harm, O’Neil has written a valuable primer for anyone interested in understanding and fixing our broken criminal justice system.

These checklist items for socio-technical design are all important for policy as well. Yet the book makes it clear that not all “sins” can be reduced to checklist form. The book also explicates other issues that cannot easily be foreseen and are almost impossible for implementers to see in advance, even if well-intentioned. One example from the book is college rankings, where the attempt to be data-driven slowly created an ecology where universities and colleges paid more attention to the specific criteria used in the algorithm. In other situations, systems will be profit-generating in themselves, and therefore implemented, but suboptimal or societally harmful — this is especially true, as the book nicely points out, for systems that operate over time, as happened with mortgage pools. Efficiency may not be the only societal goal — there is also fairness, accountability, and justice. One of the strengths of the book is to point this out and make it quite clear.

One of the most striking findings of my research so far is that there is often a major gap between what the top administrations of criminal courts say about risk scores and the ways in which judges, prosecutors, and court officers actually use them. When asked about risk scores, higher-ups often praise them unequivocally. For them, algorithmic techniques bear the promise of more objective sentencing decisions. They count on the instruments to help them empty their jails, reduce racial discrimination, and reduce expenses. They can’t get enough of them: most courts now rely on as many as four, five, or six different risk-assessment tools.

Yet it is unclear whether these risk scores always have the meaningful effect on criminal proceedings that their designers intended. During my observations, I realized that risk scores were often ignored. The scores were printed out and added to the heavy paper files about defendants, but prosecutors, attorneys, and judges never discussed them. The scores were not part of the plea bargaining and negotiation process. In fact, most judges and prosecutors told me that they did not trust the risk scores at all. Why should they follow the recommendations of a model built by a for-profit company that they knew nothing about, using data they didn’t control? They didn’t see the point. For better or worse, they trusted their own expertise and experience instead.

Reports like “The Perpetual Line-Up” force a fundamental question: What do we want technologies like facial recognition to do? Do we want them to automate narrowly “unbiased” facets of the criminal justice system? Or do we want to end the criminal justice system’s historical role as an engine of social injustice? We can’t have both.

D&S fellow Zara Rahman introduces her upcoming research at Data & Society, where she will examine the work of translators in technology projects.

Whatever it’s called, it’s also under-appreciated. In our tech-focused world, we often hold those with so-called “hard” programming skills up on a pedestal, and we relegate those with “soft” communication skills to being invisible caretakers. It’s not an accident that this binary correlates strongly with traditionally male-dominated roles of programming and largely female-dominated roles of community management or emotional labour. It’s worth noting too, that one is paid much more than the other.

D&S researcher Mary Madden reviews and adds her perspective to the play Privacy.

As someone who has experienced the glaze of overwhelm in the eyes of my family and friends when I try to explain why privacy still matters, I have a profound respect for anyone who can articulate that message clearly.

One thing that writers of survey research questions and writers of dramatic scripts have in common is the challenge of accurately and clearly describing people’s engagement with technology.

The FBI recently announced its plan to request that their massive biometrics database, called the Next Generation Identification (NGI) system, be exempted from basic requirements under the Privacy Act. These exemptions would prevent individuals from finding out if they are included within the database, whether their profile is being shared with other government entities, and whether their profile is accurate or contains false information. Forty-four organizations, including Data & Society, sent a letter to the Department of Justice asking for a 30-day extension to review the proposal.

Code is key to civic life, but we need to start looking under the hood and thinking about the externalities of our coding practices, especially as we’re building code as fast as possible with few checks and balances.

Points: “Be Careful What You Code For” is danah boyd’s talk from Personal Democracy Forum 2016 (June 9, 2016); her remarks have been modified for Points. danah exhorts us to mind the externalities of code and proposes audits as a way to reckon with the effects of code in high stakes areas like policing. Video is available here.

What the journalists from SourceFed may have stumbled upon was not an instance in which search results were intentionally being manipulated in favor of a candidate, but how algorithms can reflect complex jurisdictional issues and international policies that can, in turn, govern content.

Points/spheres: D&S researcher Robyn Caplan asks: What drives Google’s policy of “not show[ing] a predicted query that is offensive or disparaging when displayed in conjunction with a person’s name?”

In this Points piece “Real Life Harms of Student Data,” D&S researcher Mikaela Pitcan argues that assessing real harms connected with student data forces us to acknowledge the mundane, human causes. And she asks: “What do we do now?”

“Overall, cases where student data has led to harms aren’t about data per se, but about the way that people interact with the data…

…accidental data leaks, data being hacked, data being lost, school officials using off-campus information for discipline, oversight in planning for data handling when companies are sold, and faulty data and data systems resulting in negative outcomes. Let’s break it down.”

This issue goes far beyond the Trending box in the corner of your Facebook profile, and this latest wave of concerns is only the tip of the iceberg around how powerful actors can affect or shape political discourse. What is of concern right now is not that human beings are playing a role in shaping the news — they always have — it is the veneer of objectivity provided by Facebook’s interface, the claims of neutrality enabled by the integration of algorithmic processes, and the assumption that what is prioritized reflects only the interests and actions of the users (the “public sphere”) and not those of Facebook, advertisers, or other powerful entities.

Could policies exist to guide what algorithms find relevant? Algorithms depend on data to make recommendations. Could certain cultural metadata be embedded to ensure Canadian content has other pathways of discovery beyond a platform’s usual recommendations? Or could the Canadian industry agree to relevance standards that would help them qualify influence, just as Upworthy and NPR have developed new ways to identify stories that matter to their readers?

While we debate Canadian content algorithms, we also have to take seriously the much broader accountability issues raised by algorithms. It’s not just a matter of whether algorithms should promote Canadian content, but how we can understand this black box of new media power in the first place.

Points/public spheres: In “No More Magic Algorithms,” Fenwick McKelvey unpacks the impetus for and challenges of the Discovery Summit, co-hosted by the Canadian Radio-television and Telecommunications Commission and the National Film Board of Canada: What could an algorithmic cultural policy look like? Fenwick’s piece was also prompted by the Who Controls the Public Sphere in the Era of Algorithms? workshop, which Data & Society hosted in February 2016 as part of our developing Algorithms and Publics project.

Over the last two years, Data & Society has been convening a Council on Big Data, Ethics, and Society where we’ve had intense discussions about how to situate ethics in the practice of data science. We talked about the importance of education and the need for ethical thinking as a cornerstone of computational thinking. We talked about the practices of ethical oversight in research, deeply examining the role of IRBs and the different oversight mechanisms that can and do operate in industrial research. Our mandate was to think about research, but, as I listened to our debates and discussions, I couldn’t help but think about the messiness of ethical thinking in complex organizations and technical systems more generally.

Points: In this Points original, “Where Do We Find Ethics?” danah boyd takes us back to 1986 in order to pose the question of where we locate ethics in complex systems with distributed decision-making. Stay tuned for the forthcoming whitepaper from the Council on Big Data, Ethics, and Society.

Student data can and has served as an equalizer, but it also has the potential to perpetuate discriminatory practices. In order to leverage student data to move toward equity in education, researchers, parents, and educators must be aware of the ways in which data serves to equalize as well as disenfranchise. Common discourse surrounding data as an equalizer can fall along a spectrum of “yes, it’s the fix” or “this will never work.” Reality is more complicated than that.

Points: “Does data-driven learning improve equity?” That depends, says Mikaela Pitcan in this Points original. Starting assumptions, actual data use practices, interpretation, context context context — all complicate the story around education data and must be kept in mind if equity is our objective.

The processes editors have used to filter information were never transparent, hence the enthusiasm of the early 2000s for unfiltered media. What may be new is the pervasiveness of the gatekeeping that algorithms make possible, the invisibility of that filtering and the difficulty of choosing which filters you want shaping your conversation.

D&S fellow Mimi Onuoha thinks through the implications of the moment of data collection and offers a compact set of reminders for those who work with and think about data.

The conceptual, practical, and ethical issues surrounding “big data” and data in general begin at the very moment of data collection. Particularly when the data concern people, not enough attention is paid to the realities entangled within that significant moment and spreading out from it.

The point of data collection is a unique site for unpacking change, abuse, unfairness, bias, and potential. We can’t talk about responsible data without talking about the moment when data becomes data.

D&S founder danah boyd declares: “It’s not Cyberspace anymore.” While at Davos, danah reflects on the 20 years since John Perry Barlow wrote “A Declaration of the Independence of Cyberspace”.

There is a power shift underway and much of the tech sector is ill-equipped to understand its own actions and practices as part of the elite, the powerful. Worse, a collection of unicorns who see themselves as underdogs in a world where instability and inequality are rampant fail to realize that they have a moral responsibility.