Bucknell University Lewisburg, PA, US

Florence Chee,

Loyola University Chicago, US

Bettina Berendt,

KU Leuven, BE

Geoffrey Rockwell

University of Alberta Edmonton, AB, CA

About Geoffrey

Dr. Geoffrey Martin Rockwell is a Professor of Philosophy and Humanities Computing at the University of Alberta, Canada. He has published and presented papers in the areas of philosophical dialogue, textual visualization and analysis, humanities computing, instructional technology, computer games, and multimedia, including a book with S. Sinclair, _Hermeneutica: Computer Assisted Interpretation in the Humanities_ (MIT Press: 2016). He currently teaches in the Humanities Computing MA programme at the University of Alberta and is the Director of the Kule Institute for Advanced Study.

Abstract

This article examines key ethical issues that are continuing to emerge from the task of archiving data scraped from online sources such as social media sites, blogs, and forums, particularly pertaining to online harassment and hostile groups. Given the proliferation of digital social data, an understanding of ethics and data stewardship that evolves alongside the shifting landscape of digital societies is indeed essential.

Our study involves a primary research archive composed of data scraped for our project on the case study of Gamergate, which involved numerous instances of hate speech in various online communities. Doing this type of qualitative research presents advantages for humanities and social science research because it is possible to generate large and rich corpora about subjects of human interest. However, such data scraping has also raised ethical issues around treating social media authors as research subjects and, moreover, as subjects who have provided informed consent. Once researchers consider content creators on these sites as human research subjects, what would best efforts adhering to the directive to “do no harm” look like?

While we recognize that definite rules cannot exist, we do consider how one can best care for the stakeholders given the challenges of their particular contexts. In this case, the stakeholders included Twitter authors, targets of online harassment, researchers, students, archivists, and the larger academic community. Also under consideration is how the Ethics of Care may be extended to the research community, and especially student researchers in their exposure to toxic material.

Introduction

How should humanities researchers approach the problem of research ethics when they are engaged in analyzing contemporary cultural artifacts developed on the internet? As researchers of web-based cultural artifacts, we must analyze our ethical relationships in concert and in dialogue with our subjects much more than we may have done in the past. In this paper, we use the collection of data objects such as tweets from Twitter and archived materials from websites to show how the examination, collection, and curation of objects pertaining to a controversial cultural phenomenon like Gamergate requires a deeper examination of the ethical guidelines that shape and drive our research.

One of the purposes of this paper is to describe a research project as it engages in ethical reflection throughout the project lifecycle. Our initial research project, Gamergate Reactions (Rockwell and Suomela 2015), examined the Gamergate online conflict from its emergence in 2014. The Gamergate controversy erupted during the summer of 2014 and quickly engulfed the online gaming community in an intense debate about what and who belonged in gaming culture. The conflict soon escalated into another battlefront in the culture wars involving gender, identity, and political beliefs (Hathaway 2014; Wagner 2014). Overt harassment of women and others who were critical of aspects of gaming culture became a dominant feature of online forums, Twitter, and other media in which discussion about gaming culture occurred. Rape and death threats were sent to prominent game developers and journalists, some of whom were also doxxed (had personally identifiable information such as home addresses disseminated online). This harassment and other extreme expressions of hatred and intolerance quickly eclipsed the alleged ethical issues in gaming culture that supposedly were the impetus for Gamergate.

The focus of this paper is the ethical challenges encountered while working on the difficult topic of Gamergate. We do not present an analysis of the Gamergate phenomenon in this paper, but we do believe that our project’s focus on Gamergate-related web content provides a challenging example of dealing with ethics in the (digital) humanities. The artifacts of the digital age, such as web archives and Twitter streams, are closer to the data sets dealt with by social scientists than to the painstakingly preserved codices of libraries, traditional archives, or the bound volumes of a magazine. As such, digital humanists need to engage with the discourse on ethics arising from other disciplines. The humanities have not developed ethical guidelines around contemporary content that is publicly accessible on social media and online fora as they have with archives in libraries and museums. The lessons learned from our experience can, and should, be applied to other digital humanities projects which engage with the digital materials of contemporary culture.

There are multiple reasons why such an examination of the ethical practices in the digital humanities is particularly important for research focused on contemporary subjects. One reason is institutional. Humanists have traditionally been free from many of the bureaucratic burdens of complying with Institutional Review Board/Research Ethics Board (IRB/REB) regulations, but there is no guarantee that this condition will last. As the digital humanities become increasingly interdisciplinary, there are and will be more interactions with social scientists who are already well-versed in the institutional protocols related to human subject research. Research ethicists and philosophers have developed sophisticated arguments and guidelines for dealing with research that involves human subjects. Digital humanists can become better collaborators by becoming familiar with the ethical discourses of the fields that commonly undertake human subject research. A familiarity with the key concepts of research ethics is valuable for digital humanists who may be asked to submit an IRB proposal before beginning to work on an internet-based research project.

Another reason is the larger consequences of living in a digital age. The conversation about ethics in digital research relates to larger issues concerning ethics and online activities which are currently being debated in the public sphere (Rockwell and Berendt 2017). The discussion within digital humanities (Klein 2015; Rehbein 2015; Presner 2015) already intersects with many emergent issues in the digital era, such as surveillance, big data, fake news, and algorithmic bias. We believe that further reflection on these questions may aid humanists in setting an example of thoughtful engagement with these converging topics.

As the evidence of increasing corporate and government surveillance grows, the ethics of work in the digital humanities is not exempt from enhanced scrutiny and questions of power relations. These questions include: Under what circumstances is it ethical to gather and archive any and all data about people, living or dead? From an academic standpoint, how is our scholarly work not a form of surveillance? In this paper, we discuss the ethics of datafication, by which we mean the whole process of gathering, enriching, analyzing and then archiving data (Rockwell and Berendt 2017), in the context of a particular project, Gamergate Reactions (Rockwell and Suomela 2015).

The first half of the current paper describes some of the common approaches to online research ethics and the philosophy behind those approaches. We introduce the concept of an Ethics of Care (Held 2006) as an alternative way of framing discussions of research ethics within digital humanities. The second half of the paper presents a case study of how the ideas of an Ethics of Care were applied in the Gamergate Reactions project (Rockwell and Suomela 2015). The goal is to answer two research questions: What is an appropriate ethical framework for online research conducted within the digital humanities and how can that framework be put into practice?

Ethics of Care and Research Ethics

In this section, we will situate Ethics of Care in the wider context of Research Ethics. In the context and given the space limitations of the current paper, it is of course impossible to give an exhaustive introduction to these wide fields. However, we consider it necessary to give an overview, for three reasons. First, we want to engage with research ethics debates from other fields; second, digital humanists may not be familiar with the terms of the debate; and third, learning about these debates may make humanities perspectives more relevant to datafication debates.

One approach to ethical judgment is deontological, or the study of ethical duty. An example of such a duty would be Kant’s Categorical Imperative that we should act only on an ethical principle that we would accept becoming universal (Alexander and Moore 2016); in other words, we should do only what we would have everyone do. Are there certain duties which we should follow when dealing with data collected from the Internet? One potential way of answering this question is to look at research ethics in general. Many of the guidelines for research ethics proposed in various fields use a form of deontological reasoning. One of the landmark documents in the modern development of research ethics is the Belmont Report (National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research 1979). The core principles of the report described three fundamental ethical concerns for human subject research: respect for persons, beneficence, and justice. In a medical context, respect for persons has been interpreted as requiring informed consent, ensuring that patients are not coerced, and that they understand the extent of the study in which they are participating. Beneficence requires the maximizing of benefits and the minimizing of risk for the research subject, as well as guaranteeing that the experimenter will ‘do no harm’. Justice ensures that research subjects will not be exploited, and that the study is administered fairly and equally. The institutional support for these principles at most research institutions in the United States is provided by the Institutional Review Boards, which follow the ‘Common Rule’ established by the 1991 Federal Policy for the Protection of Human Subjects (Department of Health and Human Services 2009). Canada and many other countries have established similar institutional protections for research subjects through Research Ethics Boards and governmental guidelines (TC3+ 2014).

A second approach to ethical judgment is utilitarian (Sinnott-Armstrong 2015). The key question for a utilitarian analysis is: who benefits and how much? A utilitarian approach to the ethical questions of internet research might focus on the consequences of such research and attempt to balance the knowledge that would be gained against its potential for harm. Medical and other human subject research boards often ask researchers to assess the potential for harm under the standard of beneficence mentioned above. The difficulty with this approach to humanities or cultural research on the internet is that there is no easy way to assess potential harm. The biomedical approach to research ethics, which generally assumes data from clinical trials and laboratories, does not adequately fit when addressing the possible nuances of collecting data from social media such as Twitter. The charge to ‘do no harm’ takes numerous and varied forms when differentiating between drugs and status updates, for example.

A third approach to ethical judgment is virtue ethics. Virtue ethics is based on the assessment of how human actions foster the virtues of human character (Hursthouse and Pettigrove 2016). It is harder to draw a direct link between virtue ethics and research ethics because most of the discussion within research ethics focuses heavily on the principles of respect for persons, beneficence, and justice, and less on the virtues of researchers or subjects. But we can detect an implicit argument for a type of virtuous behavior on the part of researchers. The injunction of the Hippocratic oath describes a virtuous ideal for behaviour in the medical profession. Harder to define, but just as important, is the general idea that a good researcher should not commit fraud of any kind.

Research ethics as a broad discourse can be seen to draw on many different forms of ethical reasoning, including portions of the deontological, utilitarian, and virtue traditions of ethical argument. For example, in order to assist researchers in making ethical decisions about internet research, the Association of Internet Researchers has published two sets of recommendations that were written in consultation with the collective knowledge and experience of its international membership: one in 2002 (Ess and Association of Internet Researchers 2002) and another in 2012 (Markham and Buchanan 2012). The guidelines were informed by a broad range of challenges researchers were facing through their research projects. This included how context is defined and conceptualized, ethical expectations for researchers and subjects, along with privacy, and care beyond the period of a project’s lifespan. Research on informed consent issues by Chee, Taylor, and de Castell (2012) helped to structure the guidelines as a precursor to our discussion on the Ethics of Care.

Moving toward the Ethics of Care

The Ethics of Care[1] is a theory that developed from feminist thought about the moral importance of the experiences of caring for children, the elderly and vulnerable populations. Others have used it in the context of design and in the digital humanities to draw attention to carework (Klein 2015; Jackson 2014). Unlike previous ethical theories that start from the position of an independent rational subject thinking about how to treat other equally independent rational subjects, the Ethics of Care starts with the real experience of being embedded in relationships with uneven power relations.

The Ethics of Care framework prioritizes research relationships and focuses attention particularly on the features of ethical decision making which are relevant to project design. What are the relationships between the people involved in a project? Who possesses the power or authority in a given situation? The Ethics of Care recognizes the differences in vulnerability and need among stakeholders between those studying and those being studied. Using our case study, we will outline some of the complicated relationships between researchers, subjects, communities, and discourses, which need to be continually revisited.

The crux of the difference between Ethics of Care and Justice-oriented approaches to ethics is the issue of rights. Held (2006) argues that care is a precondition for rights – the “social cohesion that makes it possible for political institutions to exist” such that there could be protection of rights. Civil society is built on civic virtues that guide how we interact outside of formal legal ways. Care would be one of the most important civic virtues.

If one agrees with Held, then the Ethics of Care emphasis on relationships is not in opposition to rights but is a matter of attention or emphasis. An Ethics of Care guides us in all the everyday situations where rights don’t apply or aren’t clear. Alternatively, one could argue that respect for rights is a type of relationship. We are in relationships with our research subjects. Considering their rights is part of thinking about the relationship.

Closer to what we are doing, Lauren Klein in “The Carework and Codework of the Digital Humanities” (2015) asks how codework (or work using digital tools and code) can help recover the carework that often gets hidden by the grand gestures of scholarship. She references Blackwood (2014) who reminds us how gendered certain research roles are, like editing and archiving. For that matter, we would argue that codework is also carework that is gendered, though differently. And ethics also has a history of being gendered. Other researchers in digital humanities have engaged in similar efforts to further the discussion of research ethics. Malte Rehbein (2015) presented a paper at the Digital Humanities Summer Institute about the dual use problem, where resources we develop can be used unethically. Todd Presner (2015) has also written about “The Ethics of the Algorithm” and databases.

The research community often treats the work of gathering, editing and maintaining data as of secondary importance compared to grant getting, grand theorization or original contributions. The creation, maintenance, and enrichment of archives have traditionally been the work of assistants, librarians, and archivists. It is only recently that attention has been turned toward how the stewardship of archives, whether digital or not, is valuable. Steven Jackson (2014), for example, argues that repairing infrastructure is more important than building new infrastructure. The three research councils of Canada and the research infrastructure foundation (TC3+) issued a ‘consultation document’ in 2013 calling for “establishing a culture of stewardship” (TC3+ 2013). The Stevens Institute of Technology ran a conference on ‘The Maintainers’ in 2016 and 2017 (“The Maintainers” n.d.), a response to Walter Isaacson’s book, The Innovators. These are just examples of a clear turn to looking at and valuing the care and repair of our cultural and research infrastructure.

Putting the Ethics of Care into Practice: A Case Study

Our choice to use an Ethics of Care to guide our research process arose from several considerations. Our study of Gamergate and our relationships with other researchers pointed to a need to engage with feminist discourses on both the topical and methodological levels. We felt a distinct need to focus on the relationships involved internally and externally to the Gamergate Reactions (Rockwell and Suomela 2015) project itself. Many of the mainstream approaches to research ethics prioritize the discourse of rights and justice, a deontological approach that is problematized by an Ethics of Care. We also wanted to expand the ethical discussion to include people, such as librarians and archivists, engaged in the carework necessary for development of digital research materials. Finally, an Ethics of Care foregrounded the iterative and dialogic approaches we took to develop an ethically engaged research project (Klein 2015; Jackson 2014).

The research team that is the focus of this case study was composed of a diverse group of students, postdoctoral scholars, and faculty. Relationships between people involved in the research process needed to be consciously and consistently managed throughout the project. Membership changed over time as different people moved in and out of the group because of graduation, promotion, changes in enthusiasm, or acceptance of other jobs. The main research group met regularly to discuss the progress of the project and to manage tasks. During the academic term these meetings occurred weekly; at other times of the year they were less frequent. The typical meeting discussed the immediate tasks that needed to be completed to gather and analyze data, and to write about the project for publication or conference presentations.

We decided that we should be guided by the Ethics of Care after much discussion with members of the research team and our colleagues. The idea for using an Ethics of Care to describe and frame the ethical challenges for the project was suggested by a member of ReFig[2], another research project with which we are affiliated (some of the authors of the present paper have been involved with ReFig for several years). The Ethics of Care struck us as an appropriate ethical framework for the context, especially given the sustained harassment of researchers studying Gamergate (Chess and Shaw 2015). Indeed, the more we learn about the Ethics of Care (Held 2006; Wittkower 2016) and its relevant applications, the more it seems suited to such complex situations where people have been harassed and ethics itself is at stake. In this case study we discuss three features of the Ethics of Care and how we applied them:

Ethics is more about relationships than about rights.

A fundamental facilitator of relationships for humanities research is dialogue.

Caring is a practice, not a heroic gesture, nor a set of rules for behavior; it is the ongoing activity of being sensitive to others (and oneself).

The next three sections describe the internal and external dialogues we engaged in during our project. We wrestled with many different issues throughout, including how to deal with identification of research team members, toxic data, the potential impacts of using social media content on those who generate it, and archiving research data. Our approach to these issues was informed by our conscious and iterative application of the Ethics of Care.

Identifying Team Members

The focus of most research ethics discussions is usually on the relationship between the researcher and subject(s). The internal dynamics of research groups is an issue that is less often discussed. For our project the internal ethics of the research group were a major concern from the beginning.

From the start of the project, we were aware that other researchers were being threatened and targeted for harassment, particularly if they were women and expressing views about gender and games in the context of Gamergate. We wanted to be proactive about mitigating the potential controversy and harassment that might result for our own research team from our research. We took the following steps to address this risk. First, meetings were arranged with campus staff to discuss the safety and technical security precautions that team members should take to protect themselves online or to respond to harassment. Second, a background literature review was conducted to discover the guidelines used by other researchers (Markham and Buchanan 2012; Ess and Association of Internet Researchers 2002; American Anthropological Association 2012) when dealing with potential harassment. Third, we consulted with key people who had already experienced harassment through these networks for their counsel on best practices for how to proceed with optimal attention to care.

One of the key insights from the Ethics of Care which we applied to our own internal discussions about the project was the recognition that members of the research team were vulnerable for different reasons. Senior researchers and tenured faculty members were typically already protected by their status within the university and the wider research community. But other researchers, such as students, postdocs, and untenured faculty in their various capacities and identities are often in more vulnerable positions. The harassment of student researchers could do significant harm to them. For example, the data sets published through the online repository at the University of Alberta listed a subset of people involved in the project and provided a group email account for public comment instead of individual email account names that could be singled out. Given the potential negative publicity and/or doxxing risks of which those involved with the project became increasingly aware, we consulted with the students and gave them the option of limiting the number of times their names were published with the artifacts of the research, such as on papers or data sets. This action, taken by the principal investigators to consider the standpoint of student workers, was guided by the Ethics of Care in how power over the distribution of one’s name rested ultimately with the owner.

It is not lost on our research team that there are absolutely problematics and power differentials associated with the choice to redact one’s own name and involvement. Given the disproportionate potential for women to sustain harassment over men, our isolated action could not ameliorate the social tendency for women to pay the gender tax. Rather, we need to state here that this whole generation of scholarship that has been coloured by the Gamergate scandal replicates the dynamics where women have worked covertly behind the scenes (sometimes through necessity), and that should be recognized by institutional bodies that consider authorship credit and promotion. While we do not have the answers when it comes to this systemic and endemic issue, we do consider this situation and discussion as one we wish to improve through increased awareness.

Toxic Data

In addition to protecting the members of the research team from potential external harms, there was a concern about the psychological impact of reviewing the digital artifacts of online harassment and hate speech. Some of the data we collected originated from websites (4chan and 8chan) that have a reputation for producing pornographic and offensive (even illegal) content on a regular basis. Both sites are transient forums in which postings are automatically deleted after a period of time. Some of the campaigns, meme development, and organizing for Gamergate were conducted on these forums. The majority of postings are anonymous and these postings often contain offensive material. We collected data from these sites for a few months, but then decided to delete that information unused because of the negative impact that analysis of this content would likely have on the researchers, especially students, in the project group. We also concluded that the potential research benefits of archiving toxic data for possible later analysis did not outweigh the detriments.

All research involves selection of topics, material, methods, and data. There is no way for any project to consider all of the possible influences or effects that a particular decision about data collection or retention may result in. Analyzing large-scale events in social media compounds the problem because the collection of data is dependent on factors beyond the control of the researcher, such as the capabilities of the API (Application Programming Interface) for gathering data. The policies of social media companies may also affect the types and completeness of the data collected. The boundary between data collection and analysis is also complicated. An initial analysis may find that more data is needed or that a conclusion is justified based on the data already collected. The decision whether to include data in an analysis is fraught with potential errors and caveats. Works of interpretation, such as in the digital humanities, are even harder to assess. How much evidence do we need to reach the conclusion that many of the participants in the Gamergate controversy were motivated by anti-feminism or misogyny, or that some of them actively engaged in harassment of prominent women gamers? (Chituc 2015).

We initially collected data from 4chan because we knew that 4chan was a major communication channel for Gamergate. But we also knew, based on our own experiences with the site and media reports (Hess 2017; Lopez 2018; Nagle 2017), that 4chan was home to pornography, harassment, doxxing, and other noxious behavior. When it came time to decide whether to analyze the data we had collected from 4chan, the research team faced a complicated decision with no obvious answer and few, if any, clear rules for appropriate behaviour. Some researchers may assume that any data collected must be analyzed, but that runs opposite to the Ethics of Care approach we were trying to honour throughout the research process. The Ethics of Care approach underlines the contextual nature of ethical decision making and the relationships in which we are all embedded (Held 2006). Our research group was composed of graduate students, post-docs, and professors. The bulk of the analysis of data was performed by the graduate students and post-docs involved in the project. Was it fair to ask students to review potentially offensive material in order to make a marginally more forceful case that Gamergate contained misogynists or rape threats? We already had evidence for those behaviours in the social media data and the web pages that had been collected from the rest of the web. The small size of the 4chan data we had collected in comparison to the large number of tweets in our data led us to the conclusion that additional analysis of the 4chan material would probably not change or challenge our interpretations significantly. Given the unlikelihood of discovering new information and the likelihood of encountering offensive material, the research group decided that analysis of 4chan data was not worth the effort.

We also decided to delete the 4chan data because of another concern that was raised by our colleagues in ReFig and at conferences: what was the future impact of preserving the evidence of online harassment? Again, we felt there was no clear standard for making this judgment based on our knowledge of research ethics. The presence of potential or future harms is hard to negotiate in any ethical decision-making process. One might say that trying to provide guidance based on potential outcomes is the core of ethical argument. If something has already happened, then we only need to deal with the consequences. Many discussions of research ethics try to minimize potential harm through such actions as informed consent and transparency. Although there may be no clear standard to which we can appeal to justify our decision to delete data, we believe that transparency can provide an explanation for why we made our decision. There may always be some who find our decision unreasonable or invalid. To them we can only say that this article attempts to describe our decision, the reasons why we reached that decision, and how those decisions might help others make better choices in the future.

Finding a balance between the potential for increasing our understanding of online behaviour and the risk of doing psychological damage to ourselves is an ongoing dilemma. But we could not know the risk until the data was actually reviewed. In the case of some of the data we collected, we decided that the research benefits did not outweigh the potential damage. Other researchers may have acted differently but we can only report the results of our own analysis and reflection upon the methods which produced that analysis.

Using Social Media Content: Impacts on Content Creators

The relation between researchers and research subjects can be especially challenging for humanists and social scientists to navigate. Humanists, in particular, are not used to thinking ethically about research subjects because we mostly deal with either subjects who are dead or subjects who are public figures like authors, politicians or other humanists, whose roles and activities are open to scrutiny and debate. The harms, real or potential, inflicted by research methods on these subjects are often intangible and hard to measure. Social media presents an ethical challenge in research because of the complicated division between public and private activity (Zimmer 2010): it is not obvious, for example, that users of Twitter understand the potential for their tweets to be collected as part of a research project (Fiesler and Proferes 2018). The potential harms from the misuse of social media content might include threats to marginalized groups, fraudulent manipulation of the data, and reputational harms (Jules, Summers, and Mitchell 2018).

Given the complicated nature of social media research, we felt it was prudent for us to take future potential harms from our data seriously. Our first duty as researchers should be to avoid harm. This is one of the main conclusions of the Belmont Report and of many other writers on research ethics (Buchanan and Zimmer 2016; National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research 1979; Department of Health and Human Services 2009). At first glance this appeal to duty may appear to conflict with the Ethics of Care approach we have been describing in this paper. The mention of duty immediately brings to mind the discussion of deontological ethics mentioned earlier in this paper. However, we do not think that our invocation of duty contradicts our promotion of an Ethics of Care. The Ethics of Care is based on relationships (Held 2006), with our fellow research members, the research community, and our subjects. One of our goals is to maintain those relationships over time, and avoiding harm is one way of accomplishing this goal. It is possible to justify avoiding harm to research subjects from multiple philosophical perspectives.

Harms to privacy and reputation were the two biggest risks to research subjects that we discussed and attempted to mitigate in the Gamergate project. The privacy of our subjects was protected in two overlapping ways. First, the results of the research were reported only in aggregate form, and second, the sources for direct quotes were not identified (A. Budac 2016; R. Budac 2016; Gouglas et al. 2016; Kuznetsova and Suomela 2016; Rehbein 2015; Wilson et al. 2016). Neither method of privacy protection can completely guarantee that people will not be identified because the activity we collected and analyzed occurred in online forums like Twitter that anyone can search. A determined person could still recover the original source of a quote by searching Twitter or the internet, so the results could not be completely anonymized. The question of privacy is highly fraught when it comes to research about topics such as Gamergate, which depend on the internet as the primary medium for communication. Any quote from a publicly accessible web site could potentially be re-identified after a research study has been completed. Furthermore, the ethical boundaries for understanding research on ‘public’ web material are still openly being discussed and have not reached any consensus (Buchanan and Zimmer 2016).

Reputational harms to research subjects were the other major risk discussed during the Gamergate project and are even harder to quantify and evaluate. Throughout the project, we discussed the possibility that someone who tweeted in support of Gamergate today might, at a later point in their lives, change their mind and be embarrassed by their prior views. Future harms could extend beyond mere embarrassment to include job loss, emotional damage, or self-harm. There have been media reports about how online behaviour might be a detriment to future reputations online (Weber 2014). We believe that the potential damage from our research revealing the identity of Gamergate supporters is unlikely to be so drastic, but this is only our current perception. The Ethics of Care suggests that this position may be modified as relationships change and the research process progresses.

Publication and Preservation of Data

Direct dialogue with a community that has targeted academics for harassment in the past is very difficult (Chess and Shaw 2015). After the initial public presentation of our early findings at the Canadian Game Studies Association conference in 2015, we posted a subset of our data to the University of Alberta data repository. We did this because we believe in openness as a general principle and in attempting to develop dialogue with people who may have views that are different from our own. Publicly posting portions of our dataset is one way of showing our work and encouraging others to examine the data and make their own interpretations. One online comment on our data was made by a Gamergate supporter who argued that the tweets we posted proved the opposite of what we had observed, although the Reddit thread was eventually deleted (“[People] The Person behind the Idea for #Deatheaters on Being Asked If Her PhD Is in GamerGate ‘Literally Yes. That Is What My PhD Is in.’ • r/KotakuInAction.” 2018.). The Ethics of Care does not provide general guidelines for engaging with research subjects, but it does imply that dialogue is an ongoing process that is developed over the course of a research project.

Our relationship with people who have been harassed by Gamergate participants is another example of an external relationship which has shaped our understanding of the project. Regarding the ethical dilemma of preserving the evidence of harassment in various forms, we incorporated the input of those involved with the research project, members of the ReFiG project, and attendees at various conferences where we discussed our findings. The input from these stakeholders led to our decision to delete some of the data from our corpus. This contradicts the first impulse of many researchers, which is often to keep as much as possible of the research data or material collected during a project because of the immense effort that goes into developing such collections, including: deploying and customizing software in order to continuously scrape data; curating websites or documents based on the subject matter; analyzing the artifacts in order to deepen our understanding; and presenting those results to the research community or the larger public. Given this effort, many researchers would be reluctant to discard such a signifier of time, energy, and emotional labour.

Another cross-cutting imperative regarding the preservation of research data is to the research community as a whole. Do we owe other researchers access to our data so that they may study it themselves? There are considerable efforts being made in the social sciences to preserve research data so that other researchers may attempt to replicate the results of a study. For example, social psychologists have been arguing about a ‘crisis of replication’ in their field for the past decade. Recent examples of researcher fraud pose a serious challenge to the legitimacy of a field (Aschwanden 2015; Bartlett 2015; Marcus and Oransky 2015). Sharing research data is one way to address this problem.

However, in approaching Gamergate from the perspective of the humanities, we may have different grounding assumptions about what types of material should be made available during, or at the conclusion of, a research project. Within the humanities, there is a tradition of curation in which particular archives or editions are thoughtfully created as part of the research process. For digital artifacts, such as the websites and tweets which formed the core materials for our project, the value of creating an archive for future researchers may simply be to preserve the historical record, but we must also be conscious that the archive is not innocent. Decisions are continuously being made about the selection of materials for inclusion. Moreover, digital material, especially on the web, may be altered for many different reasons, such as a publisher going out of business, an editor revising a previous story, or a hacker maliciously removing information. We cannot be sure of the motivations for these changes, but collecting our own record of a website at a given time is one way of showing how people communicated at a point in time.

So, making a decision about whether to keep all of the material we collected during the project required a series of cross-cutting discussions that involved the different disciplinary traditions of the people in the project, outside groups with which we collaborated, and a larger public who may be interested in the results of our study. Deciding whether to preserve artifacts such as data or documents that demonstrated harassment was one of the hardest questions we faced because of the multiple relationships we had before, and developed during, the course of the study. We had to ask ourselves challenging questions, such as whether preserving evidence of harassment was just another way of perpetuating or repeating that harassment into the future. The very real and immediately present effects of Gamergate on many people have been harmful. We ultimately decided that the harm of adding to the arbitrary preservation of the harassment we saw occurring outweighed the loss of any information that might have been gained by analyzing those particular materials. As researchers make executive choices from conception of a research question through to the final write-up of a project, our choice was guided by an ethic of care and doing ‘less’ harm along the way.

The emphasis of the Ethics of Care on relationships instead of rights informed our decisions about preservation of toxic materials. We engaged in an ongoing dialogue both within the research group and with external groups. We decided that our relationships were more important than an abstract duty to preserve everything we collected. There is a real and perhaps unresolvable tension in this decision. We are still sensitive to the different decisions which other researchers, in a similar position, might have taken.

Any move toward understanding a moral position with which we disagree requires some form of empathy. Cognitive and emotional work must be undertaken in order to help both ourselves, as researchers and inhabitants of our various communities, and others who may be outside of our communities. The goal of our research is less about persuasion than it is about reporting our interpretations of what we have studied.

Conclusion

In the introduction to this paper, we argued that digital humanities should engage in a broader discussion of ethics, especially as it is applied to the research process for digital materials. In conclusion, we will add some of the lessons learned from our case study and suggest how those lessons can be applied today. Let us begin with the lessons for digital humanities research projects.

For digital humanities in particular, the lessons to be drawn from our case study imply that the discussion of research ethics should be incorporated into the research process during the early stages of planning and data collection. Researchers should be aware of the relationships upon which their work depends. Graduate students and postdocs are in an especially delicate position during a research project because they may not feel empowered to affect the direction of a project. Senior researchers need to be aware of this relationship and work to include all parties in the execution of said project.

At the very least we must begin to think of our projects in terms of the stakeholders involved and the relationships between them.

A basic starting point is examining the project from an internal perspective by describing the relationships between the many people who are actively engaged in data collection, analysis, writing, and archiving.

The next step is to check on the relationships between the research project and the subject community that is involved. Are the research subjects being consulted or respected? Are the researchers avoiding potential harms?

A final step is to look at external relationships beyond the subjects of the study; often this means looking outward to the wider research community for suggestions and openings for dialogue.

Our discussion of research ethics and philosophy was inevitably too brief to give these subjects their due attention. However, we hope to have piqued the interest of digital humanists who may be prompted to delve deeper into these subjects. At the very least, the material presented in this paper will give those unfamiliar with the area some guideposts for further exploration.

The Ethics of Care is a fruitful starting point for discussing ethical issues related to humanistic scholarship because it is rooted in a feminist tradition that asks researchers to pay attention also to the mundane activities of editing and archiving. Relationships with the human subjects who are being studied in a research project are important, but the internal relationships among the research team deserve just as much attention. One of the important ideas touched on in this paper, which we hope to expand on in future work, is the role of infrastructure and the support personnel who maintain it. The roots of the Ethics of Care in feminist scholarship suggest some interesting directions for further work on the gendered nature of various roles, such as editing and archiving.

Regulation of the research process is another important issue that digital humanists can engage from multiple perspectives. The background information and case study presented in this paper will help humanists in this process. In a future line of research, we will investigate the commonalities and differences of this approach with current developments in European Union law and practices. European Union data protection and privacy thinking and legal codification have long been concerned with attempts to balance the rights of different stakeholders (such as freedom of contract or scientific interests on the one hand, and privacy on the other hand) through laws that arguably try to mitigate the power differentials that are unavoidable between data controllers and data subjects (Gutwirth and De Hert 2006). With the risk-based approach and data protection impact assessments built into the General Data Protection Regulation, which became directly applicable law in May 2018, data controllers and data processors are now required to assess, ex ante, possible harms to data subjects and take measures to avoid such harms. These harms will overlap with those we have discussed in the present paper, although they may not cover all of them (the harms inflicted on researchers are an interesting exception), such that Ethics of Care approaches may be able to draw on emerging best practices from impact assessments. On the other hand, even the best-intentioned Ethics of Care approach is not immune to a danger that has been discussed with respect to the risk management approach (e.g. Shapiro 2014): that through the researcher’s, or in general the data controller’s (re-)appropriation of definitional power over what the risks are, pre-existing power differentials actually get strengthened, and the weaker parties’ rights weakened.

Learning from other disciplines is another potential benefit from engaging in the issues raised in this paper. Digital humanities is also part of a larger conversation about the structure of digital scholarship in general. Norms for publishing digital data and research artifacts are evolving throughout the research environment.

At the start of this paper we mentioned the idea of datafication, i.e. the growing importance of data to all types of human endeavour. How does such a large social transformation overlap with the ethical concerns raised by a single research project? We believe that the issues we are currently wrestling with in our own research work will only become more important to the larger world as datafication continues to grow and impact our daily lives in more and more elaborate ways. Issues of labour compensation and gender should be closely connected to larger societal conversations around rights, especially in the digital realm. Changes to the legal regimes in different countries and contexts also need further attention.

Although concerns about the datafication of digital humanities archives have been raised, the analysis so far in the digital humanities has been retrospective and has provided limited guidance for other projects. Incorporating an Ethics of Care bolsters the understanding of power and praxis in the digital humanities and contributes to the emerging discussion of ethical issues across projects.

Notes

1. We choose to capitalize the phrase “Ethics of Care” throughout this paper in order to emphasize the specific feminist philosophical tradition upon which we base our analysis.

2. “Refiguring Innovation in Games (ReFiG) is a 5 year project supported by the Social Sciences and Humanities Research Council of Canada. Composed of an international collective of scholars, community organizers and industry representatives, ReFiG is committed to promoting diversity and equity in the game industry and culture and effecting real change in a space that has been exclusionary to so many.” (“About ReFiG.” 2018.)

Acknowledgements

The authors would like to acknowledge the members of the Gamergate research group at the University of Alberta. We would also like to thank the librarians, system administrators, programmers, and maintainers without whose work this project would not have been possible. The SSHRC supported the ReFiG project in which Florence Chee and Geoffrey Rockwell participate. We thank them for providing an environment where these conversations could begin and flourish. They also supported the presentation of an earlier version of this work at the CSDH/CGSA 2016 conferences.

Competing Interests

The authors have no competing interests to declare.

Author Contributions

Authors are listed in descending order by significance of contribution.

Gutwirth, Serge, and Paul De Hert. 2006. “Privacy, Data Protection and Law Enforcement. Opacity of the Individual and Transparency of Power.” In Privacy and the Criminal Law, edited by Erik Claes, Serge Gutwirth, and Antony Duff, 61–102. Cambridge: Intersentia.

Presner, Todd. 2015. “The Ethics of the Algorithm: Close and Distant Listening to the Shoah Foundation Visual History Archive.” In Probing the Ethics of Holocaust Culture, edited by Claudio Fogu, Wulf Kansteiner, and Todd Presner, 175–202. Cambridge: Harvard University Press.