Colorado State University, Fort Collins, Colorado, US

Russell Scarpino

Colorado State University, Fort Collins, Colorado, US

Greg Newman

Colorado State University, Fort Collins, Colorado, US

Abstract

Involving the public in scientific discovery offers opportunities for engagement, learning, participation, and action. Since its launch in 2007, the CitSci.org platform has supported hundreds of community-driven citizen science projects involving thousands of participants who have generated close to a million scientific measurements around the world. Members using CitSci.org follow their curiosities and concerns to develop, lead, or simply participate in research projects. While professional scientists are trained to make ethical determinations related to the collection of, access to, and use of information, citizen scientists and practitioners may be less aware of such issues and more likely to become involved in ethical dilemmas. In this era of big and open data, where data sharing is encouraged and open science is promoted, privacy and openness considerations can often be overlooked. Platforms that support the collection, use, and sharing of data and personal information need to consider their responsibility to protect the rights to and ownership of data, the provision of protection options for data and members, and at the same time provide options for openness. This requires critically considering both intended and unintended consequences of the use of platforms, data, and volunteer information. Here, we use our journey developing CitSci.org to argue that incorporating customization into platforms through flexible design options for project managers shifts the decision-making from top-down to bottom-up and allows project design to be more responsive to goals. To protect both people and data, we developed—and continue to improve—options that support various levels of “open” and “closed” access permissions for data and membership participation. 
These options support diverse governance styles that are responsive to data uses, traditional and indigenous knowledge sensitivities, intellectual property rights, personally identifiable information concerns, volunteer preferences, and sensitive data protections. We present a typology for citizen science openness choices, their ethical considerations, and strategies that we are actively putting into practice to expand privacy options and governance models based on the unique needs of individual projects using our platform.

Introduction

People have been collecting and interpreting observations of the natural world for millennia (Miller-Rushing et al. 2012) and have had to decide who participates, what they observe, and how to manage the resulting information in ways that respect norms, access, sharing, privacy, and ownership (Bernholz and Ormond-Parker 2018). Tens of thousands of years ago, for example, aboriginal communities in what is now northern Australia developed systems related to managing information to help them survive. These systems delegated roles and responsibilities for information management to the people who possessed the skills necessary to perform them (Bernholz and Ormond-Parker 2018). Starting around 2010, in the small aboriginal town of Wadeye, community elders, local museum staff, and scholars began collecting this ancient and modern indigenous knowledge and making it accessible via multiple digital media formats, all while encoding traditional rules of access (Bernholz and Ormond-Parker 2018). By codifying information with a metadata schema that enabled individuals to find only information to which they would traditionally have access, they enabled sharing information in ways that respected traditional norms, values, and levels of comfort (Bernholz and Ormond-Parker 2018).

Over the past few decades, technological developments have facilitated significant growth in our ability to conduct and document observations through citizen science, bringing new challenges to information management and associated privacy (Bowser et al. 2017). We are witnessing exponential growth in community-generated data, information, knowledge, and wisdom arising from citizen science (Follett and Strezov 2015). To support this growth, our team at Colorado State University has been developing and maintaining a platform, CitSci.org, since 2007 to facilitate citizen science project creation and implementation.

CitSci.org enables individuals and communities to create citizen-driven research programs to meet their needs and interests (Newman et al. 2011). The platform is unique among field-based systems in being transparent and customizable. Projects are created with a do-it-yourself (DIY) approach, and the platform supports heterogeneous data related to diverse topics. The entire research process, from asking questions through data collection and analysis, can often be managed by the very people creating projects. Project managers define what they wish to measure, document how to measure it, and build custom datasheets that project participants use to collect data in real time, online or through mobile applications, wherever they may be located.

As developers of this platform, we have witnessed people around the globe engaging in science, action, and policy based on their own interests in their communities and environment. Such engagement offers great potential benefit for both science and society through learning, participation, and action (Brossard et al. 2005; Crall et al. 2012; Frensley et al. 2017; Mathews 2014; Newman et al. 2017; Newman et al. 2012; Theobald et al. 2015). At the heart of the growing citizen science movement are deeply rooted and contextually appropriate values related to information sharing and use, as is evident from recent research (Bowser et al. 2017) and the increasing popularity of open science, open access, open source, and crowdsourcing movements. Yet, risks are also created by these new approaches and technologies as people, their actions, and their data become more visible and vulnerable (Bowser et al. 2017). Tensions arise between the value of information sharing and open access on one hand, and respect for the privacy and sensitivity of information on the other.

Citizen science platform developers find themselves caught in the middle; they must negotiate these tensions efficiently and transparently by offering what they feel are appropriate options for both projects and participants, while at the same time communicating the nuances and implications of each option and the potential consequences of alternative choices. Project participants using these platforms – including community members, educators, scientists, members of the lay public, and other stakeholders – engage in projects in many capacities and may serve various roles within them. These roles can involve setting the research agenda; articulating project governance structures; selecting protocols; collecting, analyzing, visualizing, interpreting, archiving and sharing data; informing decision makers; contributing code to applications; making instrumentation useful for projects (as in makerspaces); changing individual behaviors; and sharing results via social media, to name a few.

Given this breadth of ways in which people participate in citizen science, and the roles they can take on, our team set out on an adventure to develop CitSci.org to accommodate not only diverse questions and topics, but also multiple governance approaches and data access needs. Bernholz and Ormond-Parker (2018) describe four common values that help guide digital data use in the non-profit sector, including voluntary or permission-based participation, recognition of the private rights of individuals, a public benefit mission, and a pluralistic effort to engage diverse participants. Platform design facilitates our ability to achieve and operationalize these values as critical underpinnings of the citizen science agenda.

We operate CitSci.org based on several core underlying ethical principles: Transparency, adaptability, humility, reflection, and what could be seen as our meta-principle of inclusiveness. Our users drive our platform development both directly via our interactions, and indirectly as our team meets to discuss and prioritize next steps based both on user feedback and on our grounding in the science and theory that informs citizen science practice. Over the course of a decade of continual and iterative platform development and improvement, we observed that the ability to customize membership structure and level of information privacy provides an opportunity to respond to both the goals of a project and its ethical needs and circumstances. Despite the current flexibility of our platform, and because we are continually challenged by our users and their needs, we continue to think strategically about ways to accommodate varying needs related to data access, personal information, and privacy protection. While our goal is to be able to serve as many of the unique needs of citizen science projects as possible, there are limits to what any given platform may be able to accommodate, especially given the costs of creating a flexible platform that attempts to meet the needs of hundreds or thousands of heterogeneous projects. Our priorities remain the development of platform options that best meet the most pressing needs of most projects and maintenance of our platform’s long-term sustainability given what we know and have yet to uncover.

As was famously said in 2002 by then-Secretary of Defense Donald Rumsfeld, “… as we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns—the ones we don’t know we don’t know” (Davey Smith 2016). This quote highlights one of the challenges we face in citizen science platform development. There is a complex web of scientific, regulatory, and user/project-defined needs that must be responded to, in addition to the technical expertise and financial resources required to operationalize these needs within a platform. Users may not even be aware of some of these needs – the unknown unknowns – just as we were not aware of many of them when we began developing CitSci.org. The revelation of these unknown unknowns comes with experience. Platforms can expand the opportunity to digitally engage people who are interested in citizen science work, but who may not have the theoretical or scientific knowledge, technical expertise, time, or financial resources required to create a custom platform on their own. Indeed, citizen science should be accessible to the greatest number and diversity of projects and participants possible. Reducing barriers to access builds the social justice of citizen science opportunities and expands the question of “who can do science?” (Ottinger 2010) to include those who may be resource-limited or differently able. Platforms fundamentally determine who can participate and what they can do, providing various protections for people and information, but also raising some important ethical dilemmas.

Our support and hosting of hundreds of projects using CitSci.org around the globe has given us a high-level view of the impact of project-level governance models and possibilities, and of the diversity of privacy and openness options that projects need and want to adopt. Like other platform developers, we have had to make design decisions that either constrain or bring flexibility to the projects on our platform. Because we strive to support diverse projects in many places on many topics, we have had to wrestle with the many different norms, values, and permissions that projects using our platform require or need to be made aware of, much in the same way that the Wadeye community has adapted and accommodated protocols of information collection, access, use, and sharing over thousands of years.

Goals

While traditional academic research has a long and important history of addressing ethical concerns related to personnel and information management, some of these issues may not be at the forefront of citizen science project concerns at the time of a project’s design. In this paper, we use our decade of experience developing CitSci.org and engaging with the projects that we support to offer guidance related to project governance and privacy as seen through the lens of platform developers. We discuss the ethical challenges that we have faced during our ongoing platform design and development adventure, and our resulting thoughts on ethical platform design and use for citizen science.

This paper covers the difficult intersection of theory and practice: How do we develop a platform that moves citizen science forward and meets the underlying ethical requirements of all of our teams, and of science, while helping the greatest number of projects possible to do great science? What ethically must be protected and what must be made available? Who is in the driver’s seat? By contributing to the discourse of citizen science theory and practice, we hope to demonstrate our commitment to transparency by recognizing that our platform is not yet what we aspire it to be. We also want to acknowledge that being responsive to user needs is difficult: Even though we are aware of some of these needs given our experience in science, theory, and process, we have not always been able to meet them due to limited time and resources.

We hope that this discussion will be useful for both other platform developers and project managers to help them assess the potential positive and negative trade-offs of opening or restricting information and participation, and to make decisions about who is granted power to make choices within their projects. More specifically, our goals are to reflect on our lived experience with our platform development and collective projects to: 1) Provide a conceptual framework for making decisions related to citizen science project governance; 2) Create a typology of citizen science project openness; 3) Discuss examples of how CitSci.org is working to address these scenarios; and 4) Offer recommendations for project managers regarding actions they can take during project design and implementation to create the most rigorous and ethical projects possible. We hope to contribute to answering the overarching question, “How can citizen science practitioners balance their project’s unique aspirations and goals with contextual issues related to data governance, openness, and privacy, while acting ethically to protect information and people?”

Citizen Science Governance

Citizen science platforms are being created to host diverse project types being carried out by diverse leaders ranging from members of the lay public to highly trained researchers. Each project needs to be able to justify its membership and data governance structure, because this is the structure that operationalizes ethical decisions. This structure frames options related to how members are identified, recruited, and allowed to participate; what scientific data are collected where and when; and what data (inclusive of metadata – the data about the data) are shared. Here, we list some of the many possible governance scenarios as examples:

Scenario 1: A project may be created by a community group (formal or informal) that is concerned about a local natural resource such as water quality in a local lake. The project members may choose a leader who acts as the project manager, and local NGOs may be invited by the community to participate or assist with establishing project and data collection guidelines based on expertise that they may bring. All members may have equal access to all data.

Scenario 2: A project may be created by a local community NGO in partnership with a researcher from a governmental organization such as a state natural resources department. The community organization that created the project may choose to have all data made available to all members of the project and the public, so long as these data are not sensitive.

Scenario 3: A project may be created by a university researcher who recruits community members to participate in data collection that contributes to studying a research question related to a shared interest. The researcher may have primary access to all data collected by and about members. Alternatively, a project manager may have access to all research data, but not to the members’ personal private data unless individual members elect to share such information. The members themselves may have access to the subset of the data that they were involved in collecting, but not the full dataset, or they may be able to label a particular observation as sensitive (such as the location of an endangered species or sensitive human subjects data collected from project participants).

Given the complexity of operationalizing these choices within a platform, the governance structure decision is higher-order than the individual choices themselves. We use these three scenarios to demonstrate the variability in citizen science project goals and participation that affect how choices related to data management can be processed and made.

A Citizen Science Governance Framework

From our experience developing CitSci.org, we believe that decisions related to governance (that is, the balance of decision-making power over who can choose what information to share and when to share it) are usually best left to each project. Our platform thus offers flexible options. Here, we work through our development of approaches to Member Personal Privacy Choices, Project Membership Openness Choices, and Project Data Openness Choices based on a project’s Governance Framework. We find that choices related to project governance and the roles of different project members generally occur within the realms of people-related and information-related decisions, and along a top-down versus bottom-up continuum.

Information-related choices encompass privacy of both member personal information and scientific observation-based data, as well as access to and ownership of these data (which information is shared and how).

While member personal privacy choices can be classified as both people-related and information-related depending on how personal data are being used (Figure 1), we will discuss member personal privacy choices as a distinct class of decisions, because personal privacy choices will not normally affect project structure and objectives, and are sensitive at the individual member level rather than at the project level.

Figure 1

Citizen Science Project Governance Framework showing key decisions related to people and information that determine project membership openness, member personal privacy, and project data openness. Platform governance can present a more flexible bottom-up model (more choice) or a more rigid top-down (less choice) governance model. A platform’s governance model will determine its flexibility to accommodate projects with diverse needs by either dictating a single model or offering choices to participating projects and/or its members.

A top-down structure anchors the locus of project decision-making control in the hands of the platform developer, leaving fewer decisions to be determined by participating project managers and members. An example would be a platform that has a default global “open” setting; all projects that participate on the platform would be required to make their data available to the public, which is not appropriate for all citizen science projects and therefore would restrict participation on the platform. For a bottom-up approach, decisions about data sharing are placed in the hands of project managers, providing options and flexibility for platform users to determine how they handle their own data sharing choices.

Here we focus on three key choices that can be offered to platform users, within the context of the governance models that determine who can make them. Figure 1 portrays the three key choices that platforms can offer: 1) What member information is visible and to whom (henceforth member personal privacy); 2) Who can join a project (henceforth project membership openness); and 3) The visibility of project data (henceforth project data openness). As platforms are designed and developed, considering with whom the locus of control over these choices lies is incredibly important, because—as we have learned firsthand—the flexibility to make such choices must be programmed from the outset into the platform itself, and it can be extremely challenging to retrofit choices into a platform with existing users. Through platform design, choices can be operationalized for member personal privacy, project membership openness, and project data openness by using toggles at varying platform levels of operation, customizing the governance of each individual decision. Project structure may be more closed for some choices and more open for others, creating a hybrid model of openness. Location of these toggles determines who has control over each openness decision.
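The placement of toggles can be pictured with a minimal Python sketch. This is not the CitSci.org implementation: the `Locus` levels, the `Toggle` class, and the example settings are hypothetical names, introduced only to illustrate how the level at which a toggle sits determines who may change it.

```python
from dataclasses import dataclass
from enum import Enum


class Locus(Enum):
    """Hypothetical levels at which a toggle can be placed."""
    PLATFORM = "platform"  # top-down: fixed by the platform developer
    PROJECT = "project"    # set by the project manager
    MEMBER = "member"      # bottom-up: set by each individual member


@dataclass
class Toggle:
    """One openness decision and the level that controls it."""
    name: str
    locus: Locus
    is_open: bool

    def set_open(self, actor: Locus, is_open: bool) -> None:
        # Only an actor at the toggle's own level may change it.
        if actor is not self.locus:
            raise PermissionError(
                f"{actor.value} level cannot change '{self.name}'"
            )
        self.is_open = is_open


# A hybrid project (assumed example): open membership, closed data,
# and name visibility left to each member.
settings = [
    Toggle("project membership openness", Locus.PROJECT, is_open=True),
    Toggle("project data openness", Locus.PROJECT, is_open=False),
    Toggle("member personal privacy", Locus.MEMBER, is_open=False),
]
```

In this sketch, moving a toggle from `Locus.PLATFORM` to `Locus.PROJECT` or `Locus.MEMBER` corresponds to shifting that decision from top-down toward bottom-up governance.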

Awareness of these three key decisions is important both for platform developers, who design the underlying structure and user interface, and for platform users. In addition to goal-related needs, contextual situations that lie outside of an individual project’s control may inform or dictate choices (e.g., institutional regulatory requirements, human subjects IRB review, and sponsor/funder requirements). Evolving regulations such as the General Data Protection Regulation (GDPR), data sharing policies of US agencies, and recent social media data breaches (Bloomberg 2018; Mele 2018; Rosenberg et al. 2018) have revealed new challenges requiring improved clarity and transparency. Each project group must assess both its goals and externally driven contexts to structure the project accordingly. The more flexible the options available, the more nimble a project can be. Next, we present detailed considerations for each of the three classes of key decisions.

Member personal privacy choices

The first key choice relates to member personal privacy, demonstrating the common value of consent and permission. Decisions related to what personal information is collected from project members may be determined at the platform level, the project level, or both. Assuming that applicable laws, regulations, and policies are being followed, we suggest answering these questions during the project planning and design stages: 1) Which personal information is necessary to collect? 2) What should the default settings for sharing personal information be? 3) What personal information sharing choices should be put into the hands of the volunteer? We find the most parsimonious solutions are those that focus on collecting only the personal information necessary to ensure quality and integrity of the project and volunteer management, and on giving volunteers control over how they are identified within the project and which data are shared, with whom, and when. Classes of potential viewers of this information include the project manager, fellow project members, other registered platform users, and the digitally connected global public. Projects led by individuals without experience in these matters may overlook such questions, which can create an imbalance between project-related goals and the personal privacy needs or wishes of individual project members.

Personal data can be documented either as personally identifiable contributions (those that display contributor true full names) or as anonymous contributions (those that obscure personal identifiers through the removal of the last name or creation of an anonymous user name). Personally identifiable information may be used for volunteer management when it is necessary or desirable to know specifically who project data contributors are, or when contributors would like recognition for their contributions. For example, being personally identifiable was likely a huge boost to Hanny, the Dutch school teacher who was recognized for her discovery of Hanny’s Voorwerp while volunteering with Galaxy Zoo (Clery 2011; Lintott et al. 2009). If she had hidden her identity and been completely anonymous, she might not have been recognized and received credit. A hybrid approach would allow volunteers who are known by name within the confines of their project to be anonymized for the digitally connected global public.

We believe that defaulting to an anonymized user name for public view, while offering the opportunity to display a full name, is a best practice for personal privacy protection, reiterating a recommendation made by Bowser et al. (2017). Ultimately, the goal is to afford project members a high level of privacy protection or anonymity. There are valid arguments both for and against the collection and sharing of personally identifiable information, and each project needs to assess what is appropriate given its goals, objectives, and visibility. Many volunteers in the citizen science context, for example, seem more willing to share their personal information and are less concerned about privacy than in other contexts (Bowser et al. 2017).
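The anonymized-by-default practice, together with the hybrid approach of being named within a project but anonymous to the public, might be sketched as follows. The `Member` fields and the `display_name` function are hypothetical and assume that each member has both a full name and an anonymized user name on record; they do not describe an actual platform API.

```python
from dataclasses import dataclass


@dataclass
class Member:
    full_name: str
    username: str                          # anonymized handle
    show_full_name_publicly: bool = False  # privacy-protective default


def display_name(member: Member,
                 viewer_is_project_member: bool = False,
                 project_shows_names_to_members: bool = False) -> str:
    """Return the name a given viewer should see for this member."""
    if member.show_full_name_publicly:
        return member.full_name  # the member opted in to recognition
    if viewer_is_project_member and project_shows_names_to_members:
        return member.full_name  # hybrid: named inside the project only
    return member.username       # default: anonymized for everyone else
```

The design choice illustrated here is that the protective value is the default, so a member must actively opt in before a full name becomes visible to the public.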

Factors influencing decisions about member personal privacy also relate to the sensitivity of information contained within a project’s volunteer database, or the technologies used for data collection and the alignment of these technologies to the project mission (Bernholz and Ormond-Parker 2018). This is of great public concern given recent large data breaches and mishandling of personal data by organizations and third-party users (Bloomberg 2018; Mele 2018; Rosenberg et al. 2018), yet the public may be becoming desensitized to these risks (Vance et al. 2014). Not unlike other types of digital platform-based networks, citizen science projects may be at risk of personal data breaches or data mishandling, underscoring the importance of establishing thoughtful and proactive member personal privacy protection policies. Guidance from platforms in raising questions related to the value of consent and permission may help project managers to make appropriate choices.

Contextual requirements, regulations, laws, and policies that guide platform developers, project managers, and others responsible for structuring or managing databases of participant information also must be taken into consideration. These requirements vary around the globe, and include sponsor-driven, organizational, national, and international policies. For example, projects with members living within the European Union (EU) must comply with the General Data Protection Regulation (GDPR; Regulation (EU) 2016/679). In addition to complying with contextual policies, project managers must consider the personal information privacy policies of third-party technologies used for project management and communication (e.g., document sharing platforms, social media tools, email campaign tools, and platforms such as CitSci.org). These technologies can present additional risks of revealing volunteers’ identities, which may conflict with a citizen science project’s mission. Thorough consideration of third-party toolkits’ data sharing policies is recommended prior to adopting their use (Bernholz and Ormond-Parker 2018).

Project membership openness choices

The second key choice for platform development is project membership openness, which relates to the degree to which participation is open to all members of the public. We use three general classes of openness in creating our typology. The most open and accessible projects are “crowdsourcing” projects that allow anyone, anywhere (with access to the project and interest in participating) to participate and contribute observations or perform citizen science tasks. The most restricted projects operate on an “invitation-only” basis, with project managers targeting potential members who are desirable due to expertise, education, location, professional connection, or other criteria. Falling between these two extremes, other projects allow interested individuals to request to become members, with oversight over this decision and its criteria being in the hands of the project manager or other designated individual or group.
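The three classes of membership openness can be expressed as a small decision rule. The sketch below is illustrative only: the enum values, the `join_outcome` function, and its `invited` and `approved` flags are assumed names, not features of any particular platform.

```python
from enum import Enum


class MembershipOpenness(Enum):
    CROWDSOURCED = "crowdsourced"        # anyone may join
    BY_REQUEST = "by_request"            # anyone may ask; a manager decides
    INVITATION_ONLY = "invitation_only"  # only invited people may join


def join_outcome(openness: MembershipOpenness, *,
                 invited: bool = False,
                 approved: bool = False) -> str:
    """Return 'joined', 'pending', or 'denied' for one prospective member."""
    if openness is MembershipOpenness.CROWDSOURCED:
        return "joined"
    if openness is MembershipOpenness.BY_REQUEST:
        # The request waits until the project manager (or designee) approves.
        return "joined" if approved else "pending"
    # INVITATION_ONLY: a person without an invitation cannot join.
    return "joined" if invited else "denied"
```

For example, `join_outcome(MembershipOpenness.BY_REQUEST)` leaves a request pending until the project manager or other designated reviewer approves it.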

Project data openness choices

Choices about project data openness bear directly on data and metadata protection, privacy, access, and ownership, as well as on decision-making governance. Citizen science project managers must grapple with serious questions about which and how much of the data collected by members should be viewable, and with whom these data should (or may) be shared. This concept relates to the need to recognize individuals’ control over their data and their associational and expressive rights.

In field-based citizen science, many objects represent “what” data are being collected. These commonly include observations, locations, species, and individual measurements (including photos). Access to each of these objects can and should be considered individually and should respect various rules that dictate who can access them and for what purpose. In citizen science, there are compelling reasons to ensure that data collected by a particular individual should at the very least remain visible to that individual to “close the loop” between science and citizen scientist (Nov et al. 2011).

Data protection is about securing data against unauthorized use, whereas data privacy and access focus on who has data, who defines it, and who uses it. All protections placed on data involve those who impose the protections, and thus raise issues of data ownership (and associated licensing, where applicable). More rigid and prescriptive top-down platforms make a predetermined choice of whether the platform, and therefore participating projects’ data, are open or closed to public viewing and use, and do not allow projects to govern this choice due to platform inflexibility. More flexible platforms allow choices related to project data openness to be made by the project manager and/or participants, and possibly at multiple structural levels from the entire project to individual cells in a database. Such selections can involve opening or closing an entire project, opening up project metadata (information about the project) for public viewing while keeping scientific data closed, closing sensitive portions of a project (such as a data subset), or potentially setting project data openness at the micro scale of individual columns or cells (individual data points) within a particular data set. We have come to believe that providing options to open or close data at the level of the individual data point brings an added benefit to a project: data points that might otherwise have been left uncollected for fear of exposure may instead be collected and protected, leading to a more complete and representative data set for analysis and even reuse.
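Openness settings at multiple structural levels imply some rule for combining them. The following sketch assumes one possible rule, in which the most specific setting wins and unset levels inherit from the level above. The function and parameter names are hypothetical, and the rule itself is an assumption rather than a description of how any platform resolves such settings.

```python
from typing import Optional


def is_publicly_visible(project_open: bool,
                        subset_open: Optional[bool] = None,
                        column_open: Optional[bool] = None,
                        cell_open: Optional[bool] = None) -> bool:
    """Resolve public visibility for one data point.

    A value of None means "no setting at this level; inherit".
    """
    # Check from the most specific level (cell) to the least specific.
    for setting in (cell_open, column_open, subset_open):
        if setting is not None:
            return setting
    return project_open
```

Under this rule, an otherwise open project can hide a single sensitive data point with `is_publicly_visible(True, cell_open=False)`. A more conservative alternative would treat any closed level as decisive, which trades flexibility for a lower risk of accidental exposure.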

We have encountered requests to specify that all data at one project location be kept private while other data within the project remain fully open. Species observations often require special attention given various regulations and laws such as the US Endangered Species Act of 1973, which legislates protection of species and, by extension, mandates protections for the whereabouts of these species of concern. For individual measurements such as photos and attribute types, there are circumstances where some measurements may need to be open access, while others need to be accessible only to project members, and still others accessible only to those contributing them. Human subjects information, including personal identifiers such as name and data about the individuals who are subjects of research projects (as opposed to project members), may be especially sensitive in cases of health-related or participatory citizen science projects (Kounadi and Resch 2018). The combination of project membership openness and project data openness in citizen science drives the determination of how open or closed a project is to the public. At CitSci.org we recognize that the requirements for each project will be different according to its goals and context, and recommend that projects select or build platforms that will accommodate their needs.
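The three audiences described for individual measurements (open access, project members only, and contributor only) can be sketched as an access check. The `Audience` enum and `can_view` function are hypothetical. The sketch also assumes that contributors can always see their own data, in keeping with the aim of closing the loop between science and citizen scientist.

```python
from enum import Enum


class Audience(Enum):
    PUBLIC = "public"            # open access
    MEMBERS = "members"          # project members only
    CONTRIBUTOR = "contributor"  # only the person who contributed it


def can_view(audience: Audience, *, viewer: str, contributor: str,
             viewer_is_member: bool) -> bool:
    """Decide whether a viewer may see one measurement."""
    if viewer == contributor:
        return True  # contributors always see their own data
    if audience is Audience.PUBLIC:
        return True
    if audience is Audience.MEMBERS:
        return viewer_is_member
    return False  # contributor-only, and the viewer is someone else
```

A measurement marked `Audience.CONTRIBUTOR`, such as the location of a species of concern, would then be hidden even from fellow project members.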

A Typology of Citizen Science Project Openness

Here, we build on our discussions of citizen science participation and privacy to present a citizen science openness framework, a typology of project membership openness and project data openness for citizen science platforms and projects. We introduced the term “openness” to consider these two characteristics of projects as occurring along a gradient, and suggest that citizen science project managers may want to evaluate where their project(s) fall along the membership openness and data openness gradients, and where their optimal placement would be based on project objectives, sensitivity, and other criteria. Choices related to membership and data openness will help to guide these decisions. These two dimensions of project governance form part of our core CitSci.org platform structure and functionality, a structure that accommodates projects from the most closed project type (both closed data and closed membership) to the most open (open data and open membership – the crowdsourced format). Each of the dimensions also has intermediate levels of openness, nuances we continue to be pressed to work toward in future releases of CitSci.org.

We created this typology as a conceptual representation of the requests by different projects to allow different levels of openness. Open membership and open data are common terms in contemporary scientific inquiry, including citizen science. However, we argue that due to the diversity of citizen science project objectives and inherent sensitivities or project structures, the goal should not always be to become more open, despite recent trends leading in this direction. Projects may choose to keep data private or to keep data open for many reasons, and these rationales will be project-specific. We emphasize that the typology does not impose judgment on projects for where they may lie along these gradients. The important thing is that each project be designed to best meet its own needs. Issues may arise when a project that needs to be placed in the “closed” realm of the typology, due to sensitive data or other contextual reasons, is created on a platform that requires data to be “open,” or when there is a similar mismatch between project-specific needs and available options or capacity. We encourage project managers to consider this typology when assessing their project needs so that they can either develop their own appropriate custom tools or platform, or find a platform that is flexible enough to accommodate their needs.

Typology descriptions

The typology of citizen science openness is portrayed as a grid of “Project Membership Openness” on an x-axis vs. “Project Data Openness” on a y-axis (Figure 2). Each cell presents a unique combination of membership and data openness, labeled with a letter code that designates a specific combination of “Open,” “Partially-open,” or “Closed” membership and data openness, leading to nine unique combinations. As project managers design and develop their projects, they will likely identify with one of these descriptive combinations as being most appropriate for their particular project. Having the ability to customize the openness of their membership and data allows them to develop their project to best meet their needs and goals.

Figure 2

Typology of citizen science project openness is determined by a combination of Project Membership Openness on the x-axis and Project Data Openness on the y-axis, creating a 3 × 3 grid of blocks, each with their own openness classification. “C” represents “closed,” “P” represents “partially open,” and “O” represents “open” status for both membership and data. Each cell is a unique combination of these three classifications for the two axes. The most open projects would be classified as “Open, Open,” or “O-O” and the most closed as “Closed, Closed,” or “C-C.”
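
The grid described above can be expressed as a small data structure. The following is a minimal sketch, not CitSci.org code: the names `Openness`, `ProjectOpenness`, and `all_blocks` are hypothetical and chosen only for illustration. It assumes the letter code lists membership first and data second, as in the block labels used in this paper (for example, “O-P” for Open Membership, Partially-open Data).

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Openness(Enum):
    """The three levels used on both axes of the typology."""
    CLOSED = "C"
    PARTIALLY_OPEN = "P"
    OPEN = "O"


@dataclass(frozen=True)
class ProjectOpenness:
    """One cell of the grid: a membership level paired with a data level."""
    membership: Openness
    data: Openness

    @property
    def code(self) -> str:
        # Membership letter first, then data letter, e.g. "O-P".
        return f"{self.membership.value}-{self.data.value}"


def all_blocks() -> list[str]:
    """Enumerate the letter codes for every cell of the 3 x 3 grid."""
    return [ProjectOpenness(m, d).code for m in Openness for d in Openness]


if __name__ == "__main__":
    print(all_blocks())
```

Enumerating both axes yields the nine combinations of the typology, from “C-C” to “O-O”.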

To operationalize a diverse array of project openness capabilities, we built on our conceptualizations of user governance to place toggle switches (currently termed “privacy” for project data openness, and “membership” for project membership openness) for project managers to make openness choices (Figure 3). We developed a tooltip (an information button that displays additional help when hovered over or tapped) so that when project managers are faced with a project structure decision, they can learn about each choice and consider for themselves which option would be best for their particular needs.

Our project membership openness options include the full breadth from Closed Projects (Invitation-Only), currently implemented using the member-based selection combined with an “Invite Members” tool; Member-Based Projects that sit in an intermediate realm and require project manager approval to join; and Open Projects that follow a crowdsourcing model and which are our newest project type.

CitSci.org’s project data openness choices are more complex, and full choice selection is still in development. Our most open privacy setting is the Public Project setting, which makes project data accessible for viewing, querying, and other platform-based exploration by anyone, including members of the lay public and those not registered on CitSci.org, while restricting data downloads and formal data use to those who are registered CitSci.org users. Our intermediate openness setting for data privacy is the Private Project setting, which allows viewing, querying, downloading, and other access options for project members only, hiding data from all non-project CitSci.org members and the public. We are currently developing a Fully Restricted Project data privacy setting, which will limit data access to only those project members who collected the data and project managers or scientists who are leading the project, while excluding other project members and the public from access. Although this choice may seem uncommon in environmental field-based scenarios, it may be necessary for sensitive data such as health reports or personal information that could compromise the privacy, safety, or security of research subjects. These protections will be necessary if projects are to go through the human subjects institutional review board process, as anonymity is a foundation of human subject protection.
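
To make the three data settings concrete, the access rules described in this paragraph might be sketched as follows. This is an illustration under stated assumptions and does not reflect the platform's actual implementation: `DataSetting`, `Viewer`, `can_view`, and `can_download` are hypothetical names, and the sketch assumes that downloads under the Fully Restricted setting follow the same rule as viewing, which the text does not specify.

```python
from dataclasses import dataclass
from enum import Enum


class DataSetting(Enum):
    PUBLIC = "public"
    PRIVATE = "private"
    FULLY_RESTRICTED = "fully_restricted"  # described as still in development


@dataclass(frozen=True)
class Viewer:
    registered: bool = False       # has a platform account
    project_member: bool = False   # belongs to the project
    project_manager: bool = False  # manages or scientifically leads the project
    is_collector: bool = False     # collected the record being requested


def can_view(setting: DataSetting, v: Viewer) -> bool:
    """Viewing and querying rules for each data setting."""
    if setting is DataSetting.PUBLIC:
        return True  # anyone, including unregistered visitors
    if setting is DataSetting.PRIVATE:
        return v.project_member or v.project_manager
    # Fully restricted: only project leads and the member who collected the data.
    return v.project_manager or (v.project_member and v.is_collector)


def can_download(setting: DataSetting, v: Viewer) -> bool:
    """Download rules; public data still require a registered account."""
    if setting is DataSetting.PUBLIC:
        return v.registered
    # Assumption: for non-public settings, download follows the viewing rule.
    return can_view(setting, v)
```

Under this sketch an unregistered visitor can browse a Public Project but cannot download from it, while a registered user who is not a project member sees nothing of a Private Project.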

Any combination of these project membership and data openness choices can be applied to customize projects to be more closed where privacy is required, or more open where visibility, access, sharing, and broad collaboration are desired. Thus, rather than requiring blanket selections for our entire platform, we made it possible to mix and match open and closed settings as necessary. This approach has been critical to enable CitSci.org to meet the needs of our evolving client user base and their diverse project needs and specifications. By putting the selection of membership and data openness into the hands of our users, we are both creating a platform that meets the needs of a diverse project base and giving governance over those decisions to our users.

In addition to these existing settings, we are currently developing tools to expand our sub-choices within the realm of partially open by providing “open-close” toggles at different levels of the platform and at different levels of data representation. For example, we are developing an option that will allow project managers to specify whether access to all data submitted using a specific datasheet is to be open, closed, or of intermediate openness. This will give project managers the ability to close some data sets while leaving access to others open.

We are also developing an option to select the openness of access to data submitted for specific species. This option will allow project managers to protect the data related to sensitive species specifically, hiding all data related to the species so that the data cannot be exploited and the species potentially harmed by revealing observed locations. Finally, we are developing options to allow both project managers and members to choose whether specific observations (a single column in a database, e.g., all observations of water temperature), specific locations (a single row in a database, e.g., all observations made at a specific study site), or individual measurements for an observation (a single cell in a database, e.g., a single measurement of water temperature) should be open, closed, or partially open. A few of these existing and envisioned settings are shown in Figure 4, which illustrates global project openness choices as well as future capabilities to devolve data openness decisions to individual project participants. Devolving the choice in this way may be desirable when revealing a location publicly would disclose sensitive information, such as the whereabouts of a threatened or endangered species or a private residence.
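
One way to reason about openness settings at several structural levels is to combine them into a single effective setting for each data point. The sketch below is hypothetical and rests on an assumption the text does not state: when several levels apply to the same data point (project, datasheet, species, column, row, or cell), the most restrictive setting wins. The names `Access` and `effective_access` are illustrative only.

```python
from enum import IntEnum
from typing import Optional


class Access(IntEnum):
    # Ordered so that a smaller value is more restrictive.
    CLOSED = 0
    PARTIAL = 1
    OPEN = 2


def effective_access(project: Access, *overrides: Optional[Access]) -> Access:
    """Combine the project-wide setting with optional finer-grained settings.

    Each override stands for one level (datasheet, species, column, row, cell);
    None means that no setting was chosen at that level.
    """
    levels = [project] + [o for o in overrides if o is not None]
    # Assumption: the most restrictive applicable setting determines access.
    return min(levels)


if __name__ == "__main__":
    # An open project with one sensitive species closed by the project manager.
    print(effective_access(Access.OPEN, None, Access.CLOSED).name)
```

With this rule, a participant could close a single observation inside an otherwise open project, but could not open data that a project manager has closed project-wide. A platform might reasonably choose a different rule; the point is only that the rule for layering choices needs to be explicit.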

Figure 4

Existing and envisioned platform capabilities as seen on the Stream Tracker Project Profile page as an example of operationalizing decisions on membership and information privacy choices. This is a fully open project, as seen by the “Open” icon and open padlock “Public” icon to the right of the project summary statistics. The Stream Tracker project manager chose Open Project Membership and Open Project Data for maximum participation and usability of project data. This project’s decisions were operationalized via an existing series of two toggle switches presented to the project manager during project creation. Also portrayed are future envisioned features (open and closed padlocks along with “Request to see” links denoted by a location marker icon) shown for two observation locations entitled “Plot 2” and “Plot 3” that we plan on implementing in the future. These envisioned features will allow project managers and citizen scientists to choose accessibility of specific observations at specific locations within the project. For example, the project manager for this project may set a project-wide setting indicating that all observations are to be made publicly available (again, as evident by the open padlock to the right of the project statistics at the top of the profile), but in this case we can see two observations that have been selected to be kept private by those making the two observations. We also can see that one observation is visible, or open, to the person logged in (User B; see top right) because they made this observation (Plot 3), but is private to others. A second observation (Plot 2) that another user made can also be seen; however, this location was kept private such that the specific location coordinates are not viewable by User B. If restrictions on visibility were actually put into practice for this project (they have not been), then this project’s Project Data Openness would become “Partially open,” and the project would move from an “O-O” position in the openness typology to “O-P.”

Our ultimate goal at CitSci.org is to make citizen science—good and well-thought-out citizen science—accessible to the greatest number and the greatest diversity of potential citizen scientists and their projects. We want people to be inspired to take part in science, and to have tools and guidance available that will help them to make progress toward their vision. By creating a platform that engages the greatest diversity of people and projects possible in both the use of the platform and in its user-driven design, we as hosts support the fourth of Bernholz and Ormond-Parker’s (2018) common principles for digital data use, pluralism.

CitSci.org hosts hundreds of projects. Here we present four projects that fall at different locations on the Citizen Science Project Openness Typology (Figure 2) to illustrate the choices they made, the rationale for these choices, and how the projects used CitSci.org to operationalize them. These projects serve as demonstrative examples of our project openness typology, and their teams have agreed to share them for the benefit of the citizen science community.

Stream Tracker

Stream Tracker is a citizen science project funded by the National Aeronautics and Space Administration’s (NASA) Citizen Science for Earth Systems Program. Stream Tracker studies intermittent streams—i.e., streams that do not flow all the time. Such streams are important for forecasting water supplies, mapping critical aquatic habitat, and understanding how streamflow conditions change over time. Stream Tracker uses the CitSci.org platform to collect data on these previously overlooked streams by crowdsourcing volunteer observations of when and where intermittent streams are flowing.

This project has chosen an open, crowdsourcing-style membership and open data structure, thus placing it in the “O-O” (Open Membership, Open Data) block in Figure 2. These program structure openness choices were made by the project science team, including the lead principal investigator and the volunteer project manager at the proposal stage, and were accommodated by our new crowdsourcing capability. Membership and data are open to all who are interested. For this project, open membership and public access to open data are critical given the goals, objectives, lack of privacy concerns, and context of the project. This format has allowed the project to grow rapidly from its original focus on a single watershed in Colorado to 29 states—growth that was not anticipated when the project was originally conceived to advance understanding of intermittent stream flow within a single watershed.

Off the Roof

The Off the Roof project developers came to CitSci.org seeking support to help organize and centralize data submitted by volunteers pertaining to the quality of water collected from their roofs using rain barrel runoff collection systems. Increasing demands on diminishing water supplies and the movement for urban areas to use more local water supplies has intensified interest in alternative water sources. However, lack of data on potential human health risks of these alternative water sources and treatment required to meet water quality standards has impeded use of these sources for both potable and non-potable applications. Data on pathogens in roof runoff is limited due to the need for a rigorous sampling campaign encompassing a large number of roofs in multiple regions and the complexities associated with measuring human pathogens. Our scientists facilitated the team’s project design development, which enabled them to have volunteers gather data on the circumstances surrounding water collection events, while also allowing the project team to append data to these observations once pathogen analysis results were received from laboratories. This ability to add information to observations after laboratory analysis is critical to the project.

In this example, decisions about who can participate and why were made by the project research team and were guided by the unique research questions related to how different roof materials might affect pathogens in collected water. In this case, a broad recruitment strategy was used to identify anyone in target cities who may be interested in participating, but the research team then used filters and criteria based on reported roof types and proximity to target city universities to select members. For membership, this project chose the closed member-based structure because they needed to carefully vet participants based on the project’s strict criteria. The project also chose to make all project data private and accessible only to project members, thus placing it in Typology Block “C-C” (Closed Membership, Closed Data) in Figure 2. The Off the Roof team chose to create separate projects for each of their target cities to further restrict member data access only to the city project to which they contribute.

This project encountered a privacy-related situation because data were being collected at individual member households. The project team initially decided to create predefined locations using the addresses of participants as data collection location names, unintentionally disclosing individual member household locations to other project members. When the project design team later discovered that project members did not wish their home addresses to be disclosed (see related discussions and examples in Bowser et al. 2017), they decided to code their household names (e.g., “Household 1”) while holding the address private. They also reduced the precision of the latitude/longitude coordinates to avoid disclosing actual household locations via location coordinates and mapped points. Instead, they decided to keep an offline key of the exact coordinates of the household location for the project manager and data analysis. Finally, when sharing project results with members, the team has chosen to share summary statistics by city only to keep individual household location data private. Our experiences with this project have motivated the development of new features that will allow individual project members to choose whether they wish their individual default locations (possibly household) to be shared with other project members—a choice that will be able to be layered on top of the choice made by project managers regarding openness of project data project-wide.
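
The masking steps described above (coded household names, reduced coordinate precision, and an offline key) can be sketched in a few lines. This is an illustrative sketch only: the function name `mask_households`, the record fields, and the choice of rounding to one decimal place are assumptions, since the text does not say how the Off the Roof team reduced precision. The coordinates in the usage example are made up.

```python
def mask_households(records, decimals=1):
    """Split household records into a shareable view and a private key.

    records: list of dicts with 'address', 'lat', and 'lon' keys.
    Returns (shared, offline_key), where shared holds coded names and
    reduced-precision coordinates, and offline_key maps each coded name
    back to the exact address and coordinates for the project manager.
    """
    shared, offline_key = [], {}
    for i, rec in enumerate(records, start=1):
        label = f"Household {i}"  # coded name replaces the street address
        shared.append({
            "name": label,
            # Rounding is one simple way to reduce coordinate precision.
            "lat": round(rec["lat"], decimals),
            "lon": round(rec["lon"], decimals),
        })
        # Exact values are kept only in the key, to be stored off the platform.
        offline_key[label] = {
            "address": rec["address"],
            "lat": rec["lat"],
            "lon": rec["lon"],
        }
    return shared, offline_key


if __name__ == "__main__":
    example = [{"address": "123 Example St", "lat": 40.5853, "lon": -105.0844}]
    shared, key = mask_households(example)
    print(shared)
```

One decimal place of latitude corresponds to roughly 11 km, so the appropriate precision depends on how densely households are spaced; in a sparsely populated area even coarse coordinates may identify a home.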

Mountain Goat Molt Project

The Mountain Goat Molt Project is aimed at studying the effects of climate change on the phenology (timing) of mountain goat winter coat molt (shedding). This project encourages people to submit photos of mountain goats (a cold-adapted, alpine species) and to report on the degree to which goats have shed their coats to help scientists study the effects of climate warming on coat molt phenology. Membership is set to be open, and data are publicly accessible. This project falls within Block “O-P” (Open Membership, Partially-open Data) of our Openness Typology (Figure 2). However, the project would benefit if features were available that allow individual photographers to preserve the copyright of their personal photographs of goats, while at the same time placing additional attribute data related to an individual photograph (such as the degree to which the coat has been shed) under a more open-access Creative Commons license. If this feature existed, it would allow professional photographers to participate in this project without fear of photo copyright violations. Thus, the copyrights to photographs submitted would remain with the contributor of the photographs, rather than being transferred to the platform or the project. Other attribute data such as coat molt estimates would remain open data usable by others, copyright-free. We plan to develop several of these options for platform-wide availability in 2019, which will allow us to better support partially-open project data structures, as well as make improvements to CitSci.org broadly to better support the needs of this unique project. This will include an integration with the Zooniverse platform for more streamlined image classification in parallel with image submission and associated data entry.

Front Range Pika Project

The Front Range Pika Project (FRPP) is designed to collect data about the American pika (Ochotona princeps) across the Front Range of Colorado. The project was created and is managed through a collaboration between Rocky Mountain Wild and Denver Zoo, with assistance from pika researchers at the University of Colorado, the Natural Resource Ecology Laboratory at Colorado State University, Colorado Parks and Wildlife, and Rocky Mountain Biological Laboratory. The FRPP partners are working with other regional citizen science projects to collect consistent, rigorous, and usable data on pika across Colorado. This project chose semi-open membership where anyone can request to join but must be approved by project managers. Approval is based on attending a required training program that consists of an in-class training session and a field training session to ensure data quality that meets rigorous scientific standards. This project closely aligns with Block “P-O” (Partially-open Membership, Open Data) (Figure 2). By choosing to have data be fully publicly accessible, the project has benefitted by attracting greater collaboration than was initially anticipated. Once started, other similar organizations took note and replicated key aspects of project openness (e.g., the Cascades Pika Watch Project) and protocol (e.g., PikaNet—a project led by the Mountain Studies Institute). This shows the power of open access data and open science as related to sharing not only data, but also governance choices and data collection protocols, in an open platform context.

Recommended Key Questions

Given the complexities associated with choices related to project membership and data openness, it can seem daunting for project managers to design and implement projects. Here we provide a framework to aid in the design and use of platforms to best support project needs related to governance and openness, and associated questions to ask early in project development, reiterating that the moral benefit of one choice versus another is not being promoted or challenged here. Rather, each project must make decisions based on its project-specific needs. Note also that situations change, so initial decisions may need to be modified based on changes to the answers, possibly necessitating platform structural changes. Choosing or developing a platform that will allow for customization is important, if platform default settings are not appropriate for the project.

We summarize our paper with four key questions, and associated sub-questions, which all citizen science projects should ask when setting up a project and choosing a platform. The same questions may be asked from a developer’s perspective when envisioning the user audience’s potential needs (Figure 5).

Figure 5

Project openness and governance decision trees, detailing the four key questions that platform developers and platform users need to ask to ensure that platform structures meet the needs of the project and why. Within each decision the options from left to right indicate increasing openness, devolution of governance, or potential fineness of the level of the decision. “C” denotes “Closed” projects, “P” denotes “Partially open” projects, and “O” denotes “Open” projects.

Project Membership Openness: Who can join the project and why? The answer to this decision will depend on both platform structure and project needs. Some platforms default to either open or closed membership, and project needs must be considered when selecting or designing an online platform for the project. When a project is looking for as many participants as possible with no restrictions, an open platform is desired. When there are criteria of location or participant qualifications, then the ability to partially or fully close membership is more appropriate.

Project Data Openness: Who can see or use project data? The answer to this decision will depend on some sub-questions, such as:

Will members be collecting data about sensitive species?

Will they be collecting data related to human subjects?

Will members be collecting data on private lands?

Would public sharing of elements of the data have the potential to put members or species at risk?

Does the project have permission to share the data with the public?

Are there appropriate and adequate member personal privacy policies in place?

Will data be shared as individual observations or data points, or can data be shared in an aggregate format, such as a density map or in reports containing only summary statistics as results?

When sensitive data are anticipated, project managers will want to ensure that their platform will accommodate the needs of their project to hide data from the view of the public. If the project’s or sponsor’s aim is to contribute open and accessible data to global networks for broad use and re-use, then it will be important to structure databases to be maximally open, while working with the sponsor and platform to protect data too sensitive to be revealed in raw form.

Project Governance: Who can make project openness decisions? The answer to this decision will depend upon the structure of the project’s platform. Currently, many platforms have a default governance structure that dictates who can make choices related to Questions 1 and 2 (membership openness and data openness), so that the platform makes the decision, or only the project manager can make the decision. If there are research sponsor requirements, project manager concerns, or even potential project member concerns over who governs project openness choices, the platform needs to be either custom designed or flexible enough to accommodate these choices.

Project Data Sharing Decision Levels: At what structural level within the platform should the decision be made? The answer to this decision may be hardwired into a platform’s structure so that all data are open or all data are closed, with no option to make choices at different project structural levels. Alternatively, as in the case of CitSci.org, project managers may be able to assess their needs based upon whether they anticipate having no sensitive data, sensitive data occurring throughout the project’s structure and observations, or the potential for unknown sensitive observations to be recorded.

Our general recommendations are to structure projects to be conservative with the sharing of information pertaining to project participants (see Bowser et al. 2017), and to allow the needs and goals of the project to guide decisions related to project membership and data openness that make the most sense to the scientific endeavor. Putting the choice in the hands of individual participants allows the locus of decision making to sit with the individual. Providing information on the benefits and drawbacks of individual choices that are offered allows participants to make informed choices based on their comfort levels. These collective choices and actions contribute to the ethical conduct of science by trained and untrained citizen scientists alike.

Conclusions

We are in an era of growing citizen science data generation. It is important that platform designs build on traditional wisdom and information curation systems like that of the Wadeye, with structures in place to meet the needs of twenty-first century science and twenty-first century technology. CitSci.org has worked to operationalize governance and openness by incorporating options into the platform that allow project managers to customize their project governance and protections according to the needs of their project, volunteers, study subjects, and greater contexts. We have done this in an adaptive and iterative fashion, developing CitSci.org as we face new requests from users, and as our own knowledge and engagement in the theory and practice of citizen science, as well as our concern for meeting the needs of our users and identifying what those needs actually are, continues to grow. Our work to meet our own high standards continues, and will continue as we are faced with new capabilities, new platform and project scenarios, and developing needs of our new and existing CitSci.org users. We are grateful to all who continue to push boundaries to make us all better.

Citizen science project managers are change makers, regardless of the top-down or bottom-up governance decisions that they make for their projects. If decisions are made thoughtfully, with full consideration of a project’s unique goals and needs, then projects will be poised to have greater impact in the world. It is this full consideration of a project’s unique goals and needs that ultimately will lead to the most ethical process and the greatest success and impact by harnessing the power of citizen scientists who want to participate in science and contribute to change.

Acknowledgements

This material is based in part upon work supported by the National Science Foundation under Grant Nos. 1339707, 1550463, and 1817612. The CitSci.org team would like to thank Colorado State University and the Natural Resource Ecology Laboratory for supporting CitSci.org. Our users are our greatest contributors to platform design and to our learning about on-the-ground technical needs of citizen science projects. Dani Lin Hunter, Ellen Eisenbeis, and Danielle Backman conducted and analyzed user surveys, and Alycia Crall and Jim Graham contributed to early CitSci.org development. Two anonymous reviewers and journal editors Rick Bonney and Lisa Rasmussen provided constructive feedback that led to significant improvements to this manuscript. To all we are grateful.

Kounadi, O and Resch, B. 2018. A Geoprivacy by Design Guideline for Research Campaigns that Use Participatory Sensing Data. Journal of Empirical Research on Human Research Ethics, 13: 203–222. DOI: https://doi.org/10.1177/1556264618759877