A decision of the U.S. Court of Appeals for the 10th Circuit highlights the power dynamics around rights to collect and share data. It marks an important victory for environmental activists, and should also be of interest to all those who engage in citizen science, as well as community-based environmental monitoring.

The case arose after the Wyoming legislature passed a law titled Trespassing to Unlawfully Collect Resource Data that imposed civil and criminal liability on any person who crossed over private land in order to “access adjacent or proximate land where he collects resource data.” The statutory definitions of resource data included all kinds of data gathering activity from taking notes to photographing wildlife or taking samples of soil or water.

The backstory to the legislation involved efforts by environmental activists with the Western Watersheds Project to document the impact of cattle grazing on water quality, and to push for limits on grazing on public lands. These efforts were opposed by cattle ranchers, who apparently carry enough clout to push the legislature to enact such a law. A predecessor statute in 2015, titled Trespassing to Collect Data, created civil and criminal liability for collecting data on “open lands”. After the constitutionality of the 2015 law was challenged, it was amended to prohibit crossing private land without permission in order to collect data on “adjacent or proximate land” (which might be public land). It was this amended version that was considered by the appellate court.

The issue before the Court was not whether there was a broad right to collect resource data on either public or private land. Rather, it was whether the state, by creating new civil and criminal trespass penalties for those who crossed private land without permission in order to collect data on public land, violated the free speech rights of the data collectors. The plaintiffs’ argument was essentially that although there were already penalties for trespass on private land, the statute created additional penalties for those who trespassed on private land for the purpose of collecting data on public land. Thus, the court framed the issue as “not whether trespassing is protected conduct, but whether the act of collecting resource data on public lands qualifies as protected speech.” The court noted that the prohibited acts under the law involved “collecting water samples, taking handwritten notes about habitat conditions, making an audio recording of one’s observation of vegetation, or photographing animals”, so long as location data was also included.

The Court noted that a number of federal and state environmental statutes and regulations provided for public submission of environmental data as part of assessment and decision-making processes. The plaintiffs argued that a law restricting their ability to gather environmental data inhibited their ability to participate in such processes, thus limiting their freedom of speech. The Court agreed, noting that the First Amendment extends to the “creation” of speech. The Court observed that “An individual who photographs animals or takes notes about habitat conditions is creating speech in the same manner as an individual who records a police encounter”. The Court also found that the taking of samples, though “somewhat further afield of pure speech”, was protected. In this case, the samples were characterized by the Court as “information plaintiffs need to engage in environmental advocacy”. The Court also observed that the plaintiffs used the data they collected in advocacy activities, and that this type of political engagement was at the core of the First Amendment protection.

The Court does caution that there is no general “unrestrained right to gather information”. As a result, laws that, by banning activities, incidentally prevent the gathering of information about those activities would not run afoul of the First Amendment. In this way, a general prohibition on trespass does not offend the First Amendment, even if it means that someone would be equally barred from trespassing to gather information. What was problematic here was that the laws created new penalties that specifically applied to trespass for data gathering activities.

Although the legislation in this case might seem to be an outlier product of an aggressive stakeholder lobby of government, the issues it raises have a broader significance. Control over data, access to data and even the ability to create data are all crucially important in our data-driven society. My ongoing research explores issues of ownership, control and access to data – expect to see more posts on these topics over the course of the year.

Note: the following are my speaking notes for my appearance before the Standing Committee on Transport, Infrastructure and Communities, February 14, 2017. The Committee is exploring issues relating to Infrastructure and Smart Communities. I have added hyperlinks to relevant research papers or reports.

Thank you for the opportunity to address the Standing Committee on Transport, Infrastructure and Communities on the issue of smart cities. My research on smart cities is from a law and policy perspective. I have focused on issues around data ownership and control and the related issues of transparency, accountability and privacy.

The “smart” in “smart cities” is shorthand for the generation and analysis of data from sensor-laden cities. The data and its accompanying analytics are meant to enable better decision-making around planning and resource-allocation. But the smart city does not arise in a public policy vacuum. Developing almost in parallel with so-called smart cities is the growing open government movement, which champions open data and open information as keys to greater transparency, civic engagement and innovation. My comments speak to the importance of ensuring that the development of smart cities is consistent with the goals of open government.

In the big data environment, data is a resource. Where the collection or generation of data is paid for by taxpayers, it is surely a public resource. My research has considered the location of rights of ownership and control over data in a variety of smart-cities contexts, and raises concerns over the potential loss of control over such data, particularly rights to re-use the data, whether for innovation, civic engagement or transparency purposes.

Smart cities innovation will result in the collection of massive quantities of data and these data will be analyzed to generate predictions, visualizations, and other analytics. For the purposes of this very brief presentation, I will characterize this data as having 3 potential sources: 1) newly embedded sensor technologies that become part of smart cities infrastructure; 2) already existing systems by which cities collect and process data; and 3) citizen-generated data (in other words, data that is produced by citizens as a result of their daily activities and captured by some form of portable technology).

Let me briefly provide examples of these three situations.

The first scenario involves newly embedded sensors that become part of smart cities infrastructure. Assume that a municipal transit authority contracts with a private sector company for hardware and software services for the collection and processing of real-time GPS data from public transit vehicles. Who will own the data that is generated through these services? Will it be the municipality that owns and operates the fleet of vehicles, or the company that owns the sensors and the proprietary algorithms that process the data? The answer, which will be governed by the terms of the contract between the parties, will determine whether the transit authority is able to share this data with the public as open data. This example raises the issue of the extent to which ‘data sovereignty’ should be part of any smart cities plan. In other words, should policies be in place to ensure that cities own and/or control the data which they collect in relation to their operations? To go a step further, should federal funding for smart infrastructure be tied to obligations to make non-personal data available as open data?

The second scenario is where cities take their existing data and contract with the private sector for its analysis. For example, a municipal police service provides its crime incident data to a private sector company that offers analytics services such as publicly accessible crime maps. Opting to use the pre-packaged private sector platform may have implications for the availability of the same data as open data (which in turn has implications for transparency, civic engagement and innovation). It may also result in the use of data analytics services that are not appropriately customized to the particular Canadian local, regional or national contexts.

In the third scenario, a government contracts for data that has been gathered by sensors owned by private sector companies. The data may come from GPS systems installed in cars, from smart phones or their associated apps, from fitness devices, and so on. Depending upon the terms of the contract, the municipality may not be allowed to share the data upon which it is making its planning decisions. This will have important implications for the transparency of planning processes. There are also other issues. Is the city responsible for vetting the privacy policies and practices of the app companies from which it will be purchasing its data? Is there a minimum privacy standard that governments should insist upon when contracting for data collected from individuals by private sector companies? How can we reconcile private sector and public sector data protection laws where the public sector increasingly relies upon the private sector for the collection and processing of its smart cities data? Which normative regime should prevail and in what circumstances?

Finally, I would like to touch on a different yet related issue. This involves the situation where a city that collects a large volume of data – including personal information – through its operation of smart services is approached by the private sector to share or sell that data in exchange for either money or services. This could be very tempting for cash-strapped municipalities. For example, a large volume of data about the movement and daily travel habits of urban residents is collected through smart card payment systems. Under what circumstances is it appropriate for governments to monetize this type of data?

The U.S. has cleared the way for the use of citizen science by federal government agencies and departments in a new law titled the American Innovation and Competitiveness Act (AICA) (awaiting presidential signature).

The AICA as a whole should be of interest to Canadians, as it lays out the principles for how the National Science Foundation (NSF) in the United States should approach its mandate to support scientific research. Earlier bills failed to reach acceptable compromises; some of these would have restricted types of scientific research funded by the NSF to specific sectors. This has echoes of the controversial choices in Canada under the previous government to focus on applied rather than basic scientific research. The American Innovation and Competitiveness Act has moved away from this narrow approach and sets out two main criteria for funding scientific research: intellectual merit and broader public impacts.

The AICA contains a distinct section titled the Crowdsourcing and Citizen Science Act (CCSA) which paves the way for the use by government agencies and departments of scientific research practices based upon distributed public participation. The CCSA defines citizen science as “a form of open collaboration in which individuals or organizations participate voluntarily in the scientific process in various ways.” (§402(3)(c)(1)) The level of participation can vary, and may include public participation in the development of research questions or in project design, in conducting research, in collecting, analyzing or interpreting data, in developing technologies and applications, in making discoveries and in solving problems. In its preamble, the CCSA acknowledges some of the unique benefits of crowd-sourced research, including cost-effectiveness, providing hands-on learning opportunities, and encouraging greater citizen engagement.

The CCSA specifically empowers the heads of federal science agencies to make use of crowdsourcing and citizen science to conduct research projects that will advance their missions. It enables the use of volunteers in research – something that might otherwise become entangled in red tape. The Act also directs agencies to draft appropriate policies to govern participant consent, and to address “privacy, intellectual property, data ownership, compensation, service, program and other terms of use to the participant in a clear and reasonable manner.” (§402(4))

Significantly, the CCSA also mandates that any data collected through citizen science research enabled under the legislation should be made available to the public as open data in a machine-readable format unless to do so is against the law. It also requires the agency to provide notifications to the public about the expected use of the data, any ownership issues relating to the data, and how the data will be made available to the public. (I note that these issues are addressed in my co-authored guide Managing Intellectual Property Rights in Citizen Science published by the Wilson Center Commons Lab.) The statute also encourages agencies, where possible, to make any technologies, applications or code that are developed as part of the project available to the public. This legislated commitment to open research data and open source technology is an important public policy statement.

One barrier to the use of crowdsourcing and citizen science in the government context is the fear of liability within the risk-averse culture of governments. The CCSA addresses this by providing that participants in citizen science projects enabled under the statute agree to assume all risks of participation, and to waive any claims of liability against the federal government or its agencies.

The CCSA permits federal agencies to partner with community groups, other government agencies, or the private sector for the purposes of carrying out citizen science research. After a two-year grace period, the statute also requires the filing of reports on any citizen science or crowd-sourcing projects carried out under the CCSA, and contains detailed requirements for the content of any such report.

The inclusion in this science and innovation bill of provisions that are specifically designed to facilitate and encourage the use of citizen science by governments is a significant development. It is one that should be of interest to a federal government in Canada that is attempting to carve out space for itself as open, pro-science and keen to engage citizens. Citizen science has significant potential in many fields of scientific research; it also brings with it benefits in terms of education, citizen engagement, and community development.

Municipalities are under growing pressure to become “smart”. In other words, they are expected to reap the benefits of sophisticated data analytics carried out on more and better data collected via sensors embedded throughout the urban environment. As municipalities embrace smart cities technology, a growing number of the new sensors will capture data in real time. Municipalities are also increasingly making their data open to developers and civil society alike. If municipal governments decide to make real-time data available as open data, what should an open real-time data license look like? This is a question Alexandra Diebel and I explore in a new paper just published in the Journal of e-Democracy.

Our paper looks at how ten North American public transit authorities (6 in the U.S. and 4 in Canada) currently make real-time GPS public transit data available as open data. We examine the licenses used by these authorities both for static transit data (timetables, route data) and for real-time GPS data (for example, data about where transit vehicles are along their routes in real time). Our research reveals differences in how these types of data are licensed, even when both types of data are referred to as “open” data.

There is no complete consensus on the essential characteristics of open data. Nevertheless, most definitions require that to be open, data must be: (1) made available in a reusable format; (2) prepared according to certain standards; and (3) available under an open license with minimal restrictions or conditions imposed on reuse. In our paper, we focus on the third element – open licensing. To date, most of what has been written about open licensing in general and the licensing of open data in particular, has focused on the licensing of static data. Static data sets are typically downloaded through an open data portal in a one-time operation (although static data sets may still be periodically updated). By contrast, real-time data must be accessed on an ongoing basis and often at fairly short intervals such as every few seconds.

The need to access data from a host server at frequent intervals places a greater demand on the resources of the data custodian – in this case often cash-strapped municipalities or public agencies. The frequent access required may also present security challenges, as servers may be vulnerable to distributed denial-of-service attacks. In addition, where municipal governments or their agencies have negotiated with private sector companies for the hardware and software to collect and process real-time data, the contracts with those companies may require certain terms and conditions to find their way into open licenses. Each of these factors may have implications for how real-time data is made available as open data. The greater commercial value of real-time data may also motivate some public agencies to alter how they make such data available to the public.

While our paper focuses on real-time GPS public transit data, similar issues will likely arise in a variety of other contexts where ‘open’ real-time data are at issue. We consider how real-time data is licensed, and we identify additional terms and conditions that are imposed on users of ‘open’ real-time data. While some of these terms and conditions might be explained by the particular exigencies of real-time data (such as requirements to register for the API to access the data), others are more difficult to explain. Our paper concludes with some recommendations for the development of a standard for open real-time data licensing.

This paper is part of ongoing research carried out as part of Geothink, a partnership grant project funded by the Social Sciences and Humanities Research Council of Canada.

Note: I was invited by Canada’s Information Commissioner and the Schools of Journalism and Communication, and Public Policy and Administration at Carleton University to participate in a workshop to launch Right to Know Week 2016. This was a full afternoon workshop featuring many interesting speakers and discussions. This blog post is based on my remarks at this event.

For the last 5 years or so, governments at all levels across Canada have been embracing the open government agenda. In doing so, they have expressed, in various ways, new commitments to open data, to the proactive disclosure of government information, and to new forms of citizen engagement. Given that the core goals of the open government movement are to increase government transparency and accountability in the broader public interest, these developments are positive ones.

There is a risk, however, that public commitments to open government have become a bit of a ‘feel good’ thing for governments. After all, what government doesn’t want to publicly commit to being open, transparent and accountable? As a result, it is important to look behind the rhetoric and to examine the nature of the commitments made to open government in Canada and to question how meaningful and enduring they really are.

For the most part, commitments to open government in Canada have been manifested in declarations, policy documents, and directives. These documents express government policy and provide direction to government actors and institutions. Yet they are “soft law” at best. They are not enacted through a process of legislative debate, they are not expressed in laws that would have to be formally repealed or amended in order to be altered, there are no enforcement or compliance mechanisms, and they remain subject to change at the whim of the government in power. Directives and policies, of course, can provide rapid and responsive mechanisms for operationalizing changes in government direction, and so I am not criticizing decisions to set open government in motion through these various means. But I am suggesting that a longer term commitment to open government might require some of these measures to be expressed in and supported by legislation in order to become properly entrenched.

For example, much effort has been invested by the federal government in creating an open licence to facilitate reuse of government data and information. After a slow and sometimes painful process, we now have a pretty good open government licence. It is based on the UK OGL and is very user friendly compared to earlier iterations. It is bilingual and it can be customized to be used by governments at all levels in Canada (for example, a version of this licence was just adopted by the City of Ottawa). This reduces the burden on provincial and municipal governments contemplating open government and it creates the potential for greater legal interoperability (when users combine data or information from a number of different governments in Canada).

But let us not forget why we need an open government licence in Canada. An open licence permits the public to make use of works that are protected by copyright without the need to ask permission or pay royalties, and with as few restrictions on reuse as possible. Government works in Canada – and this includes court decisions, statutes, Hansard, government reports, studies, to name just a few – are protected by copyright under section 12 of the Copyright Act. One might well ask why, instead of toiling for years to come up with the current open licence, the government has not shown its commitment to openness by abolishing Crown copyright. It’s not as radical as it might sound. In the U.S., s. 105 of the Copyright Act expressly denies protection to works of the U.S. government without any obvious negative consequences. In the U.S., these works are automatically in the public domain. This legislated, hard law solution makes the commitment real and relatively permanent. Yet as things stand in Canada, government works are protected by copyright by default, and governments choose which works to make available under the open licence and which they wish to provide under more onerous licence terms. They can also decide at some point to tear up the open licence and go back to the way things used to be. Crown copyright in its current incarnation sets the default at ‘closed’.

It is true that some aspects of open government are already part of our legislative framework. We have had freedom of information/access to information laws for decades now in Canada, and these laws enshrine the principle of the public’s right to access information in the hands of government. However, the access to information laws that we have are ‘first generation’ when it comes to open government. The federal Act is currently being reviewed by Parliament, and we might see some legislative change, though how much and how significant remains to be seen. As Mary Francoli has pointed out, there wasn’t really a need for further review – the new government had plenty of material on which to take action in proposing amendments to the Act.

The many deficiencies in the Access to Information Act have been well documented. For example, in 2015 the Information Commissioner set out 85 proposed reforms to the statute to modernize and improve it. The June 2016 Report by the Standing Committee on Access to Information, Privacy and Ethics on its Review of the Access to Information Act takes up many of these proposals in its own recommendations for extensive reforms to the Act. We are now awaiting the government’s response to this report. Rather than review the many recommendations already made, I will highlight those that relate to my broader point about enshrining open government principles in legislation.

The Access to Information Act as it currently stands is premised on a model of individuals asking for information from government, waiting patiently while government puts together the requested information, and then complaining to the Commissioner when too much information is redacted or withheld. Open government promises both information and data proactively, in reusable formats, and without significant restrictions on reuse. While proactive disclosure of information and open data cannot replace the access to information model (which is, itself, capable of considerable improvement), they will provide quicker, cheaper and more effective access in many areas. Yet the Access to Information Act does not currently contain any statement about proactive disclosure. Proactive disclosure – also referred to as “open by default” – is not really “open by default” unless the law says it is. Until then, it is just an aspirational statement and not a legal requirement. We see a proliferation of policies and directives at all levels of government that talk about proactive disclosure, but there are no firm legal commitments to this practice, or to open data. And, although I have been focussing predominantly on the federal regime, these issues are relevant across all levels of government in Canada.

A core principle of open data is that the data sets provided by governments should be made available in open, accessible and reusable formats. Proactive disclosure of information should also be in reusable formats. Access under the conventional regime is also enhanced when the information disclosed is in formats that facilitate analysis and reuse. Yet even under the existing access model, there is no default requirement to provide requested information in open, accessible and reusable formats. It is important to remember that it is not enough just to provide ‘access’ – the nature and quality of the access provided is relevant. The format in which information is provided in a digital age can create a barrier to the processing or analysis of information once accessed.

I would like, also, to venture onto territory that is not addressed in the calls for reform to access to information laws. Another challenge that I see for open data (and open information) in Canada relates to the sources of government data. I am concerned about the lack of controls over the use of taxpayer dollars to create closed data. As we move into the big data era, governments will be increasingly tempted to source their data for decision-making from private sector suppliers rather than to generate it in-house. We are seeing this already; an example is found in recent decisions of some municipal governments to source data about urban cycling patterns from cycling app companies. There will also be instances where governments contract with the private sector to install sensors to collect data, or to process it, and then pay licence fees for access to the resulting proprietary data in the hands of the private sector companies. In these cases, the terms of the licence agreements may limit public access to the data or may place significant restrictions on its reuse. This is a big issue. All the talk about open government data will not do much good if the data on which the government relies is not characterized as “government data”. It is important that governments develop transparent policies around contracts for the collection, supply or processing of data, policies that ensure that our rights as members of the public to access and reuse this data – paid for with our tax dollars – are preserved. Even better, it might be worth seeing some principle to this effect enshrined in the law.