Do I need a lawyer to hack in DC?

New city data catalogs are coming online all the time, but what many civic hackers may not have noticed is that we’re often agreeing to complex legal conditions every time we make use of the datasets in these catalogs.

As a result of these restrictions, DC’s open data isn’t truly open, but rather open only as long as the government approves of its use. The specter of a lawsuit hanging over the heads of civic hackers has a chilling effect on the creation of projects to benefit the public, even though they make use of public data released for that express purpose. How does this happen? Through terms of service, terms of use, and copyright law.

Prohibiting a lawsuit

DC’s new data catalog at opendata.dc.gov, which launched in August, requires civic hackers to agree to the District's three-page terms and conditions of use before using any public data. The terms include this wide-reaching provision:

“you will not [u]se the Site to . . . encourage others to engage in any conduct which would constitute a criminal offense or give rise to civil liability”

I had to check with lawyers about what “give rise to civil liability” means. It is vague, I’m told. So vague that anything might be considered to give rise to civil liability, but a lawsuit against the District is probably what was intended. That means if the data shows us evidence of legal wrongdoing or injustice, we’ve essentially agreed not to take legal action against the District or any other party and, further, not to “encourage” anyone else to take legal action (also vague). The terms of use are so vague that this example is just one of many ways this clause might be used to stop a civic hack the District doesn’t like.

The new catalog also requires data users to give attribution to the District for some data sourced from the catalog (the requirement appears in the licensing info box for many datasets), and the terms include an indemnification clause, which could make a lawsuit even more costly for civic hackers.

The mayor’s Transparency, Open Government and Open Data Directive issued in July even proposed other terms, such as a requirement to “describe any modifications made to the public dataset.” Those terms don’t appear to have made the final cut but could still be added in the future.

Terminating our projects

The new catalog will eventually replace the seven-year-old catalog at data.dc.gov, but not all of the data has been moved over to the new catalog yet. If we use any of the District’s live feeds, which include data like crime incidents or 311 requests, we are required to agree to that catalog’s terms of use agreement, which states that the District can:

“require the termination of any and all displaying, distributing or otherwise using any or all of the RSS feeds for any reason.”

That sounds like DC gets a veto over civic apps using those feeds and can require us to shut them down.

This catalog’s terms also require data users to notify the District about how the data is being used, to attribute data to the District, and to show a disclaimer of the District’s liability on web sites built with the data. Fortunately these terms, except the attribution requirement, have been dropped from the new catalog’s terms.

And copyright infringement?

Somehow in all of this legal language the District neglected to grant anyone actual permission to use any of the data. You might think that the default is open, but it isn’t. The District probably holds a copyright in at least some aspect of these datasets. The District used to claim copyright over the law itself, but now explicitly dedicates the law to the public domain so that there is no confusion. Unfortunately, this confusion remains with respect to all of the data in DC’s data catalogs, and without explicit permission to use the data civic hackers are subject to yet another risk.

(Facts aren’t protected by copyright law, thankfully, but that might not stop the District from filing a lawsuit alleging infringement over the non-factual aspects of a dataset, the so-called “selection and arrangement” of those facts.)

This isn’t open

Giving up the right to take legal action and being required to follow extremely vague rules in order to use public data are not hallmarks of an open society. These terms are a threat that there will be a lawsuit, or even criminal prosecution, if civic hackers build apps that the District doesn’t approve of.

It has been a long-standing tenet that open government data must be license-free in order to truly be open to use by the public. If there are capricious rules around its reuse, it’s not open government data. Period. Code for DC noted this specifically in our comments to the mayor last year. Data subject to terms of use isn’t open.

The Mayor should update his order to direct that the city’s “open data” be made available a) without restriction and b) with an explicit dedication to the public domain.

Further reading

A review of municipal data catalogs from 2010 to 2013 by Philip Ashlock and Rebecca Williams found that terms like these are unfortunately common. For example, Seattle’s terms have the same right-to-require-termination clause as DC’s old catalog. The intention behind these terms may have been benign, and was probably related to preventing government computers from being overloaded, but vague terms today can become specific lawsuits tomorrow, and they certainly create a chilling effect on innovators.
