DC opens its “code”, embracing principles of open laws

This morning DC’s legal code went online as open data. I’ve worked with government before on open data, but never have I worked with a government body that moved so deftly through the technical, policy, and legal issues as the DC Council’s Office of the General Counsel. So, before anything else, thanks to the general counsel V. David Zvenyach and his staff for their time and expertise on this.

The TL;DR version goes like this:

Tom MacWright wanted to build his own version of the DC Code website. The DC Council couldn’t share its electronic copy of the Code because it contained intellectual property owned by West. This became a small and very geeky controversy (spurred by Carl Malamud). But Zvenyach, the general counsel, recognized the value of making the law open and did it. He removed the West IP from the Council’s electronic copy of the Code (I helped), posted the file on the Council’s website, and even included a CC0 public domain dedication.

The last bit all happened within a matter of days, and it was one of the easiest open data success stories I’ve been a part of. Tom recapped the events here and began hacking the code immediately. He held a hackathon on April 14 which he wrote about here (and Eric Mill wrote about here).

DC is setting an example for other jurisdictions. In terms of the 10 Principles of Law.Gov, DC’s bulk law download — achieved within only a few days of work — satisfies principles of no-charge to access (1), no copyright or terms of use (2), data in bulk (3), and, to some extent, machine processability (8).

Here’s the longer version:

This all began a few months ago when DC-based civic hacker Tom MacWright took an interest in making local law more accessible. Intending to import the DC Code into Waldo Jaquith’s State Decoded project, he ran into a small problem: he couldn’t get a complete copy of the law. Intellectual property issues prevented the DC Council from simply emailing over their copy of the Code.

Many states, like the District, contract out the codification and code-publishing work to a third party like West (part of the Canadian-owned Thomson Reuters) or Lexis (owned by the Amsterdam-based Reed Elsevier). DC had previously contracted out to West, and last year switched to Lexis. Neither likes to share. DC’s official website to read the Code, which has been run by West, is free to the public, but copying any part of the Code off of that website might violate West’s copyright or terms of service, or both. Sharing the law might have been illegal.

In the case here in DC, the DC Council had Word documents containing the Code, given to them by their contractor West, but the documents contained West’s logo. The DC Council could not share the documents with West’s logo intact. And it wasn’t easy to take those logos out (more on that later). Informally speaking, West owned the DC Code.

I had met Zvenyach, the general counsel, before. He is very technologically savvy and has been trying to modernize the office he took over only a few years ago. We had even talked about holding a hackathon to help him do it. (As a DC resident, I’m also interested in DC law.) But his office, like all of government, is bound by limited resources and much work to do. When Tom brought the issue onto Zvenyach’s radar, I don’t believe there was any point at which Zvenyach didn’t want to make the files available. It was, as far as I’ve observed, merely a matter of time and resources.

Tom wrote more about the intellectual property issues here and here. Coincidentally, on Monday Ed Walters of Fastcase gave a great talk on the issue of who owns the law at Reinvent the Law. I highly recommend watching it. He’s also written extensively about it.

Tom asked Carl Malamud to get involved. Carl has been working on this issue in other states, like in Oregon, where the State of Oregon itself claimed copyright over their laws. Carl bought (for quite some money) a physical copy of the DC Code, digitized it, and mailed thumb drives in the shape of famous presidents containing the digitized code to various important people. This was a spin on a tactic that Carl began in the 1990s when he opened the SEC’s corporate filings data: get the data online, pressure the government to put the data online themselves, and then help the government take over that responsibility.

The media and bloggers caught on, beginning I think with Cory Doctorow on March 27, followed by DCist on March 28, The Washington Times on March 31, Steve Schultze on April 1, and Think Progress on April 3. The files themselves went up on April 4, so little more than a week after the first media or blog post about it, and the decision to put the files up with a CC0 dedication was made in any case some days earlier. It really did not take much pressure at all. (Tom also wrote a post on Greater Greater Washington on March 19.)

Carl had noticed early on that the DC Council asserted copyright over the Code. Some of the media reports focused on that. As Zvenyach explained in The Washington Times article, the rationale was to protect DC from West, by making sure West could not claim copyright over the same Code, not to limit access to the law. Whether or not state codes can be copyrighted was mostly beside the point, and the focus on this issue turned out to be a red herring. It was resolved quickly with the choice of the Creative Commons CC0, a public domain dedication.

I went in to Zvenyach’s office on April 3 to help them take West’s logo out of the Word documents. There was one document per title of the Code, or about 50 documents, many in the 50-megabyte size range. The West logo was in the header, but the header was specified independently for each section of the code, so in reality there were thousands of logos to take out. We also took out a DC copyright line from the documents, which was also repeated in each section. It took about 4 hours for Microsoft Word to process all of the files, and 1 hour for us to figure out how to do it so “quickly.”
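For readers curious what "thousands of logos" looks like at the file level, here is a minimal sketch of the same kind of cleanup. It is not the procedure used that day, which ran inside Microsoft Word. It assumes the documents are in the .docx format (the source only says "Word documents"), where each section's header is stored as its own part inside a zip archive and images appear as `<w:drawing>` or `<w:pict>` elements. The function name and the file names are hypothetical.

```python
import re
import zipfile

# Assumption: in a .docx archive, header parts are named
# word/header1.xml, word/header2.xml, and so on (one per distinct header).
HEADER_PART = re.compile(r"^word/header\d*\.xml$")

# Assumption: header images are <w:drawing> (DrawingML) or <w:pict> (legacy VML)
# elements with a separate closing tag. A regex is a shortcut here, not a full
# XML parser; it does not handle nested or self-closing elements.
IMAGE_ELEMENT = re.compile(rb"<w:(drawing|pict)\b.*?</w:\1>", re.DOTALL)


def strip_header_images(src_path, dst_path):
    """Copy a .docx to dst_path, removing image elements from every header part.

    Returns the number of image elements removed. All other parts of the
    archive, including the document body, are copied through unchanged.
    """
    removed = 0
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if HEADER_PART.match(item.filename):
                data, count = IMAGE_ELEMENT.subn(b"", data)
                removed += count
            dst.writestr(item, data)
    return removed


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    n = strip_header_images("title-01.docx", "title-01-clean.docx")
    print(f"removed {n} header images")
```

One limitation worth stating: this only removes the references to the image from the headers. The image file itself would still sit in the archive (under `word/media/` in a typical .docx) and would need to be dropped as well before the result could be considered free of the logo.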

When I left Zvenyach’s office that evening, Zvenyach pointed out the presidential thumb drive still sitting on his desk that he received from Carl — unfortunately I forget if it was a little George Washington or a little Abraham Lincoln. I have a feeling that thumb drive will be around for a while.

Now, there is a bigger issue here. There’s no plan for updating the public files. DC’s contract with Lexis going forward doesn’t require Lexis to provide DC with an electronic copy of the code. Perhaps after this they’ll refuse to do so. But we’ll tackle this another time.


11 Responses to “DC opens its “code”, embracing principles of open laws”

I’m extremely impressed with your work, and with DC’s work. I would love nearly everything about this outcome to be used as a model.

My only quibble, and I think it’s an important one, is that this does not fully satisfy the principle of no copyright or terms of use. The DC government is releasing the code under a CC0 license, but it can do that because it is asserting the right to release it under a license of its choosing – it retains ownership, and could theoretically change its mind at any time (even if that is unlikely in practice).

We should celebrate every single action that was taken here, in and outside of government. One of the questions we should take away from this positive experience is: given that the law is inherently public domain, what must we do so that DC does not feel the need to assert copyright of any sort, even the CC0 variety, to protect us from third party copyright?

I don’t think your first point is correct. CC0 is equivalent to “I’m not saying we necessarily have copyright, but if we do, we disown it.” CC0 is different from the Creative Commons licenses. It’s not actually a license, it’s a public domain dedication.

If instead of CC0 the DC Council said nothing, there would be uncertainty about whether DC held any copyright in the document (except for works of the federal government, one can never really be sure). So CC0 is preferable because it is explicit.

(Like a copyright license, CC0 can’t be revoked once it’s applied. Of course, this doesn’t apply to future updates of the Code but no one can really disown copyright for things that have yet to be created.)

While I see what you’re getting at, CC0 is still a tool for releasing copyright that you hold. For identifying content that is already in the public domain (as law is), Creative Commons recommends the Public Domain Mark:

I recognize that CC0 does some additional work to preserve public domain status in countries that don’t support it, but I do not believe that is relevant to the laws of the US.

Decisions on future updates of the Code are what I meant when I said that they could theoretically change their mind. By publicly acknowledging that the work is *already* public domain (whether it’s with that CC mark or just some words), that is not something they can take back – at least, not without having their own words contested in court.

And they’d be doing just as valuable a public service against third party licensing by putting the weight of the DC government behind the principle that laws are inherently public domain.

But these are the breaks. The PD mark didn’t occur to me, but everyone involved thought CC0 was appropriate. I don’t think we would have gotten this far if Carl and Tom insisted on the GC making a policy statement about whether DC laws are born copyrighted or not.