Learning to Swim in the Data Deluge

Data and the Freedom of Information Act

While reading one of the many blogs I follow, I came across a suggestion to use Baltimore’s parking citation data to see whether some makes and models of cars get cited more often than others. Parking citations are near and dear to me, since I get at least one (n ≥ 1) a year parking near the University of Minnesota, which most often also leads to my car being towed, since you only have so many hours to move your car after they ticket it.

One problem: Minneapolis does not make its data as openly accessible as Baltimore does. My initial thought was “no worries”. The Freedom of Information Act and the Minnesota Government Data Practices Act (Minnesota Statutes, Chapter 13) make it clear that all government data is public unless a state or federal law says the data is not public. Furthermore, according to the Hennepin County Requests for Data by Members of the Public webpage, these laws “stipulate that Hennepin County must keep all government data in a way that makes it easy for you, as a member of the public, to access.”

My quest to obtain the parking citation data for Hennepin County (the county in which Minneapolis is located) began at roughly 7:00 a.m., when I went to the Minneapolis Police Department webpage to find out how to obtain this data. After following the link to “traffic violations” I was redirected to the Hennepin County Traffic Violations Bureau, which is actually the Fourth District of the Minnesota Judicial Branch’s website. (This will matter later in the story.) After making sure that I hadn’t been accidentally transported to the internet of 1997 (the page had echoes of the old Yahoo), I emailed the contact person listed under data requests.

At this point it was 7:30 a.m. and I thought, great, I will be answering all my questions with ggplot2 by noon! After some back and forth with the contact person (I had to state more specifically what I wanted and make sure they understood that I wanted the raw data, not summary reports; they had to make sure that Excel was an OK format for me), I received the following email:

The fee for running the data request is $60. If you approve, the next step is to prepay by check. District Court is unable to process cash or credit card payments for data requests.

Sixty dollars!!?? Are you kidding me? After my initial shock, I sent a bevy of emails back and forth with the contact person quoting both the Minnesota and Federal laws that they had printed on their website. The response was:

District Court is subject to the Rules of Public Access to the Records of the Judicial Branch http://www.mncourts.gov/?page=511#publicAccess. The Rules do not require raw data to be accessible through the internet. The data you are interested in must be extracted from the database to put is a readable Excel file, thus the prorated fee of $60 (rate is $80/hour).

To which I asked why it would cost me $60 for someone to query a database and output the results into an Excel spreadsheet (I am in the wrong profession!). When I formally withdrew my request at 2:03 p.m., I also politely mentioned that I would be blogging about this adventure. I was asked to call a phone number and talk with them before I blogged about it.
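(An aside on the arithmetic: the email calls the $60 a prorated fee at $80/hour, which works out to 45 minutes of staff time. Here is a minimal sketch of that calculation. The function names are my own, and I am assuming that "prorated" simply means the hourly rate multiplied by the time spent.)

```python
# Figures quoted in the court's email.
HOURLY_RATE = 80  # dollars per hour
QUOTED_FEE = 60   # dollars


def billed_minutes(fee, hourly_rate):
    """Staff time, in minutes, implied by a fee prorated at an hourly rate."""
    return fee / hourly_rate * 60


def prorated_fee(minutes, hourly_rate):
    """Fee, in dollars, for a given number of minutes at an hourly rate."""
    return hourly_rate * minutes / 60


minutes = billed_minutes(QUOTED_FEE, HOURLY_RATE)
print(f"${QUOTED_FEE} at ${HOURLY_RATE}/hour implies {minutes:.0f} minutes of staff time")
```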

I called and we actually had a pretty good talk. Here is where things get a bit fuzzy and the difference between Hennepin County and the Fourth District Court comes into play. I was told that these two entities have completely separate rules regarding the accessibility of data. (Note to the reader: This is true, but as far as I can tell from the document linked in the second email, these rules would not apply to parking citation data.) Also, because they are not on the same budget, different requests for data cost the public money. (Another note to the reader: This is where I was told that since there were multiple queries to run – at least 21 different rules and regulations govern parking in Minneapolis – it would cost a lot of money because they had to go through quite a menu structure for each query.)

I have the name of a contact who may or may not be able to help me obtain this data, and I will keep you posted. But all of this is to point out that “publicly accessible” data varies not only in how accessible it is, but also in how “public” it is. In Hennepin County, you still need to be somewhat affluent to obtain these data.

I applaud the cities that have begun and carried through with open data initiatives. Unfortunately, the United Kingdom is way ahead of the United States when it comes to open data. Here are some examples of cities that have embraced open data.

I am sure there are many more cities that have opened up their data in manners that are truly “open”. If people have good suggestions about how we as a statistics community can be more instrumental in helping city and county governments embrace these initiatives, post in the comments. I would love to hear from you.

You might find MuckRock interesting and helpful: https://www.muckrock.com/
They’re a startup in Boston dedicated to helping citizens navigate the FOIA process and to making the data that is released publicly available and usable. They have a pretty great model, and I’ve found the guys who started the organization to be very helpful and eager to work with those who reach out to them.

Actually, $60 is quite reasonable. I request a lot of public data and most cost estimates are well over $100, sometimes even over $500. (But with a little prodding I can prove to them that they are overcharging me and violating the Data Practices Act.) And yes, it’s true that any data that is held by the district court is not subject to the Data Practices Act. In this case, it’s a bit confusing because Minneapolis police write the tickets, but then they submit the information to the court system and it’s the court system that maintains the database. I certainly wish the whole system of getting data were much easier and did not cost anything. Hopefully we will move in that direction soon.