Privacy, big data and education: more about the inBloom databases

A new national database of personal student information understandably has parents and privacy advocates alarmed. As reported elsewhere, the new inBloom database houses information on millions of school children from nine states and includes names, addresses, telephone numbers, disciplinary records and learning disabilities.

One of the states is New York. Naturally, the mommy listservs in Brooklyn, where I live, are going wild with “opt-out” letters. My first reaction was surprise. Could it really be true that inBloom was going to release this private information to any app developer who asked? (Disclosure: inBloom, a non-profit organization, is funded by the Gates Foundation and the Carnegie Corporation of New York, which are also among the funders of The Hechinger Report.)

inBloom explained to me that there are two separate data stores. One is the real data that belongs to the states and school districts. The other is a sandbox of fake data for developers.

With the real data, inBloom is functioning like an off-site storage service for school districts. inBloom says it will store any type of data that states and districts feel is relevant to their educational purposes. That’s why Social Security numbers could be in one of the fields. But inBloom spokesperson Barbara Roos says inBloom is not aware of any school district using Social Security numbers. Roos added that the regional data will not be mingled into a single national database; each state’s and district’s data will be maintained separately.

The sandbox contains only fake student data that developers can use for testing their new products. Sharren Bates, inBloom’s Chief Product Officer, emailed me that her development team generated the sample from scratch to be representative but not at all connected to any real student data set. “…Some of the data was machine-generated and some was generated by hand.”

The hope here is that new applications will be able to plug into a school’s existing computer system. That would save schools hefty integration costs every time they buy a new piece of software.

But third parties can get access to the real data when a school district has directly hired a company to be an application provider of, say, a math program. The school district can authorize inBloom to release data to the vendor. inBloom is, in effect, a middleman. The school district could have just given the math company the data directly. Before inBloom’s creation, that’s exactly what happened.

Generally, there’s a contract between the school district and the vendor, defining exactly which data the vendor can have access to. The district could choose to include students’ arrest records, but that would be unlikely with a math program. inBloom says it has set up a “granular” system so that a district can specify exactly which data it wants and doesn’t want to go to a third party.

Of course, there’s plenty to be worried about. What if a school district bureaucrat makes a mistake and accidentally releases data that vendors shouldn’t have? What if a software company fails to protect this sensitive information? What happens if personally identifiable information is transmitted to a vendor that goes out of business? And how secure are the clouds where the data is stored?

“Some people don’t want information about their children used or documented at all. Certainly they have the right to that opinion, but that’s a larger issue,” said inBloom’s Roos. “Recording information about students, storing it in the cloud and sharing it with vendors has been happening for a while. inBloom hasn’t started that. People need to be having that conversation with their school and the government.”