I’ve received a couple of emails from people saying it’s hard to comment on the data issue without some idea of where I’m heading or what I’m thinking. So here goes. I’ll come back to some of the topics I’ve written about already. And I’ll continue with the other posts as well; I think we need some depth of analysis to make good decisions.

In the meantime, here’s the basic message.

I would like to see Mozilla provide more leadership in helping people manage the collection and treatment of data related to them, which I’ve called “Associated Data.” I don’t have a specific plan for what that leadership would look like, or for which features or capabilities our products, services or websites should implement (or block). There are many different types of Associated Data, and the desired treatment may vary from type to type. This is something I’d like to see us figure out.

I would also like to see Mozilla provide leadership in treating some basic aggregate, anonymized usage data as a public asset. To do this, we need to develop a sense of what data this might include and what aggregation and anonymizing techniques make the Mozilla communities comfortable. Some data, such as public disclosure of bandwidth use or website rankings, seem to be areas everyone is comfortable with, but we should make as few assumptions as possible. It can be hard to get truly anonymous data, so this is an area where great care, and therefore leadership, is required. But if everything that is known about the basic usage of the Internet is closed and proprietary, then the Internet as an open platform will suffer. I don’t have a specific plan as to what Mozilla might do here; that’s the point of the discussion.

These are difficult and sensitive topics; it would be easier to ignore them. But both of these areas are critical to building an Internet that is healthy for the individuals using it. The Mozilla mission is to keep the Internet an open platform, and to promote the values in the Mozilla Manifesto. It will be hard to do this if we ignore the effects of data.

4 comments for “Data — getting to the point”

Why "Associated data" is important, and what should Mozilla do about it…

As Mitchell says, it’s a sensitive topic, and I think that Mozilla potentially has a unique perspective on this important issue. We should not be shy… Let’s not avoid having this important discussion. Jump to Mitchell’s blog, read her whole arti…

Good set of posts on data, and no question that this is a critical issue for the next few years. In addition to asking ‘what should we do?’, it’s also worth asking ‘how should we do it?’ Some data issues (personal data portability?) may be best addressed with products, services and standards. With others, the best strategy may be to gather others in the tech industry around a common agenda (making high level / anonymized / aggregate usage data open). There is a chance for Mozilla to take some leadership on both the ‘creating concrete things’ and ‘organizing the industry’ fronts.