Over the last few weeks, I’ve been checking out my about:profile page [addons.mozilla.org], and I’ve been pretty surprised at how accurate it can get, even though it’s only a simple proof of concept meant to start discussions on how Mozilla should be analyzing data in Firefox [blog.mozilla.org].

Overall categorization and detailed/recent interests

It shares some ideas with what Margaret implemented for about:me [wiki.mozilla.org], such as processing the local data within Firefox and not sending any of it out of Firefox. The difference is that in about:profile, we’re trying to generate higher-level concepts, such as an interest category, as opposed to statistics about your browsing behavior. We happened to go with some readily available domain data, namely ODP categories and Alexa siteinfo, and we selected a few hundred top sites to package into the add-on. So while the reference data is not an exhaustive list, it seems to work for quite a few of the people I’ve shown the add-on to.
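To make the idea concrete, here is a minimal sketch of how locally stored history could be rolled up into interest categories using a packaged site list. This is not the add-on’s actual code: the `SITE_CATEGORIES` table, the category names, and the `rank_interests` function are all made up for illustration, and weighting each category by visit count is just one plausible choice.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical reference data: a tiny stand-in for the packaged list of
# top sites and their categories. These entries are invented examples.
SITE_CATEGORIES = {
    "espn.com": ["Sports"],
    "nytimes.com": ["News"],
    "allrecipes.com": ["Home", "Cooking"],
}


def domain_of(url):
    """Return the host of a URL with a leading 'www.' removed."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def rank_interests(history):
    """Tally categories for an iterable of (url, visit_count) pairs.

    Everything is computed in memory; nothing is sent anywhere.
    Sites missing from the reference data are simply skipped.
    """
    totals = Counter()
    for url, visits in history:
        for category in SITE_CATEGORIES.get(domain_of(url), []):
            totals[category] += visits
    # Highest-weighted categories first.
    return totals.most_common()


if __name__ == "__main__":
    sample_history = [
        ("http://www.espn.com/nba", 5),
        ("https://nytimes.com/", 2),
        ("http://example.org/", 9),  # not in the reference data
    ]
    print(rank_interests(sample_history))
```

Because unknown sites are skipped, the result is only as good as the packaged list, which matches the caveat above about the reference data not being exhaustive.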

Our somewhat arbitrary choices of category interests and site demographics got me thinking about what we could do with this data in Firefox, and I keep coming back to one distinction: category data actually shows what I’m interested in, whereas demographics appear to create a label or characteristic that opens things up to preconceived judgements. In other words, the former is based on something I did, while the latter is something I am. (Although technically, the about:profile experiment is trying to guess at who you are based on what you did.)

I’m sure others will be able to better describe the differences between the two, but I wonder whether, because there appears to be a fundamental difference, we should present the two kinds of data differently to the user. For example, perhaps users would be happy to explicitly give Firefox their demographic data, whereas asking them to create a list of categories they’re interested in might be overwhelming.

I’m excited that we’ve released the add-on to get a conversation started, because there are so many different ways to analyze the data in Firefox, and each method can lead to interesting discussions such as this one about categories vs. demographics.