Where librarians and the internet meet: internet searching, Web 2.0 resources, search engines and their development. These are my personal views and not those of CILIP or any other organisation I may be associated with.

January 13, 2014

Immersion of email and big data

Just how much can we tell about someone using only very basic data? I have been playing around with a tool from MIT called Immersion, which looks at very basic information about your emails. I found the tool via an article called 'Think Metadata isn't intrusive?', which I would suggest reading if you haven't already. Basically you give the tool access to your email account metadata (not the content, just the time and date stamps and the “To” and “Cc” fields) and it number-crunches for a very long time (it took 15 hours to do mine) before giving you back a series of graphs.

This is the type of information that you get - you can save graphs with or without names, and I've chosen the latter for obvious privacy reasons:

Each circle is a person - and the larger the circle, the more email goes back and forth between you and them. The colours represent groupings; the orange group, for example, is people at or closely related to CILIP. This collection covers a total of 9.6 years, 367 collaborators and 30,000 emails. Now, if I go back to 2009, before I started working more closely with CILIP, the pattern is very different:

It doesn't take a genius to work out that my work and social patterns have changed dramatically over the three-year period - the colours relate to different groupings this time, by the way, so the orange now relates to another set of people that I was working with very closely at that time.
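To give a rough feel for how little data is needed to draw graphs like these, here is a minimal Python sketch. It is not Immersion's actual method, just an illustration under stated assumptions: the addresses and records are made up, the `summarise` function is hypothetical, and a sender field is assumed alongside the date, “To” and “Cc” fields mentioned above. It counts how many emails involve each person (roughly the circle size) and how often two people turn up on the same email (a crude hint at the colour groupings).

```python
from collections import Counter
from itertools import combinations

# Hypothetical header-only records: (date, sender, to, cc).
# No subject lines and no message bodies are used anywhere.
ME = "me@example.org"
HEADERS = [
    ("2013-11-04", ME, ["alice@example.org"], ["bob@example.org"]),
    ("2013-11-05", "alice@example.org", [ME], []),
    ("2013-11-09", ME, ["carol@example.org"], []),
    ("2013-12-01", "bob@example.org", [ME, "alice@example.org"], []),
]


def summarise(headers, me):
    """Return (volume, together) counters built from header metadata only."""
    volume = Counter()    # emails involving each person: the circle size
    together = Counter()  # pairs seen on the same email: a grouping hint
    for _date, sender, to, cc in headers:
        # Everyone on this email apart from the account owner.
        others = sorted({sender, *to, *cc} - {me})
        volume.update(others)
        together.update(combinations(others, 2))
    return volume, together


volume, together = summarise(HEADERS, ME)
print(volume.most_common())
print(together.most_common())
```

Even on four made-up emails the busiest contact and the pair who tend to appear together stand out, which is the same kind of inference the graphs make possible at a much larger scale.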

I can also get other basic details:

and again it's not too difficult to work out some basic information about what I have been doing simply based on the bare minimum details. If I flick across to collaborators it's again very easy to make assumptions about what I'm doing at any given period of time. Obviously it's far easier for me, since we're looking at data about me, so it's an open book, but even if you know very little about me, you can quickly start making assumptions, and they're not going to be too far from the mark, even if you have no idea of the content of my emails.

The idea that the government doesn't want to access the content of your email, but simply to have basic details, may not - at first glance - be a cause for that much concern. However, it really IS a big deal, and I'd encourage you to consider experimenting yourself. You may of course have concerns about letting MIT have access to these basic details - but then, that's what the UK government want. Think about it.

Comments

Am I the only person who feels that technology in its infancy was dynamic, something we all couldn't quite finish learning about, but that it now feels like something out of an aged film, creeping ever closer, circling, playing havoc with our lives?
Yes this is interesting and very slightly 'It's getting closer - run!!'
Thanks Phil. I think it's time to run your own detective agency!!

I tried this at the weekend but it only analysed my Gmail address, which touched a tiny fraction of my emailing activity since Prestel days. The tool could be a useful way to demo the power of metadata to the unconvinced.