Media tracking and the quantified self

Gary Wolf and Kevin Kelly have been documenting an emerging phenomenon they call “the quantified self”. The term refers to a set of experiments that people are conducting – primarily on themselves – to understand their own bodies and behavior. In an article for the New York Times Magazine, Wolf details a range of these experiments. One engineer weans himself off coffee and compares his reported levels of concentration with and without caffeine. Others use sensors like the Zeo to track their sleep patterns, or the Fitbit to track physical activity. Some track what they eat and drink, how much they weigh, their emotional states.

Wolf acknowledges that some of the people profiled in the article sound obsessive and notes that people engaged in detailed self-tracking may be “outliers”. And he’s careful to offer testimonies from people who engaged in self-tracking and gave it up, feeling like the data they generated was relentless and remorseless. (As someone who’s had to engage in self-tracking of blood glucose levels as a type 1 diabetic for the past 25 years, “relentless and remorseless” are my words, not Wolf’s.) But he’s clearly a believer that tracking can be a tool for self-discovery, a way of learning what constitutes normal behavior for each of us, not just a tool for moving towards a goal, like increased fitness or better sleep.

The experiment in self-tracking that I’m considering is more about self knowledge than self improvement, though I’m finding it’s hard to separate the two. I’m looking for ways to monitor my personal information flow. I’d like to understand how I get information about the world – through television, the web, radio, email and the people I talk to. The plan is to use myself as a guinea pig, to see what’s possible as far as active and passive monitoring of information flows, in the hope of opening the experiment to a wider population.

I’ve made the case – in my recent TED talk and elsewhere – that many of us overestimate the amount of diverse, international information we encounter through the internet and other communications networks. We run the danger of being “imaginary cosmopolitans”, convinced we’re encountering information from all corners of the world, while we might be trapped in homogeneous echo chambers.

There’s some data to support this theory, both from experiments colleagues and I have been carrying out looking at cosmopolitan and parochial consumption of media online, and from terrific analyses like Pippa Norris and Ronald Inglehart’s “Cosmopolitan Communications”, which looks at vast data sets about communication flows across borders. But I’ve not been able to find much information that compares the media diets of individuals at a level that allows me to answer questions like, “What percentage of news encountered is local, national and international, on average? What media is most likely to make an individual seek out more information – a mainstream media story, a citizen media post or a personal recommendation?”

Responding to an earlier blog post of mine, my friend David Sasaki proposed an experiment: keep a communications diary that tracked interactions via different media. If I’m going to argue that people’s uses of the Internet are disproportionately domestic, it would be good to compare those online interactions to other media. Sure, 90% of my Facebook interactions might be domestic, but perhaps that vastly outpaces my face to face interactions – that might then be an argument that the Internet is, on balance, more likely to help us interact with people from different nations (different religions, different political perspectives, take your pick) than other technologies.

Media diaries aren’t new – take an intro communications class at many universities, and you’re likely to be asked to keep one. They tend to be pretty superficial – it requires some serious obsessiveness to log the individual stories you encounter, rather than writing down “NPR – 7am – 7:20am”. And the process of keeping a diary tends to shape your behavior – for the month Rachel and I were a Nielsen family (years back), we watched vastly more public television than we do in an average month.

It’s easier than ever to keep a diary with tools like Your Flowing Data, a Twitter-based service that allows you to send direct messages via the web or SMS. I just logged “d yfd listened WNNZ 0750 – 0830”, a syntax that I hope will let me start collecting information on what media I encounter offline, and who I interact with in the real world.

But what I really want is data on the dozen or more stories I heard on NPR during that morning drive – coding each in terms of subject and geography would mean either logging while driving or writing a tool that turns the name of a broadcast media source and an interval into a stream of metadata. (To a certain extent, this is one of the functions of MediaCloud, but we’re a long way from being able to do this with media that isn’t also creating RSS and Atom feeds.) Furthermore, I know that the process of logging my behavior will influence that behavior. I can already see myself tweeting “d yfd watched football 1130 – 2045” on Sunday and the accompanying feelings of guilt, shame and, if the Packers lose, frustration.
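As a rough sketch of how diary lines like the two above could be turned into data: the snippet below assumes every entry has the shape "d yfd <verb> <source> <start> – <end>" with four-digit 24-hour times. That grammar is my guess from the two examples, not something Your Flowing Data defines, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Assumed grammar, inferred from the two example entries only:
#   d yfd <verb> <source> <HHMM> - <HHMM>   (hyphen or en dash)
ENTRY = re.compile(
    r"^d\s+yfd\s+(?P<verb>\w+)\s+(?P<source>.+?)\s+"
    r"(?P<start>\d{4})\s*[-\u2013]\s*(?P<end>\d{4})$"
)


def to_minutes(hhmm):
    """Convert a four-digit 24-hour time such as '0750' to minutes since midnight."""
    return int(hhmm[:2]) * 60 + int(hhmm[2:])


def parse_entry(line):
    """Return verb, source and duration in minutes, or None if the line doesn't match."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    start = to_minutes(match["start"])
    end = to_minutes(match["end"])
    if end < start:
        # Assume the interval ran past midnight.
        end += 24 * 60
    return {
        "verb": match["verb"],
        "source": match["source"],
        "minutes": end - start,
    }


def minutes_by_verb(lines):
    """Total logged minutes per activity (listened, watched, ...)."""
    totals = Counter()
    for line in lines:
        entry = parse_entry(line)
        if entry is not None:
            totals[entry["verb"]] += entry["minutes"]
    return totals
```

Feeding it the two entries quoted above would give one "listened" total and one "watched" total, which is about as much as a diary at this level of detail can say. It still tells me nothing about the individual stories inside each interval.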

Logging my media diet is clearly going to involve some diary work, but it would be great if I could automate collection as much as possible, to minimize both the time requirements and the influence logging will have on my behavior. And, if this is an experiment I hope others will repeat, logging needs to be as automatic as possible. So I’ve been looking for tools that will log and analyze my online behavior transparently.

My friend and colleague Judith Donath was responsible for a number of early tools designed to allow self-monitoring of email use, including Themail, developed with her student (and now, world-leading designer) Fernanda Viegas. I asked her advice on locating appropriate self-tracking tools to understand how I’m getting information through email, the web, Twitter and other media. Her suggestion: look at productivity tools.

Good advice. Most of the tools I’d been finding to track web use are designed either to allow bosses to monitor their workers or to let spouses read each other’s email. Judith’s advice led me to Rescue Time, an amazing package that monitors everything you do on your computer… and nags you when it perceives you to be wasting time. I may break down and turn off the messages that urge me to account for every five minutes of inactivity, but I’m finding the ability to track what applications I’m using to be hugely helpful, if slightly dispiriting.

A week in the life with RescueTime

Apparently, I spend roughly twice as much time answering email as I do anything else. Writing (BBedit) comes in second place, though I apparently spend almost half as much time on Twitter as I do writing. And while I’d likely tell you I get most of my important news from Global Voices, the New York Times and Foreign Policy’s Passport, the logs tell the tale of my secret shame: a need to view every single goofy image posted on Reddit.

While RescueTime does a lovely job of presenting this information, I find myself looking longingly at the data collected by Eyebrowse, a project from Brennan Moore, Max Van Kleek and David Karger at MIT computer science. Eyebrowse is a Firefox plugin that grabs URL information on every page you browse and offers the option to report the data linked to a profile or a set of demographics. The profiles are far more revealing than I’d personally be comfortable with – it’s one thing to know that I’m a sucker for Metafilter, another to see every Javascript function call Karger looks up. My dream tool grabs the data Eyebrowse does, analyzes it and presents it at the level of granularity RescueTime offers.
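To make the "dream tool" a little more concrete, here is a minimal sketch of the roll-up step: take a page-level log and collapse it to per-site totals. The input format, a list of (url, seconds) pairs, is purely an assumption for illustration. I haven't checked what Eyebrowse actually records, and the function names are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(url):
    """Reduce a full URL to its host name, dropping a leading 'www.'."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def summarize(visits):
    """Collapse (url, seconds) pairs into per-site totals, largest first.

    `visits` is a hypothetical log format, not the real Eyebrowse schema.
    """
    totals = Counter()
    for url, seconds in visits:
        totals[site_of(url)] += seconds
    return totals.most_common()
```

The design choice is simply to throw away the path and query string before anything is stored or shared, so the summary says "an hour on Metafilter" without saying which threads.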

Mail Trends visualizes the top ten people I receive email from

That’s not an easy balance to strike. I’ve been looking at mail analysis tools as well, since Themail no longer exists. Mail Trends gets the job done… if you’re a GMail user and if you don’t mind mucking about on the command line. (It’s a very elegant Python script, which needs the Cheetah templating library. In my experience, it chokes when I try to feed it more than 100,000 emails, but works like a champ on 50,000 or so.) Mail Trends does a great job in offering a topline summary – I now know that my primary research collaborator sends me roughly twice as much email as my wife… which may or may not tell me something helpful about both relationships. But it’s not able to tell me what URLs I follow within emails and which I ignore, data that I’d need to understand how I get information from mailing lists and individuals. My guess is that a tool specific enough to track the URLs I read would be almost unusable in terms of showing the overall patterns of email usage.
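The topline summary itself is simple to approximate. The sketch below is not how Mail Trends works (I haven't read its source); it just counts the addresses in a list of From headers, which I'm assuming have already been pulled out of a mailbox by some other means.

```python
from collections import Counter
from email.utils import parseaddr


def top_senders(from_headers, n=10):
    """Count messages per sender address and return the n most frequent.

    `from_headers` is assumed to be an iterable of raw From header
    values, e.g. 'Alice <alice@example.org>'.
    """
    counts = Counter()
    for header in from_headers:
        _name, address = parseaddr(header)
        if address:
            # Treat differently-cased copies of an address as one sender.
            counts[address.lower()] += 1
    return counts.most_common(n)
```

What this can't do, any more than Mail Trends can, is say which links inside those messages I actually followed. That would need data from the browser as well as the mailbox.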

A map of who I follow on Twitter, via MMMeeja

I’m having similar problems figuring out how to analyze Twitter. Tweetstats offers insights on who I retweet and who I reply to – good indicators of who I read closely within the set of 585 people I follow. And MMMeeja provides a pretty map of those 585 who have provided information about their location, letting me see that I follow a lot of Africans and not many South Americans. Again, what I’d really like is something that collected every URL presented to me via Twitter and tracked which ones I follow and which I ignore. Ditto for Facebook, though I use it lots less.

So – here are my open questions:

– What are tools I’ve not yet found that solve some of the problems I’ve described here? Is there a good tool that can turn an interval of radio or television into a stream of story metadata? Has anyone developed a tool that tracks every URL I encounter across applications and examines whether I’ve followed it?

– Has anyone come up with a way to make offline media tracking easier to do? Something like Shazam, which could listen to radio or television with me and tell me what stories I’m hearing? A microformat for tracking conversations with individuals?

– If I want anyone else to participate in this project – and I do – what’s the right balance between the overbroad and the spookily specific? If I’m not willing to start using Eyebrowse, what level of specificity is the right one? Your top eight sites, as Chrome presents them to you? The aggregate data of RescueTime? A world map that shows how often different corners of the world are presented to you in the course of a month?

– To make the process of media self-monitoring worth engaging in, there needs to be a reward, either in terms of self-knowledge or self-improvement. What sorts of knowledge would make you willing to participate in an experiment like this? Are there behaviors that you’d like to change that such an experiment would help you identify and address? Or have I simply descended too deep into the realm of the obsessive outlier?

It’s interesting – your concern (from a researcher’s perspective) that using a tool like Rescuetime to monitor your media diet will influence what media you consume is the exact reason that I so appreciate Rescuetime. Those weekly reminders of how I spend my time make me want to be a more savvy media consumer, to better prioritize what I decide to read/view/listen to.

It is a habit that I recommend to everyone, but I am fairly certain that you and I, as media diet trackers, are absolute outliers. Even though the entire process is automated, most friends and colleagues tell me that they are too busy to track their media consumption. Others have privacy concerns.

That doesn’t take away from the value of the experiment you propose. In fact, I think that it would be extremely fruitful if you were to develop and teach a course on cosmopolitan media at Harvard and require your students to keep a media diary. I can imagine Pippa Norris and Chris Lydon getting involved.

This post strikes me as an exhaustive list of media tracking tools — and it shows how far these tools have come over the past few years.

There are two other aspects that I think are important to integrate into the design of the research. First, why do people read content that does not relate to their immediate lives? For example, you seem to be reading a lot about Somalia these days – why? What is the incentive? Where does your interest come from?

The second aspect is related, but at the opposite end of the process — what is the impact/result of reading content that does not relate to your immediate life? I know that you and I have both received questions following presentations of Global Voices that basically ask, “well, why should I care about Fiji or Uzbekistan?” This goes all the way back to 2004 when you and Joi were talking about “the caring problem.” But the question is as difficult to answer today as it was then. Indeed, why should I care about Uzbekistan? One argument I’ve made is, because the cotton from my H&M shirt comes from there. But I’m hardly convinced myself by that argument. Another argument that I think that we’ve both used is because “local solutions to global problems require global awareness.”

That sounds right to me, but I wonder if we can actually draw a link between a more cosmopolitan media diet and better solutions to global problems such as water usage, climate change, migrant social inclusion, etc. In other words, what are the outcomes/outputs of a cosmopolitan media diet? You read about Somalia, but what does that do for Somalia, what does that do for you, and what are the unintended consequences? If you read about Somalia are you more likely to get involved in campaigns related to Somalia? Do those campaigns have an impact? Do you become a better neighbor toward Somalian immigrants in your region? Do you become a more conscientious consumer toward products that come from Somalia? Etc.

I’m very much looking forward to what you find. And if you’re able to enlist more volunteers in your research then you might also be able to take an early stab at questions like:

Are multilingual or monolingual individuals more cosmopolitan consumers of news? Men or women? Youth or adults? Old media or new media consumers? Etc.

On a final note, what you find might also reveal some of the limitations of Media Cloud as a tool to track the spread of news memes. It might turn out that much of the communication of news memes takes place not in public spaces online where Media Cloud can crawl RSS feeds, but rather behind the closed doors of email, Facebook, and other private recommendation systems.

Kevin, thanks for pointing to Voyage – looks like another cool Firefox plugin that I should try. I’ve moved to Chrome – much stabler and faster on my Mac – so I think Rescue Time is going to be my poison of choice for the moment, but I’m glad to know of it for the sake of creating a better catalog of these monitoring tools.

Fernando, thanks for the excellent WSJ article. To summarize, for those not following the link: several major advertisers and broadcast media companies are sponsoring a million-dollar study that involves handing iPhones out to 1000 people and asking them to do half-hour media diaries. Seems pretty old-skool to me – the intriguing suggestion at the end is that companies like Tivo have lots better data than one might collect through this method.

David, thanks for connecting this to the broader issues. The answers I have to the questions about media interest and attention are the lame ones I’ve been giving for years:
– the broader your view of the world, the more likely you are to see business opportunities in other nations
– the broader the view, the more likely you see threats emerging elsewhere, and can, perhaps, react to them
– the broader your view of the world, the more inspiration, creativity, perspective you have access to
– It’s incumbent upon all of us to witness injustice and tragedy and try to prevent it, wherever it occurs.

My answer to the Somalia question falls under factors 2 and 4 – I wouldn’t be surprised if the US invaded Somalia in the next couple of years, and I’d like to be out ahead of that issue so I can advocate with my representatives in Congress and friends in the State department against such involvement. My obsessive engagement with West Africa tends to be more in camps #1 and #3 – I think someone’s going to make a ton of money off the rise of Ghana, Senegal and Cameroon… and perhaps Nigeria some day.

While I believe these four factors are true – and help me explain phenomena like the fact that the Wall Street Journal now has more foreign correspondents than the rest of US newspapers combined – they’re not especially satisfying to me. I’d like to crack that nut by demonstrating that a cosmopolitan perspective is, in some ways, correlated to economic or creative success, or perhaps to some sense of civic engagement and purpose… or to a broader definition of happiness. Not sure how to even structure that research, but I’m increasingly feeling it’s something I might need to strengthen the case I’m making.

I’ve been on a media diet for years, and it’s been extremely helpful. If you want to actively go on one (rather than collect data then decide action), here’s the experiment:

> Eliminate from your life all news that’s not directly related to work, including papers, magazines, radio, and on-line sources. At the end observe:

o Was there anything important that you didn’t hear about from a friend?
o How did your mood change during this?
o How much extra work did you get done?

After a few weeks I realized that not much of import actually happens (trivia and excitement rule) and that I was much calmer. Put another way, news is rarely important or durable. I took a hint from Tim Ferriss and catch up on news by scanning the newspapers on sale on the sidewalk as I pass. Usually gets me up-to-date in about 20 seconds.

My wife is doing this and had a funny comment: “I almost didn’t know about the hurricane that didn’t happen.” Exactly!

Ethan, we’d love to see the kind of Eyebrowse visualization you’re talking about. The complete Eyebrowse log is maintained at http://eyebrowse.csail.mit.edu/data/all_the_data.zip and can be downloaded by anyone who has a good visualization idea. Right now we’ve only got about half a million log entries because we haven’t yet built up the critical mass of log usage, but I think a few visualization tools like the one you describe could convince people it’s worth participating in the communal logging experiment.

You raise a question of comfort with sharing log data. Remember, Eyebrowse shows only the sites a user decides to make public. I’ve personally decided that letting people see my Javascript explorations is safe (they may reveal me to be a pretty bad programmer, but I don’t consider that embarrassing). But you can make a different decision. I believe that with enough users we’ll get an aggregate that is informative regardless of individual sharing choices.

Matthew, thanks for that provocative idea. My interest here isn’t really to increase my own happiness and productivity – it’s to figure out whether the sweeping statements I’m making about people’s engagement with media are true or not. I think the model you’re putting forward bears a lot of resemblance to the way many people are now encountering news media. I think you and I likely differ sharply on the utility – social and otherwise – of engaging with news, but I appreciate the idea and look forward to thinking through it.

David, thanks for your comment and for the link to the data – downloading it now, and looking forward to playing with it. I don’t know that I have brilliant ideas for visualizing it, but I am very interested in figuring out whether I can use the data to help answer some of my questions of how social media leads us to encounter news media. Your clarification on sharing is an excellent one – sorry if I mischaracterized the tool based on the demo of your account I looked at.

All of the methods would tell me (I hope) what my knowledge universe is (what I implicitly or explicitly know) but not what I do not know. It is not a problem limited to just individuals. It can be extended to other entities and processes. Will publicizing my knowledge universe in a cloud connect to knowledge universes of others (not necessarily your immediate social network but weakly linked ones) to fill my knowledge gaps? If so I would have a real knowledge garden.