If you've ever had trouble explaining the concept of "big data" to someone—or had trouble wrapping your brain around the buzzword yourself—Rick Smolan wants to help. But how do you demonstrate big data? It's the somewhat abstract, powerful analytical processing of massive quantities of human- and machine-generated data. Smolan believes he can help people understand the idea with his new multimedia effort "The Human Face of Big Data," a week-long, mobile-app project which seeks to transform millions of smartphone users into voluntary sensors who will then "measure the world."

Smolan is a former National Geographic photojournalist, and his company, Against All Odds Productions, is a driving force behind the project. He may be uniquely qualified for this task due to his long track record of tackling monstrously large topics with large teams. In March 1981, he self-published A Day in the Life of Australia, a 24-hour photographic portrait of Australia by 100 photojournalists. In February of 1996, Smolan took on the Internet with 24 Hours in Cyberspace. The book, Web, and CD-ROM project (funded largely by tech companies) combined the participation of people across the Internet with the efforts of Web designers, editors, software developers, and 150 photojournalists.

Since then, Smolan has continued tackling weighty topics (for instance, the global water crisis). But he told Ars the phenomenon of big data was "the most challenging and most exciting" topic he has investigated. "Big data is going to have an even greater impact than the Internet itself," he said of its potential effect on people's lives.

Part awareness-raising project, part publicity stunt, the app is tied to a large-format book and iPad interactive e-book (due in November). The app itself will be available for download starting September 26, and will "allow people around the world to compare and share their lives over the course of a week." Ideally, Smolan said, it will expose participants to how much information they are already sharing passively through their phones and other devices—information that companies are profiting from.

Most people are unaware of how much information is being collected from them and how it's used, Smolan said. "I was totally unaware I was transmitting data all the time from my phone. And nothing suggested that when I was searching for things on Google, I was putting data back in as well. If you asked someone 10 years ago, 'Would you let them plant a tracking device on you to track you everywhere you go?' You'd say 'no way'—but today, we're doing it voluntarily without even thinking about it."

The application will allow iOS and Android smartphone users to participate actively by answering survey questions and passively through data collected from the phone's GPS and other sensors. "We can capture the passive data off your phone—how far you move, how fast, etc.," Smolan said, "We can show that people in Sydney travel, on average, a certain number of miles a day."

The resulting data set can be filtered by a variety of demographic settings and will be analyzed by a team of data scientists at the project's "Mission Control." "The Human Face of Big Data" will then conclude with an invitation-only event in New York City on October 2. Smolan said the data will ultimately be made available publicly and donated to researchers.

The book and e-book will be a more familiar project for Smolan. Containing photos and stories collected by over 100 photojournalists in 30 countries, they highlight individuals who have made discoveries by analyzing large quantities of data, as well as people with tales of the dark side of the technology. The anecdotes even include a bit on the "ring of steel"—the surveillance system with over 3,000 cameras used around lower Manhattan—and information on surveillance systems in London that inspired Jonathan Nolan's development of Person of Interest.

There's also a comparison between the faces of the pre-Internet world of more secretive, controlled "big data" and the post-WikiLeaks world. "We found a photograph of the old FBI filing system from the 50s," Smolan said. "Then we found a picture of Julian Assange's (WikiLeaks) data center in Sweden." Smolan says they look remarkably similar, but "the FBI filing system was much more secure back when you couldn't walk out with it on a thumb drive."

Perhaps the final touch of the project's journalism-meets-stunt nature is how the book will be initially distributed. In an effort to maximize its impact, Smolan says it will be delivered to a list of the "10,000 most influential people around the world" on November 20. The whole project was underwritten by EMC (though Smolan says that the company has "never seen the book or the app" and has had no editorial input). To further counteract any corporate influence, the project will donate a dollar from each app download (for the first 50,000 downloads) to charity:water, a nonprofit funding safe drinking water projects in developing countries.

Sean Gallagher
Sean is Ars Technica's IT and National Security Editor. A former Navy officer, systems administrator, and network systems integrator with 20 years of IT journalism experience, he lives and works in Baltimore, Maryland. Email: sean.gallagher@arstechnica.com // Twitter: @thepacketrat

20 Reader Comments

Compare answers about yourself, your family, trust, sleep, sex, dating and dreams with millions of others around the world. Map your daily footprint, share what brings you luck, and get a glimpse into the one thing people want to experience during their lifetime.

No thanks. I don't even bother with social media sites which some of this information can be leached from, much less with an app in which very personal information is gathered, including geolocation habits.

Compare answers about yourself, your family, trust, sleep, sex, dating and dreams with millions of others around the world. Map your daily footprint, share what brings you luck, and get a glimpse into the one thing people want to experience during their lifetime.

No thanks. I don't even bother with social media sites which some of this information can be leached from, much less with an app in which very personal information is gathered, including geolocation habits.

Part of the reason I find this project compelling is highlighted by that comment - odds are good (though not certain, of course) that you are already broadcasting this information; you're just not aware of it. Highlighting the information we're all already sharing with or without our consciously deciding to is, in my view, a useful endeavor.

Though the project will be skewed, of course, since people like yourself will not participate, while people who deliberately broadcast this information will. This could undermine its impact; I can see people writing off the results as specific to the voluntary-sharing demographic, rather than as intrinsic to much of modern life.

"We can capture the passive data off your phone—how far you move, how fast, etc.," Smolan said

I had not thought about that "how fast" portion of tracking; how long before we see governments request access to GPS speed data? This could potentially be an even more effective method of revenue generation than speed trap cameras (sarcasm).

Is the idea to behave like you normally would, or to deliberately maximize the amount of data you're sharing?

Because I, for example, leave GPS and WiFi off (or at least I touch icons and make them SAY they're turned off) unless I'm actually using them for something, and I don't use Facebook or search with Google.

Does he want to include outliers like me, or is he more interested in showing worst-case, featuring users who share as much information as possible?

"We can capture the passive data off your phone—how far you move, how fast, etc.," Smolan said

I had not thought about that "how fast" portion of tracking; how long before we see governments request access to GPS speed data? This could potentially be an even more effective method of revenue generation than speed trap cameras (sarcasm).

It's already out there. If I recall correctly, this is exactly how venues such as Google Maps display traffic updates. Users' phones (anonymously?) upload GPS location, which pinpoints which road they are on, along with velocity information. If a particular stretch of road is rated for 65 MPH and phones are reporting only 30 MPH, that would indicate congestion. A bunch of phones just barely creeping along and you might have a wreck. Add a Z axis to the map's normal XY coordinate system and you'd have Dukes of Hazzard style traffic information, which might be kinda cool now that I think about it.
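The inference described in that comment can be sketched in a few lines. This is only an illustration of the idea, not how Google Maps or any other service is known to work: the function name `classify_segment`, the 50 percent slowdown ratio, the 5 MPH "crawl" cutoff, and the minimum report count are all assumptions made up for the example.

```python
from statistics import median


def classify_segment(reported_mph, limit_mph,
                     slow_ratio=0.5, crawl_mph=5.0, min_reports=3):
    """Label one road segment from anonymous phone speed reports.

    All thresholds are hypothetical values chosen for illustration.
    """
    # Too few phones on the segment to say anything useful.
    if len(reported_mph) < min_reports:
        return "unknown"

    # The median resists a single parked or speeding outlier.
    typical = median(reported_mph)

    if typical <= crawl_mph:
        return "possible incident"   # phones barely creeping along
    if typical <= slow_ratio * limit_mph:
        return "congested"           # e.g. ~30 MPH on a 65 MPH road
    return "free-flowing"


# The comment's own example: a 65 MPH road with phones reporting about 30 MPH.
print(classify_segment([28, 31, 30, 33], limit_mph=65))  # congested
```

Using the median rather than the mean is a design choice for the sketch: one phone sitting in a parked car should not make a whole segment look jammed.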

"We can capture the passive data off your phone—how far you move, how fast, etc.," Smolan said

I had not thought about that "how fast" portion of tracking; how long before we see governments request access to GPS speed data? This could potentially be an even more effective method of revenue generation than speed trap cameras (sarcasm).

It's already out there. If I recall correctly, this is exactly how venues such as Google Maps display traffic updates. Users' phones (anonymously?) upload GPS location, which pinpoints which road they are on, along with velocity information. If a particular stretch of road is rated for 65 MPH and phones are reporting only 30 MPH, that would indicate congestion. A bunch of phones not creeping along and you might have a wreck. Add a Z axis to the map's normal XY coordinate system and you'd have Dukes of Hazzard style traffic information, which might be kinda cool now that I think about it.

Compare answers about yourself, your family, trust, sleep, sex, dating and dreams with millions of others around the world. Map your daily footprint, share what brings you luck, and get a glimpse into the one thing people want to experience during their lifetime.

No thanks. I don't even bother with social media sites which some of this information can be leached from, much less with an app in which very personal information is gathered, including geolocation habits.

Part of the reason I find this project compelling is highlighted by that comment - odds are good (though not certain, of course) that you are already broadcasting this information; you're just not aware of it. Highlighting the information we're all already sharing with or without our consciously deciding to is, in my view, a useful endeavor.

Sigh... I'm aware of it, I'm not keen about it, but I try to minimize data leaks where I can. (I added bold for clarity)

Control Group wrote:

Though the project will be skewed, of course, since people like yourself will not participate, while people who deliberately broadcast this information will. This could undermine its impact; I can see people writing off the results as specific to the voluntary-sharing demographic, rather than as intrinsic to much of modern life.

"We can capture the passive data off your phone—how far you move, how fast, etc.," Smolan said

I had not thought about that "how fast" portion of tracking; how long before we see governments request access to GPS speed data? This could potentially be an even more effective method of revenue generation than speed trap cameras (sarcasm).

It's already out there. If I recall correctly, this is exactly how venues such as Google Maps display traffic updates. Users' phones (anonymously?) upload GPS location, which pinpoints which road they are on, along with velocity information. If a particular stretch of road is rated for 65 MPH and phones are reporting only 30 MPH, that would indicate congestion. A bunch of phones not creeping along and you might have a wreck. Add a Z axis to the map's normal XY coordinate system and you'd have Dukes of Hazzard style traffic information, which might be kinda cool now that I think about it.

Source?

From Google blog post:

Quote:

If you use Google Maps for mobile with GPS enabled on your phone, that's exactly what you can do. When you choose to enable Google Maps with My Location, your phone sends anonymous bits of data back to Google describing how fast you're moving. When we combine your speed with the speed of other phones on the road, across thousands of phones moving around a city at any given time, we can get a pretty good picture of live traffic conditions. We continuously combine this data and send it back to you for free in the Google Maps traffic layers. It takes almost zero effort on your part — just turn on Google Maps for mobile before starting your car — and the more people that participate, the better the resulting traffic reports get for everybody.

It's already out there. If I recall correctly this is exactly how venues such as google maps displays traffic updates. Users phones (anonymously?) upload GPS location which pinpoints which road they are on along with velocity information. If a particular stretch of road is rated for 65MPH and phones are reporting only 3

"We anonymously combine speed and location information of GPS-enabled devices currently traveling on the road. This, combined with historic traffic data, helps us determine the traffic time estimate. IF you’d like to help make our estimates better through crowdsourcing and have a GPS-enabled phone, try using Google Maps for mobile the next time you’re in traffic. "

I didn't think anyone had any doubt about how this (incredibly great, I think) live traffic feature works in Google Maps.

"We can capture the passive data off your phone—how far you move, how fast, etc.," Smolan said

I had not thought about that "how fast" portion of tracking; how long before we see governments request access to GPS speed data? This could potentially be an even more effective method of revenue generation than speed trap cameras (sarcasm).

It's already out there. If I recall correctly, this is exactly how venues such as Google Maps display traffic updates. Users' phones (anonymously?) upload GPS location, which pinpoints which road they are on, along with velocity information. If a particular stretch of road is rated for 65 MPH and phones are reporting only 30 MPH, that would indicate congestion. A bunch of phones not creeping along and you might have a wreck. Add a Z axis to the map's normal XY coordinate system and you'd have Dukes of Hazzard style traffic information, which might be kinda cool now that I think about it.

Source?

From Google blog post:

Quote:

If you use Google Maps for mobile with GPS enabled on your phone, that's exactly what you can do. When you choose to enable Google Maps with My Location, your phone sends anonymous bits of data back to Google describing how fast you're moving. When we combine your speed with the speed of other phones on the road, across thousands of phones moving around a city at any given time, we can get a pretty good picture of live traffic conditions. We continuously combine this data and send it back to you for free in the Google Maps traffic layers. It takes almost zero effort on your part — just turn on Google Maps for mobile before starting your car — and the more people that participate, the better the resulting traffic reports get for everybody.

I'm sure there is more detailed information out there, but this spot I remembered off the top of my head.

That's when folks turn on My Location. It isn't just for anyone with Google Maps installed. Granted, I don't know how many folks enable that but I don't think everyone does by any stretch of the imagination.

I have only one question. Is this going to be an experiment only on iPhone users? Okay, I lied. Here's a second question. If they're only experimenting on iPhone users, why? Are they trying to figure out what is wrong with iPhone users?

I have only one question. Is this going to be an experiment only on iPhone users? Okay, I lied. Here's a second question. If they're only experimenting on iPhone users, why? Are they trying to figure out what is wrong with iPhone users?

The Article wrote:

The application will allow iOS and Android smartphone users to participate actively by answering survey questions and passively through data collected from the phone's GPS and other sensors.

I helpfully added the bold, since this was apparently difficult for you to read the first time. No thanks necessary.

I have only one question. Is this going to be an experiment only on iPhone users? Okay, I lied. Here's a second question. If they're only experimenting on iPhone users, why? Are they trying to figure out what is wrong with iPhone users?

No, there's an Android version of the app planned.

It was trivial to find that out but, of course, being the Internet we can't expect to have to *OMG* click a link or something to find out.

E: Geez, I really should read threads before posting sometimes. CG was way ahead of me. I blame the narcotics in my system.

"We can capture the passive data off your phone—how far you move, how fast, etc.," Smolan said

I had not thought about that "how fast" portion of tracking; how long before we see governments request access to GPS speed data? This could potentially be an even more effective method of revenue generation than speed trap cameras (sarcasm).

Not just governments. Insurance companies. Progressive's Snapshot dongle is already in use, monitoring your car's activity via the OBD-II port and reporting back to the mothership.

As for this thing...I'm not sure how whatever data they accumulate can ever be considered remotely accurate, given that many people simply aren't going to give themselves up to such intensive monitoring.

"We can capture the passive data off your phone—how far you move, how fast, etc.," Smolan said

I had not thought about that "how fast" portion of tracking; how long before we see governments request access to GPS speed data? This could potentially be an even more effective method of revenue generation than speed trap cameras (sarcasm).

Please don't stoke my dystopian side.

Wasn't there a big fuss over TomTom navigation systems, who sold their clients' GPS/speed data to European governments so their cops can set up speed traps more effectively for more revenue?

It's already out there. If I recall correctly, this is exactly how venues such as Google Maps display traffic updates. Users' phones (anonymously?) upload GPS location, which pinpoints which road they are on, along with velocity information. If a particular stretch of road is rated for 65 MPH and phones are reporting only 30 MPH, that would indicate congestion.

No, it's not happening.

The GPS chipset has significant power requirements. Doing this would not only drain your battery, but it would heat your phone up to the point where it is noticeably hot, similar to playing a 3D game for an hour.

There is some tracking going on... but it's very minimal. For example, an iPhone will send a list of visible cell towers and wifi networks to Apple's datacentre, and then Apple will respond with the GPS location of those towers/wifi networks. If you send 15 different towers/wifi networks (it sends all of the ones it can see, not just ones you are connected to) and Apple only knows where 13 of them are... then it will add the 2 missing entries to its own database, assuming they are close by to the ones it does know about.

By putting all of this data together over a long period of time, they are able to get a fairly good lock on the position of all the towers and wifi networks. But they do not get a good lock on the location of any individual device - because the devices cache the data... if some time in the last several months your iPhone has already looked up the location of all the cell towers/wifi networks close to your home, then it will not ask for those locations again, it'll be able to calculate your geolocation without any network requests - preserving your privacy as much as possible.
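The lookup-and-cache scheme this comment describes can be sketched roughly as follows. It models only the commenter's description, not any documented Apple implementation: the function names, the plain dictionaries standing in for the server database and the on-device cache, and the simple centroid used to place unknown stations are all assumptions for illustration.

```python
def ids_to_request(visible_ids, device_cache):
    """Return only the tower/wifi IDs the device has not already cached.

    IDs already in the (hypothetical) on-device cache trigger no network request.
    """
    return [i for i in visible_ids if i not in device_cache]


def place_unknown_stations(visible_ids, known_positions):
    """Guess positions for stations the server has never seen.

    Assumes unknown stations sit near the known ones reported alongside
    them, and uses the centroid of the known (lat, lon) pairs. Averaging
    coordinates is only reasonable for points that are close together.
    """
    known = [known_positions[i] for i in visible_ids if i in known_positions]
    if not known:
        return {}  # nothing to anchor the guess to
    lat = sum(p[0] for p in known) / len(known)
    lon = sum(p[1] for p in known) / len(known)
    return {i: (lat, lon) for i in visible_ids if i not in known_positions}


# Hypothetical data: two known stations and one new one seen together.
server_db = {"tower-a": (10.0, 20.0), "wifi-b": (12.0, 22.0)}
print(place_unknown_stations(["tower-a", "wifi-b", "wifi-c"], server_db))
print(ids_to_request(["tower-a", "wifi-c"], {"tower-a": (10.0, 20.0)}))
```

The second function is where the privacy argument in the comment lives: a device that has already cached its neighbourhood would have nothing left to ask the server about.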

Traffic data comes from a mix of voluntary crowdsourced data (such as Waze), and predominantly government funded efforts where they have sensors under the road. Pretty much every traffic light (in most countries of the world) has sensors under the road to help decide whether the light should be green or red right now. On streets with massive peak hour traffic, these will often also report the data back to the cloud, basically sending a count of how many cars have driven over in the last few minutes, and also how fast they were moving (all cars are about the same length, so by reading how long there was a car above the sensor you can get a pretty good idea how fast it is travelling).
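The under-road sensor estimate mentioned above comes down to one division: the distance a car covers while it is over the sensor, divided by how long the sensor stayed occupied. A minimal sketch, assuming a typical car length of 15 feet and a 6-foot detection zone (both illustrative defaults, not figures from the comment):

```python
def loop_speed_mph(occupancy_s, vehicle_len_ft=15.0, loop_len_ft=6.0):
    """Estimate speed from how long one vehicle occupied a road sensor.

    The sensor reads "occupied" from the moment the front bumper enters
    the detection zone until the rear bumper leaves it, so the car covers
    its own length plus the zone length in that time. Both lengths are
    assumed defaults.
    """
    if occupancy_s <= 0:
        raise ValueError("occupancy time must be positive")
    feet_per_second = (vehicle_len_ft + loop_len_ft) / occupancy_s
    return feet_per_second * 3600 / 5280  # ft/s -> miles per hour


# A car that keeps the sensor occupied for half a second.
print(round(loop_speed_mph(0.5), 1))
```

The estimate is only as good as the "all cars are about the same length" assumption; a long truck at the same speed would look like a slower car.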

It's already out there. If I recall correctly, this is exactly how venues such as Google Maps display traffic updates. Users' phones (anonymously?) upload GPS location, which pinpoints which road they are on, along with velocity information. If a particular stretch of road is rated for 65 MPH and phones are reporting only 30 MPH, that would indicate congestion.

No, it's not happening.

The GPS chipset has significant power requirements. Doing this would not only drain your battery, but it would heat your phone up to the point where it is noticeably hot, similar to playing a 3D game for an hour.

??

You mean, like when I use my phone as my navigation device in my car? You say this like it makes using GPS for lengthy durations infeasible, but I'm quite certain I do it for hours at a time not infrequently. I'll be doing it for five and a half hours tomorrow, in fact.

And once you're already doing that, I don't see any reason to think Google isn't using that data. After all, they're getting your location data the whole time; the maps and routefinding aren't done on-phone. It would certainly be the easiest way to get traffic data.

And since traffic data shows up on my phone in my city, and I happen to know from a guy who works in the DOT that we don't provide real time traffic information to third parties (e.g. Garmin, which is why paying for their traffic service in Madison is pointless), it seems a safe bet that this is how my phone can show me real time traffic conditions as it guides me to my destination.