Friday, January 18, 2013

What iOS apps are grabbing your data

What iOS apps are grabbing your data, why they do it and what should be done

Early last week the personal diary app Path became the fulcrum of a massive discussion about how cavalier mobile apps are getting with harvesting your presumably personal information. A developer found that Path was sending the entire contents of its users’ Address Books to its servers, where, it emerged, the data was being stored.

Predictably, as happens whenever privacy is concerned, there was an outcry about how Path handled the data, and many decried the company for being underhanded or even flat-out lying about its procedures. But, as with most things, there is a bigger story here, and it turns out that what Path was doing was far from out of the ordinary.

In fact, according to independent testing shared with The Next Web by developer Paul Haddad of Tapbots, many apps that are far more popular than Path are transmitting your data, and many of them have been doing it completely without your knowledge or consent until the Path story blew up, forcing them to immediately update their apps.

During testing, Haddad discovered that Foursquare, for instance, was uploading all of the email addresses and phone numbers in your address book with no warning and no explicit consent given. This is similar to the way that Path was working pre-update and the way that Instagram worked before its recent update.

You can see from the following list that transmitting your personal information is far from out of the ordinary.

Apps that do send data, with no warning

Foursquare stands alone here as an app that was, until an update issued on February 14th, sending personal data with no warning. This is similar to the previous behavior of Path that got it in so much hot water. Since the update, Foursquare warns users before uploading data. Foursquare says that, while it was uploading the data, it was not storing it.

Foursquare (Email, Phone Numbers, no warning)

Apps that do send data, after warning

Apps in this category send copious amounts of data including email addresses, first and last names and phone numbers. Path appears to be sending the most information out of the apps that were tested.

Path (Pretty much everything including mailing addresses, after warning)

Instagram (Email, Phone Numbers, First, Last, after warning)

Facebook (Email, Phone Numbers, First, Last, after warning)

Twitter for iOS (Email, Phone Numbers, after warning)

Voxer (Email, First, Last, Phone numbers, after warning)

Apps that appear not to send data

The apps in this category do not appear to send any information, though they all link against the Address Book framework in iOS. This means that they could theoretically grab the data to use locally, in a completely natural and normal way for apps that use contact info. There is a possibility that some interaction with the app could access the data and send it to the server, but no evidence of that was found in testing.

Google+ (Nothing obvious, links against Address Book)

Find My Friends (Nothing obvious, links against Address Book)

Skype (Impossible to tell, links against Address Book)

Yahoo! Messenger (Nothing obvious, links against Address Book)

Quora (Nothing obvious, links against Address Book)

Textfree (Nothing obvious, links against Address Book)

AIM (Nothing obvious, links against Address Book)

It’s clear that the lambasting Path took in the internet press put the fear of God into any app that transmitted user data without explicit permission. This is probably a good thing, but there are still questions that need to be asked.

Why is the data needed?

It stands to reason that any applications that try to connect you with your friends might have to have access to the Address Book. That’s all well and good. But there are simple methods that allow developers to anonymize the data by ‘hashing’ the data and saving a checksum, ditching the plain text of the data, which never gets transmitted or stored on the servers at all.

This is a much more secure process than relying on HTTPS alone, which protects the data only while it is in transit and not once it reaches the server. If you’d like more detail, developer Matt Gemmell has prepared a simple and informative guide to the process.

The important point here is that developers do not have to upload plain text data to their servers in order to offer these convenience features. They can upload hashed, and therefore anonymous, data instead. Then they can use that data to provide the features without ever having seen or stored the plain information.
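As a rough illustration of the idea, here is a minimal Python sketch of hashing contact details on the device and matching them on the server. It is not the method described in Gemmell’s guide or used by any of the apps named here; the function names, the normalization rules and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def normalize_email(email: str) -> str:
    # Assumed rule: trim whitespace and lowercase so equal addresses hash equally.
    return email.strip().lower()


def normalize_phone(phone: str) -> str:
    # Assumed rule: keep digits only, so "+1 (555) 555-0100" becomes "15555550100".
    return "".join(ch for ch in phone if ch.isdigit())


def hash_identifier(value: str) -> str:
    # SHA-256 is an illustrative choice; the article does not name an algorithm.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def hashed_contacts(emails, phones):
    """Device side: build the set of hashes that would be uploaded."""
    hashes = {hash_identifier(normalize_email(e)) for e in emails}
    hashes |= {hash_identifier(normalize_phone(p)) for p in phones}
    return hashes


def find_friends(uploaded_hashes, registered_hashes):
    """Server side: registered_hashes maps a stored hash to a user id."""
    return sorted(
        registered_hashes[h] for h in uploaded_hashes if h in registered_hashes
    )


if __name__ == "__main__":
    # Hypothetical server table built when users signed up.
    server_table = {
        hash_identifier("alice@example.com"): "alice",
        hash_identifier("15555550100"): "bob",
    }
    upload = hashed_contacts([" Alice@Example.com "], ["+1 (555) 555-0100"])
    print(find_friends(upload, server_table))
```

The server only ever compares hashes, so it can tell the app which contacts are already members without receiving the plain addresses or numbers. One caveat worth knowing: phone numbers come from a small space of possible values, so a plain hash of one can be guessed by brute force, and a production design would need more than this sketch.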

The answer is likely not that these developers are evil or looking to harvest your data. Instead, it’s likely to be a simple matter of them not understanding that there are better ways to go about it. Developers are only human and many teams have limited resources.

“I think it’s a mix of being the easiest way to handle finding “friends” and lack of knowledge of alternative solutions,” Haddad told us.

We spoke to Impending’s Phill Ryu briefly and he supported that conclusion, saying that “maybe sometimes there’s intent to use it later, but I honestly think in most cases it’s just a mentality of ‘that disaster won’t happen to us, and better us working on fixing bugs in the product or adding new features.’”

That doesn’t, however, absolve developers from their handling of people’s personal data. There is also the matter of why it wasn’t disclosed that the apps were gathering the data after all. This kind of lack of disclosure is one of those things that serves to show that there is a genuine disconnect between the way that developers think about data and the way that users do.

To a developer, any bit of data that they can use to make the app work better or more seamlessly is treated as a resource to be harvested. Not to use in any evil way, but perhaps not treated with the sensitivity that should be afforded people’s personal data.

This is likely why, even when the decision was made to have the apps transmit data back, no one thought to disclose it or warn the user beforehand. It’s an incredible lapse of trust from a user standpoint, but the developers of these apps likely never thought twice. This brings up another really good question: whose job is it to make sure that your data is protected, the individual developer’s or Apple’s?

Is it Apple’s job to force developers into data disclosure?

Buried at the heart of this multilayered issue is the fact that there is nothing built into the Address Book framework that forces a developer to ask the user before accessing and transmitting the data in this way. The onus is completely on them to make sure that they say ‘hey, we’re going to grab your data and use it for this feature’.

In all honesty, action needs to be taken on both ends. Apple, as the company that sets policies for its App Store, has already stated in its terms that apps need to be forthright about how they handle user data and must never collect it without first alerting the user. Unfortunately, as we can see from the apps detailed above, this has not worked.

Apple clearly needs to change its policies in order to make it harder for developers, even incidentally, to upload your personal data, period. But the responsibility is also on developers to be more conscientious.

“I think it’s Apple’s responsibility to change their API to make access to Address Book something that must be explicitly allowed by each user,” Haddad told us. “I think it’s a developer’s responsibility to clearly lay out exactly what data is being sent and for what reasons.”

There’s no simple solution here. As developer Justin Williams notes, there are issues with simply adding another ‘warning’ box:

Tossing up another dialog asking for user confirmation doesn’t solve the problem users are faced with. It just puts a band-aid on it. At the core is a more fundamental problem in how iOS handles permissions and access to data. Basically, I have no idea what sort of permissions or access an app wants until I download it and launch it the first time. Moreover, I really don’t want to see another dialog pop up in my face as I’m using an app.

He suggests that the App Store needs more transparency when it comes to permissions, allowing users to see what data the application will be using right up front. This is similar to the way that the Android Market and Windows Marketplace work on their respective platforms.

No easy answer

There is no ‘magic button’ to be had here. Yes, Apple will very likely have to begin enforcing access to the Address Book at the system level. But there are dangers that this will lead to ‘overload’, causing users to tap right by the warning and putting us in the same spot we are now.

Developers also need to be more conscious of privacy issues, and just how sensitive they are to most users. The disconnect between how users value personal data and how developers treat that data needs to be bridged.

Until that happens, we’ll keep talking about these kinds of privacy issues.

Now, the Data

Since we found the data interesting and it really makes an impact when you see it right there in your face, we decided to include Haddad’s testing methodology, if you’d like to reproduce it, along with the data intercepted during the test.

Note that all of the data you see here was transmitted directly over HTTPS, which is relatively secure, but the data is easily (as is demonstrated by the translations below) readable once on the server. This means that it can easily be used by anyone who gains access to the server and, of course, by the companies themselves.

Haddad explains his testing methods:

All testing was done using the Charles web proxy. The device was restored to a blank state and two dummy contacts were added. The Charles SSL certificate was installed and accepted. And the proxy settings for iOS were set to point to a machine running Charles.

Each app was then launched and, for most, new accounts were created. For some, the actions that caused Address Book data to be sent happened right away; for others, I had to navigate through a few different screens. Then I searched through all the HTTPS requests to see if I could find any of the dummy address data.
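The final search step can be approximated in a few lines of Python. This is only a sketch of the idea, not Haddad’s actual procedure or any feature of Charles: it assumes the intercepted requests have already been exported as plain (url, body) pairs, and the function names and sample values are hypothetical.

```python
from urllib.parse import quote, quote_plus


def search_variants(value):
    """Return the forms in which a dummy value might appear in a request."""
    # Plain text plus the two common URL encodings of the same string.
    variants = {value, quote(value, safe=""), quote_plus(value)}
    # Phone numbers are often sent as bare digits, so add that form too.
    digits = "".join(ch for ch in value if ch.isdigit())
    if len(digits) >= 7:
        variants.add(digits)
    return variants


def find_leaks(captured_requests, dummy_values):
    """Return (url, value) pairs for every dummy value found in a request.

    captured_requests is assumed to be a list of (url, body) string tuples.
    """
    leaks = []
    for url, body in captured_requests:
        # Compare case-insensitively across both the URL and the body.
        haystack = (url + "\n" + body).lower()
        for value in dummy_values:
            if any(v.lower() in haystack for v in search_variants(value)):
                leaks.append((url, value))
    return leaks


if __name__ == "__main__":
    # Hypothetical captured traffic and dummy contact details.
    requests = [
        ("https://api.example.test/contacts",
         "emails=dummy.one%40example.com&phones=5555550100"),
        ("https://api.example.test/ping", "ok=1"),
    ]
    dummies = ["dummy.one@example.com", "(555) 555-0100"]
    for url, value in find_leaks(requests, dummies):
        print(f"{value} found in request to {url}")
```

A search like this only catches values sent in recognizable form. Data that is hashed, compressed or otherwise transformed before upload would not match, which is one reason an app can land in the “nothing obvious” category.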

As mentioned above, Foursquare updated its app earlier today to insert a warning before transmitting the data. Previous to this, it was transmitting it with no warning, similar to the way that Path and Instagram had been working.

While Twitter does warn you before you give away your info, it has also admitted that it stores your data on its servers for up to 18 months. That data can include “IP address, browser type, the referring domain, pages visited, your mobile carrier, device and application IDs, and search terms.”