Only a few bad apples?

November 2015

Apple recently announced that it had started to remove hundreds of apps from its App Store because they were sending personal data without the users’ knowledge or permission. Research by SourceDNA, building on a previous study conducted by researchers at Purdue University, found that personal information about the apps’ users, including email addresses and device identifiers, was being sent directly to the servers of Chinese advertising agency Youmi. Software developers themselves may have been entirely unaware that data collection was happening through code automatically inserted by the software development kit (SDK) provided by Youmi.

The presence of apps on Apple’s official App Store that were violating its security and privacy guidelines suggests that enforcement of these guidelines has not been fully effective. This must be a concern as end users cannot be expected to unpick precisely what information an app collects, and how that information is being used. They need to trust these platforms not to distribute apps that would leak personal data.

Checks and balances

App platforms (such as Google Play and the Apple App Store) set out privacy and security guidelines with which app developers need to comply. They perform more or less stringent checks and monitor user complaints and user feedback to identify problems, but rely to a large extent on developers sticking to contractual obligations and data protection laws. Guidelines provided by the UK Information Commissioner’s Office (ICO) state that app developers are “responsible for understanding the behaviour of any software components” that they incorporate into their apps, pointing out that “some app development frameworks include code for the purpose of advertising, which may process personal data.”1

When we looked into the collection and use of consumer data by mobile apps in a study for the UK Competition and Markets Authority (CMA) earlier this year, we found that the use of complex third party code is common in the development of apps, and that there is sometimes a lack of transparency in the descriptions of third party libraries and SDKs. Whilst some app developers will use their own proprietary code, others rely on third party libraries or SDKs for providing the building blocks for app development. Advertisers, analytics firms or other service providers such as Facebook happily offer such libraries and SDKs ‘for free’, getting their return through in-app advertising, app analytics and other services and features. Having access to such tools has helped a large number of developers to turn their ideas into marketable software and increase competition and choice for end users.

However, there are also downsides. In our report we raised the concern that developers may not be fully aware of the details of how such third-party code works. Even if developers tried to understand what information their apps would be collecting, or what control the SDK provider might have over the way the app works, descriptions may not always be helpful. This was a particular concern for the gaming sector where a large number of small-scale developers are active, with many of them relying on third-party libraries. Unlike larger firms, these small-scale developers may also be less concerned about the damage to their reputation from failing to comply with the security guidelines set out by the platforms. They may therefore put less effort into trying to understand the possible risks and side effects of the tools they are using.

At the time of our report, we considered it difficult to verify whether app developers comprehensively adhere to the relevant policies and whether developers are using data for the stated purpose. When we raised this issue in an interview with the UK ICO, the ICO considered that many large developers should be sufficiently aware and conscious of data protection legislation. However, it also acknowledged that there is a risk that ‘amateur’ developers may not be informed about the exact type and volume of data collected when using third-party code and thus not able to tell users what personal data would be processed by their app. The ICO said it was working to improve awareness amongst app developers by providing guidance on data protection laws.

As long as awareness of the potential risks associated with the use of ‘free’ third party tools has some catching up to do, there is a need for more stringent checks to ensure that users are protected. Whilst Apple reviews all apps submitted to its App Store (and claims to reject apps that use non-public APIs2, as set out in its guidelines), hundreds of apps with non-public APIs have passed the review system and have been able to collect personal information without user permission.

The Purdue researchers state that “[w]hile Apple has never publicly disclosed the technical details of App Review, these attacks indicate that the current vetting process may be based on static analysis which is vulnerable to obfuscation.” They propose a new iOS application review system called iRiS that uses an iterative dynamic analysis approach. It is this system that allowed the researchers to find that 150 out of 2019 applications on the official App Store they examined were using non-public APIs, including 25 security-critical APIs that access sensitive user information, such as device serial number, and that Youmi was collecting private user information through its advertisement-serving library. Whilst this dynamic approach is more powerful than static analysis, it is also slower and therefore more costly.
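The difference between the two approaches can be illustrated with a toy sketch. It is not Apple's review process or the iRiS system, whose details are not described here; the `Device` class, the `read_serial_number` name and both scan functions are hypothetical stand-ins chosen for illustration. The sketch only shows why a check that inspects the text of a program can miss a restricted call whose name is assembled at run time, while a check that runs the program can observe it.

```python
# Toy illustration only: all names below are hypothetical, not real iOS APIs.
FORBIDDEN = {"read_serial_number"}


class Device:
    """Stand-in for a platform framework exposing one restricted call."""

    def read_serial_number(self):
        return "SERIAL-0000"


# App source that assembles the restricted name at run time,
# so the full name never appears literally in the code.
APP_SOURCE = '''
def run(device, lookup):
    name = "_".join(["read", "serial", "number"])
    return lookup(device, name)()
'''


def static_scan(source):
    """Flag forbidden names that appear literally in the source text."""
    return {name for name in FORBIDDEN if name in source}


def dynamic_scan(source):
    """Execute the app and record every name it actually resolves."""
    seen = set()

    def lookup(obj, name):
        seen.add(name)  # observe the call as it happens
        return getattr(obj, name)

    namespace = {}
    exec(source, namespace)
    namespace["run"](Device(), lookup)
    return seen & FORBIDDEN


print(static_scan(APP_SOURCE))   # prints set()
print(dynamic_scan(APP_SOURCE))  # prints {'read_serial_number'}
```

The dynamic check finds the call only because it executes the app, which is also why this kind of analysis tends to take longer per app than scanning its text.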

Apple’s decision to remove the offending apps is a first step, but there is clearly a need for more, and more thorough, checks going forward. Moreover, end users should be aware of the efforts taken by platforms to check their apps – and the instances where they fail. Ultimately, trust in the effective checking of apps by the platforms is key to the success of the whole ecosystem. This trust has to be earned, and the fear of losing it is a constant incentive for avoiding complacency – but only if end users value the efforts that go into checking and verifying. And a necessary condition for this is that they understand what platforms do and where they fail.

2. This is because non-public APIs may be able to access sensitive information without being detected. As Deng et al. note, “[p]rivate APIs are functions in iOS frameworks reserved only for internal uses in built-in applications. They provide access to various device resources e.g. camera, bluetooth and sensitive information e.g. serial number, device ID, which are often not regulated by runtime mechanisms. Although some resources are guarded by entitlements with MAC in recent versions of iOS, there are still many that can be accessed without mediation.”
