How papers leak

Karuna Pande Joshi

And is there anything companies can do to prevent leaks?

The leak of 11 million confidential documents (about 2.6 terabytes of data) from the Panamanian law firm Mossack Fonseca & Co. has been credited by some media outlets to a whistle-blower who wants to remain anonymous. Others have suggested it came from an attack by hackers.

If it was a whistle-blower, then like the Snowden leaks, the Manning leaks and even the Pentagon Papers leak back in 1971, this incident once again underlines that insiders are the primary source of attacks on organizations’ critical and sensitive data sets. Organizations spend significant resources to firewall their data against attack by external hackers and cybercriminals. However, they often face a dilemma: they must prevent leaks by the very employees who need to use that data for their daily work.

Often companies do not have adequate strategies to ensure data protection from insider threats. They are particularly vulnerable to people who have access to critical information, and violate the trust of their organization by sharing the data with media outlets or posting it on the Internet. These leaks can damage a company’s income and reputation. But how do they actually happen?

The legal, social and ethical issues of leaking are continually under discussion. But as a researcher of cybersecurity, I am most interested in how exactly these breaches are carried out.

How much data, really?

The first question that comes to mind is how such a large amount of data can be stolen. The Panama Papers leak contains vastly more data than many previous leaks. Indeed, the journalists who received the data took more than a year to review it all.

But in computer terms it really isn’t very big. That amount of information can be stored on any standard hard drive available on the market today for around US$100. Portable USB drives storing 1 terabyte (TB) of data are also readily available. So a potential whistle-blower could save 2.6 TB of data on a couple of encrypted hard drives or USB keys and drop them in the mail to journalists.

Worse, from the perspective of security-minded companies (and better, in the view of potential leakers), is the increasing capacity of smartphones. Devices with 128 GB of memory will soon be the norm, approaching the hard drive capacity of average laptops.
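The storage arithmetic above can be checked with a short calculation. This is only a rough sketch: it assumes decimal units (1 TB = 10^12 bytes, as drive makers count them), no compression, and devices that can be filled completely; the helper name devices_needed is made up for illustration.

```python
import math

GB = 10**9          # decimal gigabyte (assumption: vendor-style units)
TB = 10**12         # decimal terabyte

leak_bytes = 2_600 * GB      # about 2.6 TB, the figure reported for the leak
documents = 11_000_000       # about 11 million documents


def devices_needed(total_bytes, device_bytes):
    """Smallest whole number of devices that can hold total_bytes."""
    return math.ceil(total_bytes / device_bytes)


drives = devices_needed(leak_bytes, 1 * TB)      # 1 TB portable USB drives
phones = devices_needed(leak_bytes, 128 * GB)    # 128 GB smartphones
avg_kb = leak_bytes / documents / 1000           # average document size in KB

print(f"1 TB drives needed:    {drives}")
print(f"128 GB phones needed:  {phones}")
print(f"Average document size: {avg_kb:.0f} KB")
```

Under these assumptions the whole archive fits on three 1 TB drives, and the average document is only a few hundred kilobytes, so even a single phone could carry hundreds of thousands of them.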

Anonymous sharing

Whistle-blowers want to share their data anonymously, so it cannot be traced back to them. They often use technologies like Tor, a distributed network of volunteer-run computers all around the world that enables users to set up anonymous encrypted channels of communication with either helpers or recipients of information.

Law enforcement agencies cannot easily determine a Tor user’s browsing history or location. Users can also delete any records their computers may keep of covert communications. This does require the whistle-blower to be conversant with the right technological solutions to cover their tracks.

Preventing and detecting leaks

It is technically possible to prevent many leaks. Computer systems can be configured to prevent important documents from being printed, moved or copied – whether as entire files or piecemeal, by a person who opens the document and pastes its content into another file. Data can be encrypted so it can only be decoded on a single particular machine. Systems can be set to disable network connections when working with certain files, preventing them from being shared electronically.
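As a rough illustration of what such a configuration amounts to, the sketch below models a document-handling policy as a lookup from a sensitivity label to the actions permitted on files carrying that label. The labels, the Action names and the is_allowed helper are all hypothetical; real data-loss-prevention products enforce rules like these inside the operating system or the document viewer, not in a few lines of application code.

```python
from enum import Enum


class Action(Enum):
    READ = "read"
    PRINT = "print"
    COPY = "copy"
    MOVE = "move"
    SEND = "send"    # share over a network connection


# Hypothetical policy: which actions each sensitivity label permits.
POLICY = {
    "public": set(Action),                                  # everything allowed
    "internal": {Action.READ, Action.PRINT, Action.COPY},
    "restricted": {Action.READ},                            # view only
}


def is_allowed(label, action):
    """Return True if the policy permits the action on files with this label.

    Unknown labels are denied by default.
    """
    return action in POLICY.get(label, set())


print(is_allowed("restricted", Action.READ))   # viewing is permitted
print(is_allowed("restricted", Action.COPY))   # copying is blocked
```

Denying unknown labels by default is a deliberate choice in this sketch: a file that has not been classified is treated as if it were sensitive.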

But restricting access to a document so severely would make it very difficult to work with – especially if a file had multiple legitimate authors collaborating. Such a file would be effectively unusable: employees would either stop using those systems or find ways around the safeguards. Moreover, even those limits cannot prevent all leaks, because it is difficult to anticipate every channel through which information may leak, especially covert channels.

Short of frisking every employee daily for unauthorized electronic devices, it is difficult to prevent breaches by a determined insider. Banning or limiting employees’ use of their personal phones would be largely impractical, and could hinder employee performance. Scanning every phone would be arduous. In any case, many companies give insiders permission to access private data and store it on their personal devices.

In today’s world, where employees often collaborate on documents and use shared data sets to complete their work, it is nearly impossible to prevent every insider attack. Logging every action on every file could serve as a deterrent, but even then there would be no guarantee that the logs would reveal the identity of a culprit.
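A toy example shows why a complete log still may not single anyone out. The sketch below is not any real auditing product; the AuditLog class, the user names and the file paths are invented for illustration. It records who did what to which file and then asks who touched a leaked document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    user: str
    action: str   # e.g. "open", "copy", "print"
    path: str


class AuditLog:
    """Minimal in-memory record of file actions (illustrative only)."""

    def __init__(self):
        self._events = []

    def record(self, user, action, path):
        self._events.append(AccessEvent(user, action, path))

    def users_who_touched(self, path):
        """Every user with at least one logged action on this file, sorted."""
        return sorted({e.user for e in self._events if e.path == path})


log = AuditLog()
log.record("alice", "open", "clients/ledger.xlsx")
log.record("bob", "open", "clients/ledger.xlsx")
log.record("bob", "copy", "clients/ledger.xlsx")
log.record("carol", "open", "hr/payroll.csv")

suspects = log.users_who_touched("clients/ledger.xlsx")
print(suspects)
```

Here the log narrows the field to two people who both had legitimate reasons to open the file. On a widely shared data set that list could run to hundreds of names, which is why a log deters more than it identifies.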

To anticipate the next insider attack, we likely need to look to the discipline of behavioral science, which is addressing this problem by attempting to identify people who might become leakers, rather than by trying to restrict access to data or detect its duplication.

Whoever purloined the data behind the Panama Papers made a big impact but did not need any particularly deep technical skills. And whether that person was an insider or not, it’s clear that there are few technical fixes – and that disclosures of this type are increasingly likely.