Health record details exposed as 'de-identification' of data fails

The private health records of one in 10 Australians have been unwittingly exposed by the Department of Health in an embarrassing blunder that could reveal whether someone is on HIV medication, has terminated a pregnancy, or is seeing a psychologist.

Unique patient records matching the online public information of seven prominent Australians, including three former or current MPs and an AFL footballer, were revealed in a study by the University of Melbourne's School of Computing and Information Systems.

A report published on Monday by the university's Dr Chris Culnane, Dr Benjamin Rubinstein and Dr Vanessa Teague outlines how de-identified historical health data from the Australian Medicare Benefits Scheme (MBS) and the Pharmaceutical Benefits Scheme (PBS), released to the public in August 2016, can be re-identified by matching records against information already known about a person.

"We found that patients can be re-identified, without decryption, through a process of linking the unencrypted parts of the record with known information about the individual such as medical procedures and year of birth," Dr Culnane said.

"This shows the surprising ease with which de-identification can fail, highlighting the risky balance between data sharing and privacy."
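The linking process Dr Culnane describes can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the records are synthetic, and the field names (`pid`, `yob`, `events`) and the `link` helper are hypothetical rather than taken from the released dataset or the researchers' code. It shows only the general idea that a scrambled ID offers no protection when the remaining fields can be matched against publicly known facts.

```python
# Hypothetical, entirely synthetic "de-identified" records: the patient ID is
# scrambled, but year of birth and dated medical events are left in the clear.
records = [
    {"pid": "a91f", "yob": 1975,
     "events": {("childbirth", "2004-03"), ("knee surgery", "2010-07")}},
    {"pid": "c27d", "yob": 1975, "events": {("childbirth", "2004-03")}},
    {"pid": "e553", "yob": 1982, "events": {("knee surgery", "2010-07")}},
    {"pid": "0b8e", "yob": 1975, "events": {("knee surgery", "2010-07")}},
]


def link(records, yob, known_events):
    """Return the scrambled IDs of records consistent with the known facts."""
    return [
        r["pid"]
        for r in records
        if r["yob"] == yob and known_events <= r["events"]  # subset test
    ]


# Facts an attacker might gather from news reports or a public biography.
one_fact = link(records, 1975, {("childbirth", "2004-03")})
two_facts = link(
    records, 1975, {("childbirth", "2004-03"), ("knee surgery", "2010-07")}
)

print(one_fact)   # two candidate records remain
print(two_facts)  # a single record remains
```

In this toy data one known event leaves two candidates, while a second known event narrows the match to a single record, whose full history is then exposed without any decryption.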

While a unique match may not always be accurate, Dr Rubinstein said confidence could be improved by cross-referencing other data.

"Because only 10 per cent of Australians are included in the sample data, there can be a coincidental resemblance to someone who isn't included," he said.

"We can improve confidence by cross-referencing with a second dataset of population-wide billing frequencies. We can also examine uniqueness according to the characteristics of commercial datasets we know of, such as bank billing data."
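A simplified calculation suggests why population-wide frequencies matter here. The model below is an assumption for illustration, not the researchers' method: it supposes that `k` people in the whole population (the target included) share the known characteristics, and that each person lands in the sample independently with probability `p` = 1/10. The function name `match_confidence` is hypothetical.

```python
from fractions import Fraction


def match_confidence(k, p=Fraction(1, 10)):
    """Chance that a lone matching record in the sample belongs to the target.

    Assumes k people in the population share the known characteristics and
    each is sampled independently with probability p (a simplification).
    """
    if k < 1:
        raise ValueError("k must count the target, so k >= 1")
    # The target is sampled and the other k - 1 lookalikes are not.
    target_only = p * (1 - p) ** (k - 1)
    # Exactly one of the k lookalikes is sampled, whoever it is.
    exactly_one = k * p * (1 - p) ** (k - 1)
    return target_only / exactly_one


for k in (1, 2, 5):
    print(k, match_confidence(k))
```

Under these assumptions the result reduces to 1/k regardless of the sampling rate, so a match that is unique within the sample is only trustworthy if the combination is also rare across the whole population, which is the kind of information a second, population-wide dataset could supply.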

Privacy analyst and Lockstep consultant Stephen Wilson said the breach damaged public confidence in health policy makers and data custodians.

"It's a huge breach of trust," he said.

"Promises of 'de-identification' and 'anonymisation' made by health officials, and ABS too in connection with census data releases, have been shown to be erroneous.

"The ability to re-identify patients from this sort of public release is frankly, in my view, catastrophic. Real dangers are posed to patients with socially difficult conditions.

"It beggars belief that any official would promise 'anonymity' any more. These promises cannot be kept."

"In this case, clearly more work needs to be done to protect individuals' identities," he said.

"My hope is that the government embraces responsible research like this and strives to improve confidentiality rather than penalise those seeking to report deficiencies such as this."

The federal Department of Health was notified about the issue in December last year.

"The Department of Health takes this matter very seriously and had already referred this to the privacy commissioner," a Department of Health spokesperson told Fairfax Media.

"The project was halted and remains halted, and the dataset was removed immediately."

The spokesperson said the department had since taken further steps to protect and manage data.

"The department is working with the University of Melbourne and has already acted to improve its processes. The Department has not been aware of anyone being identified."

Meanwhile, the Office of the Australian Information Commissioner, which houses Australia's privacy commissioner, said it was investigating the publication of the datasets.

"The investigation was opened under section 40(2) of the Australian Privacy Act 1988 (Privacy Act) in late September 2016 when the Department of Health notified the OAIC that the datasets were potentially vulnerable to re-identification," a spokesperson said.

"Given the investigation into the Medicare Benefits Scheme (MBS) and Pharmaceutical Benefits Scheme (PBS) datasets is ongoing, we are unable to comment on it further at this time. However, the commissioner will make a public statement at the conclusion of the investigation."

The OAIC said it continued to work with Australian government agencies to enhance privacy protection in published datasets.

"A recent example is the De-identification Decision-Making Framework developed by CSIRO's Data61 and the OAIC. This provides guidance to Australian organisations that handle personal information on meeting their ethical responsibilities and legal obligations (such as those under the Privacy Act) when considering how datasets may be shared or released."

Dr Teague used the exposure to blast proposed changes to the national Privacy Act that would make it a criminal offence to re-identify government data that had been stripped of identifying markers.

"Legislating against re-identification will hide, not solve, mathematical problems, and have a chilling effect on both scientific research and wider public discourse," Dr Teague said.

Instead, Dr Teague said there were strong reasons to improve access to high-quality, and sometimes sensitive, data to facilitate research, innovation and sound public policy.

However, she argued there remained important technical and procedural problems to solve.

"Open publication of de-identified records like health, census, tax or Centrelink data is bound to fail as it is trying to achieve two inconsistent aims: the protection of individual privacy and publication of detailed individual records," Dr Teague said.

"We need a much more controlled release in a secure research environment, as well as the ability to provide patients greater control and visibility over their data."