Re-identification Risk and Proactive Disclosure of Data for Open Government: Lessons from the Supreme Court of Canada?

On April 24, 2014 the Supreme Court of Canada handed down a decision which at least touches upon the thorny question of what constitutes “personal information”. This question is particularly important to governments that are contemplating the proactive release of government data under commitments to open government. The issue is far from academic, as federal, provincial and municipal governments in Canada have all taken steps in this direction. Indeed, the Ontario government has just signaled its own commitment to open government, which will include proactive disclosure of government data.

Most public sector data protection laws in Canada define “personal information” as, in essence, information about an identifiable individual. This means that “personal information” is more than just information that directly identifies an individual (their name or social insurance number, for example); it also includes any other information that, if linked with other available information, could lead to the identification of a specific individual. Thus, a government contemplating the proactive disclosure of data sets under an open data program would have to ensure that the data sets were free not just of individuals’ names and identification numbers, but also of data that could be linked back to specific individuals. This can be more challenging than one might think, particularly as we live in an environment where more and more data is becoming easily available from both public and private sector sources, and where search engines, algorithms and computing power make mining and matching information increasingly fast, inexpensive and easy.
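To make the linkage concern concrete, the following is a minimal sketch of one common screening step, sometimes described as a k-anonymity style check. It counts how many records share each combination of indirectly identifying fields and flags combinations shared by fewer than k records. The field names, the made-up sample records and the threshold k are all assumptions chosen for illustration; none of them comes from the decision or from any statute.

```python
from collections import Counter


def small_groups(records, quasi_identifiers, k=5):
    """Return the quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical, made-up records with names and ID numbers already removed.
records = [
    {"fsa": "K1A", "birth_year": 1970, "sex": "F"},
    {"fsa": "K1A", "birth_year": 1970, "sex": "F"},
    {"fsa": "K1A", "birth_year": 1985, "sex": "M"},
    {"fsa": "M5V", "birth_year": 1990, "sex": "M"},
]

# Combinations that describe only one record are candidates for closer review.
risky = small_groups(records, ["fsa", "birth_year", "sex"], k=2)
print(risky)
```

A check like this only shows where a data set is thin. It does not show that other data actually exists with which those records could be matched, so a flagged group is a prompt for further assessment rather than proof that the information is about an identifiable individual.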

The case – Ministry of Community Safety and Correctional Services v. Information and Privacy Commissioner (Ontario) – involved an access to information request made by a journalist to the Ministry of Community Safety and Correctional Services. The journalist sought the disclosure of the number of registered sex offenders in Ontario who lived within each postal code forward sortation area (the area designated by the first three characters of a postal code). The journalist did not seek access to this information by full postal code, presumably because this finer level of detail might lead to the identification of those individuals, particularly where there were relatively few residences associated with a particular postal code. While Ontario maintains a sex offender registry, the locations of the registered sex offenders are not public information. The registry is intended primarily for use by law enforcement officials. The journalist planned to create a map which would allow the public to see a more generalized geographic representation of where registered sex offenders in Ontario were living. The Ministry refused to disclose this information on the basis that it could lead to the identification of specific individuals. It argued that the information could not be disclosed not only because it fell within the definition of “personal information” but also because its release would interfere with law enforcement, endanger the life or physical safety of the individuals, and might hamper the control of crime (by making sex offenders less likely to comply with the registration requirements out of fear of identification). All of these grounds are exceptions to the disclosure of information under the province’s Freedom of Information and Protection of Privacy Act (FIPPA). The journalist appealed the Ministry’s refusal to the Office of the Information and Privacy Commissioner, which ordered that the information be disclosed. The Commissioner’s decision was upheld by the courts all the way up to the Supreme Court of Canada, which also upheld the order to release the information sought by the journalist.

The Supreme Court considered three issues: the level of deference due to the decision of the Information and Privacy Commissioner, whether access was ordered for purposes inconsistent either with FIPPA or with the law governing the sex offender registry, and whether the Commissioner’s interpretation of the scope of the law enforcement exceptions to information disclosure was appropriate. Yet underlying these issues was a key question which itself was not in dispute before the Court: whether the information sought constituted personal information – in other words, information about an identifiable individual. The Commissioner’s approach to this question was not part of the appeal. Yet once it was accepted that the information sought was not personal information, it would be difficult to find that any of the harm-based exceptions to disclosure applied, no matter what interpretation they were given. Information that could not lead to the identification of specific individuals would be highly unlikely to cause them harm and, in theory at least, less likely to deter them from complying with the registry requirements.

In refusing to disclose the information, the Ministry had argued that the information being sought was personal information because it could be linked with other available information in order to re-identify individuals. This issue of the potential for re-identification is central to the question of whether information qualifies as personal information, and in the context of open data, it will be crucial in decisions about whether certain data sets may be proactively disclosed. It is important to note that the Commissioner in this case observed that the Ministry had not advanced any cogent evidence of the potential for re-identification. This point was picked up by the courts below, and the Supreme Court of Canada agreed. Writing for the Court, Justice Cromwell noted that “the Commissioner determined that the Ministry did not provide any specific evidence explaining how the Record could be cross-referenced with other information in order to identify sex offenders. We find this to be a reasonable determination.” (at para 60) Indeed, very little specific evidence was provided, and the Court dismissed more general literature on re-identification as “unconvincing and generic scholarly research on ‘identifiability’.” (at para 60) The Court also agreed with the Commissioner’s rejection of the Ministry’s assertions that more information might someday be available on the Internet that could, if matched with the sex offender data, lead to identification. Justice Cromwell stated: “it must be stressed that the Ministry only referred vaguely to the unpredictability of internet developments and did not provide any specifics about how identification could occur.” (at para 61).

The case involved a dispute over the release of data in the context of a specific access to information request. Yet there are lessons here for those tasked with identifying data sets for proactive release for the purposes of open data. These might be summarized as follows:

The definition of “personal information” under access to information and data protection laws in Canada tends to be broad and will include information about an identifiable individual. Thus it is important to consider not just whether there are specific personal identifiers in the data set, but whether this data could, if matched with other available data, lead to the identification of specific individuals.

In considering the likelihood of re-identification, it would seem that the balance between the goals of transparency and accountability and those of privacy protection does not require ‘worst case scenario’ or extreme hypothetical speculation. In the access to information context, the department or agency in question would bear the burden of justifying a refusal to disclose, and that justification must rest on more than mere assertion. Presumably the same standard would apply to proactive disclosure. The law does not require excessive caution.

In considering the possibility of re-identification, it might also be appropriate to consider how sensitive the personal information would be if re-identification were achieved. In Ministry of Community Safety, the information at risk of disclosure through re-identification was highly sensitive – the location of registered sex offenders. Presumably this might give some an incentive to attempt re-identification, perhaps warranting greater caution in the decision about whether to release the information. Nevertheless, the Commissioner, and ultimately the courts, still required some evidence that re-identification was possible.