Creating a De-Identified Data Set

The HIPAA Privacy Rule protects Protected Health Information (PHI) by permitting only those uses and disclosures that the Rule itself allows or that the subject of the information has authorized. However, a covered entity may create a de-identified dataset from PHI by following the HIPAA Privacy Rule's de-identification standard. The Health Information Privacy & Compliance Office and the IRB strongly encourage the use of de-identified datasets whenever possible.

To create a de-identified dataset that meets the HIPAA de-identification standard, a specific list of identifiers, and derivatives of identifiers, of the individuals and of their relatives, employers, and household members must be removed. In addition, the covered entity must have no actual knowledge that the remaining information could be used, alone or in combination with other information, to identify the individual.

The following identifiers must be removed:

Names;

All geographic subdivisions smaller than a State, including street address, city, county, precinct, zip code, and their equivalent geocodes (except that the initial three digits of a zip code may be used if, according to the current publicly available data from the Bureau of the Census, the geographic unit formed by combining all zip codes with the same three initial digits contains more than 20,000 people, and the initial three digits of a zip code for all such geographic units containing 20,000 or fewer people are changed to 000);

All elements of dates (except the year) for dates directly related to an individual, including birth date, admission date, discharge date, date of death; and all ages over 89 and all elements of dates (including the year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older;
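The zip code and date rules above can be illustrated with a minimal Python sketch. It is not an official or complete de-identification tool. The function names, the `restricted` parameter, and the placeholder prefix set are assumptions made for illustration; the set of three-digit prefixes covering 20,000 or fewer people must be built from current Census Bureau data, and the values shown here are not real data.

```python
from datetime import date
from typing import Iterable, Optional

# Placeholder only (NOT real Census data): three-digit zip prefixes whose
# combined population is 20,000 or fewer. Replace with a current list.
PLACEHOLDER_RESTRICTED_ZIP3 = frozenset({"999"})


def generalize_zip(zip_code: str,
                   restricted: Iterable[str] = PLACEHOLDER_RESTRICTED_ZIP3) -> str:
    """Keep the first three digits of a zip code, or "000" for small areas."""
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    if len(digits) < 3:
        # Assumption: treat malformed values conservatively.
        return "000"
    prefix = digits[:3]
    return "000" if prefix in set(restricted) else prefix


def generalize_age(age: int) -> str:
    """Report ages over 89 as the single category "90+"."""
    return "90+" if age >= 90 else str(age)


def age_on(birth: date, as_of: date) -> int:
    """Age in completed years on the date as_of."""
    had_birthday = (as_of.month, as_of.day) >= (birth.month, birth.day)
    return as_of.year - birth.year - (0 if had_birthday else 1)


def generalize_date(d: date) -> int:
    """Keep only the year of a date such as admission or discharge."""
    return d.year


def birth_year_or_none(birth: date, as_of: date) -> Optional[int]:
    """Keep the birth year unless it would reveal an age of 90 or older."""
    return None if age_on(birth, as_of) >= 90 else birth.year
```

For example, `generalize_zip("12345-6789")` keeps only `"123"`, while a birth date that implies an age of 90 or older yields `None` from `birth_year_or_none`, so the record would carry only the aggregated `"90+"` category. Returning `"000"` for malformed zip codes is a conservative design choice in this sketch, not a requirement stated in the Rule.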