Abstract

p-Sensitive k-anonymity model has been recently defined as a sophistication of k-anonymity. This new property requires that there be at least p distinct values for each sensitive attribute within the records sharing a set of quasi-identifier attributes. In this paper, we identify the situations when the p-sensitive k-anonymity property is not enough for the sensitive attributes protection. To overcome the shortcoming of the p-sensitive k-anonymity principle, we propose two new enhanced privacy requirements, namely
pþ-sensitive k-anonymity and ðp; aÞ-sensitive k-anonymity properties. These two new introduced models
target at different perspectives. Instead of focusing on the specific values of sensitive attributes, pþ-sensitive
k-anonymity model concerns more about the categories that the values belong to. Although ðp; aÞsensitive
k-anonymity model still put the point on the specific values, it includes an ordinal metric system to measure how much the specific sensitive attribute values contribute to each QI-group. We make a thorough theoretical analysis of hardness in computing the data set that satisfies either pþ-sensitive kanonymity or ðp; aÞ-sensitive k-anonymity. We devise a set of algorithms using the idea of top-down specification, which is clearly illustrated in the paper. We implement our algorithms on two real-world data sets and show in the comprehensive experimental evaluations that the two new introduced models are superior to the previous method in terms of effectiveness and efficiency.