Methods for Anonymizing Patterns of Human Mobility

In recent years, the ability to efficiently gather location information about individuals has attracted considerable attention in the research community. There are several methods for collecting such data, but this thesis primarily considers data collected from the base stations to which mobile phones connect. Because many users have mobile subscriptions, demographic data about the users can be collected as well. However, to protect the privacy of the individual, the collected data must be anonymized. The aim of this master's thesis is to develop a method that anonymizes the data so that an individual cannot be identified with a probability above a given threshold, while preserving as much information as possible. The anonymization is divided into two parts: the first anonymizes the data describing the movement of individuals, and the second anonymizes the demographic data. The principle of k-anonymization was applied in both parts, which means that each entry in the output of the anonymization is indistinguishable from at least k − 1 other entries. Hence, an individual can be identified with a probability of at most 1/k. For the demographic data, a genetic algorithm was used to minimize a new definition of information loss, which is presented in this thesis. This definition was derived using the Kullback information.
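The k-anonymity property described in the abstract can be illustrated with a minimal Python sketch. It is not the method developed in the thesis; it only checks whether an already anonymized table satisfies the property. The function names, the column names (age, gender, cell) and the sample records are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def anonymity_level(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for all quasi-identifier columns (0 for an empty table)."""
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return min(groups.values()) if groups else 0


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every record is indistinguishable from at least k - 1 others."""
    return anonymity_level(records, quasi_identifiers) >= k


# Hypothetical, already generalized records (illustrative values only).
QUASI = ["age", "gender", "cell"]
records = [
    {"age": "20-29", "gender": "F", "cell": "A"},
    {"age": "20-29", "gender": "F", "cell": "A"},
    {"age": "20-29", "gender": "F", "cell": "A"},
    {"age": "30-39", "gender": "M", "cell": "B"},
    {"age": "30-39", "gender": "M", "cell": "B"},
]

k = anonymity_level(records, QUASI)
print(f"k = {k}, identification probability at most {1 / k}")
# prints "k = 2, identification probability at most 0.5"
```

The smallest group here has two records, so the table is 2-anonymous but not 3-anonymous, and an individual can be singled out with a probability of at most 1/2.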

Create a reference, various formats (cut and paste)

BibTeX

@mastersthesis{Nordström2012,
  author={Nordström, Martin},
  title={Methods for Anonymizing Patterns of Human Mobility},
  publisher={Institutionen för energi och miljö, Fysisk resursteori, Chalmers tekniska högskola},
  place={Göteborg},
  year={2012},
  series={Rapportserie för Avdelningen för fysisk resursteori, no: 2012:2},
  note={52},
}

RefWorks

RT Generic
SR Electronic
ID 156392
A1 Nordström, Martin
T1 Methods for Anonymizing Patterns of Human Mobility
YR 2012
PB Institutionen för energi och miljö, Fysisk resursteori, Chalmers tekniska högskola
T3 Rapportserie för Avdelningen för fysisk resursteori, no: 2012:2
LA eng
LK http://publications.lib.chalmers.se/records/fulltext/156392.pdf
OL 30