User Movement Simulations Project

Large-Scale Human Mobility Simulations

Abstract

The evaluation of privacy-preserving techniques for location-based services (LBS) is often based on simulations of mostly random user movements, which only partially capture real deployment scenarios. Our results show that, compared to a context-aware simulator, a random user movement simulator leads to significantly different results for a spatial-cloaking algorithm, under-protecting in some cases and over-protecting in others [1]. Indeed, with a context simulator it is possible to design models for agents, places and context; for example, one can define particular places of aggregation and let users dynamically choose which place to reach and how long to stay there. This behavior, which is not modeled by mostly random user movement simulators, can significantly affect the results of the spatial-cloaking algorithm.

In our research we created several datasets of simulated user movements using a customized version of the SIAFU agent-based context-aware simulator [2]. Each simulation is specifically designed for a particular family of LBSs. On this page we briefly describe some of these datasets and make them publicly available. For a more detailed description of the context simulator we used to generate the datasets, please see [1].

The simulation includes a total of 30,000 home buildings, 10,000 office buildings and 1,000 entertainment places; the first two values are strictly related to the considered number of inhabitants of Milan, while the third is based on real data from public sources, which also provide the geographical distribution of the places. Note that the distributions of home and office buildings are Gaussian: home buildings are more concentrated in the outskirts of the city, while office buildings are more concentrated in the center.
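As a rough illustration of how such a placement could be generated, the sketch below samples office locations from a Gaussian centered on the city center and home locations from a Gaussian over the distance from the center. The coordinate system, the spreads and the ring distance are all assumptions made for this example; they are not the values or the procedure used in the actual simulator.

```python
import math
import random

# All numeric parameters are illustrative assumptions, not simulator values.
CITY_CENTER = (0.0, 0.0)   # km, arbitrary local coordinates
OFFICE_SIGMA_KM = 2.0      # spread of offices around the center
HOME_RING_KM = 6.0         # typical distance of homes from the center
HOME_SIGMA_KM = 1.5        # spread of homes around that distance


def sample_office(rng):
    """Office: 2D Gaussian centered on the city center."""
    return (rng.gauss(CITY_CENTER[0], OFFICE_SIGMA_KM),
            rng.gauss(CITY_CENTER[1], OFFICE_SIGMA_KM))


def sample_home(rng):
    """Home: Gaussian distance from the center, uniformly random direction."""
    r = abs(rng.gauss(HOME_RING_KM, HOME_SIGMA_KM))
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return (CITY_CENTER[0] + r * math.cos(theta),
            CITY_CENTER[1] + r * math.sin(theta))


def mean_distance(points):
    """Average distance of a set of points from the city center, in km."""
    total = sum(math.hypot(x - CITY_CENTER[0], y - CITY_CENTER[1])
                for x, y in points)
    return total / len(points)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
offices = [sample_office(rng) for _ in range(10_000)]
homes = [sample_home(rng) for _ in range(30_000)]
```

With these assumed parameters, `mean_distance(homes)` comes out larger than `mean_distance(offices)`, which matches the qualitative description above (homes towards the outskirts, offices towards the center).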

Following the study reported in [3], we fixed the average speed of users moving by car to 20 km/h. We also fixed the average speed of users moving on foot to 3.6 km/h.
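To give a feeling for what these speeds imply, the following small sketch converts a distance into a travel time. The two speed constants come from the text; the function names and the example distances are only illustrative.

```python
CAR_SPEED_KMH = 20.0   # average speed by car, as stated above
WALK_SPEED_KMH = 3.6   # average speed on foot, as stated above


def travel_minutes(distance_km, speed_kmh):
    """Time in minutes needed to cover distance_km at a constant speed."""
    return distance_km / speed_kmh * 60.0


def kmh_to_ms(speed_kmh):
    """Convert a speed from km/h to m/s."""
    return speed_kmh * 1000.0 / 3600.0


# Hypothetical 5 km trip by car and 1.8 km trip on foot.
car_trip_min = travel_minutes(5.0, CAR_SPEED_KMH)
walk_trip_min = travel_minutes(1.8, WALK_SPEED_KMH)
```

Note that 3.6 km/h corresponds to 1 m/s, which makes walking distances easy to reason about.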

The MilanoByNight simulation

In this simulation, we considered a typical deployment scenario for a friend-finder service: a large number of young people using the service on a weekend night in a large city like Milan. We performed an in-depth study of the parameters characterizing this scenario, using different sources, including online surveys.

All probabilities related to agents' choices are modeled with probability distributions. For this specific data generation, some of the important parameters of the simulation are:

Source and destination. These are the locations essential to define movements. They may be homes or entertainment places. Some places in some districts are more popular than others.

StartingTime. The time at which a user leaves her home to go to the first entertainment place.

Permanence. How long a user stays at one entertainment place.

NumPlaces. How many entertainment places a user visits in one night.
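The sketch below shows one way an agent's night could be drawn from these parameters. The distributions and their numeric values (weights, means, ranges) are placeholders invented for this example; the distributions actually used in the simulation are derived from the survey data and are not reproduced here. `NightPlan` and `sample_plan` are hypothetical names.

```python
import random
from dataclasses import dataclass


@dataclass
class NightPlan:
    starting_time_min: int   # minutes after 7pm at which the agent leaves home
    permanences_min: list    # stay duration (minutes) at each place visited


def sample_plan(rng):
    """Draw StartingTime, NumPlaces and one Permanence per place.

    All distributions below are assumed placeholders.
    """
    # NumPlaces: discrete distribution over 1, 2 or 3 places (assumed weights).
    num_places = rng.choices([1, 2, 3], weights=[0.5, 0.35, 0.15])[0]
    # StartingTime: uniform over the first three hours (assumed).
    starting_time = rng.randint(0, 180)
    # Permanence: Gaussian around one hour, at least 10 minutes (assumed).
    permanences = [max(10, round(rng.gauss(60, 20)))
                   for _ in range(num_places)]
    return NightPlan(starting_time, permanences)


plan = sample_plan(random.Random(1))
```

Source and destination would be drawn in a similar way, with a weight per place reflecting how popular it is.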

In order to have a realistic model of these distributions, we prepared a survey to collect real user data. We are still collecting data, but the parameters used in the simulation are based on interviews with more than 300 people in our target category.

Available datasets:

DataSet 1

Number of agents: 100,000

Simulation duration: 6 hours (from 7pm to 1am)

Interval between two consecutive instants: 2 minutes
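The size of the dataset can be estimated from these figures. The short calculation below assumes that one position is recorded per agent at every instant; this is an assumption about the file layout, not something stated above.

```python
NUM_AGENTS = 100_000
DURATION_HOURS = 6     # from 7pm to 1am
INTERVAL_MINUTES = 2   # time between two consecutive instants

# Number of 2-minute intervals in the simulated period.
num_intervals = DURATION_HOURS * 60 // INTERVAL_MINUTES

# Assumption: one position per agent per interval.
num_positions = NUM_AGENTS * num_intervals
```

This gives 180 intervals and 18,000,000 positions. If both the first and the last instant are recorded, there are 181 instants rather than 180.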

How to obtain the datasets

Access to the datasets is free. To obtain them, send a request; you will be asked to fill in a form, after which you will receive a link to the datasets.