Previous Competitions

This competition inaugurates an ongoing series, starting in 2019, in which the extent of the data and the analysis challenges will be extended every year.

This year’s inaugural competition can be viewed as related to:

Traffic forecasting: Our challenge stands in a long tradition of work analyzing data and predicting aspects related to traffic [7]. The competition that we now propose stands out, first, in terms of the scale of the data provided, which matches the increased complexity of the predictive task. In addition, the challenge for the first time uses a novel format for representing traffic data that naturally preserves the privacy of individual traffic participants.

Video frame prediction: Technically, we can present traffic data over time as a movie, and traffic forecasts can thus be couched as frame prediction. Despite notable activities in this area, often deploying variational autoencoders (VAE) [30, 32, 16] or generative adversarial networks (GAN) [23, 29, 18], predicting future frames of videos remains a challenging problem due to the complexity and motion dynamics of natural scenes [18], especially for longer prediction horizons [16]. Frame sequences that represent an underlying geographical process, rather than natural objects moving in common video scenes, therefore present an additional exciting challenge to established methods for video frame prediction in this competition.
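As an illustration of how a traffic movie can be turned into a frame prediction task, the following minimal sketch splits a time-ordered sequence of frames into pairs of input and target windows. The function name `make_frame_pairs` and the window lengths `n_in` and `n_out` are hypothetical choices made for this example only; they are not the competition's specification.

```python
def make_frame_pairs(frames, n_in=3, n_out=1):
    """Split a time-ordered sequence of frames into (inputs, targets) windows.

    frames: list of frames in temporal order (each frame can be any object,
            e.g. a 2D grid of traffic values).
    n_in:   number of past frames given to a model (assumed value).
    n_out:  number of future frames to predict (assumed value).
    """
    pairs = []
    # Slide a window of n_in + n_out frames over the sequence, one step at a time.
    for start in range(len(frames) - n_in - n_out + 1):
        inputs = frames[start:start + n_in]
        targets = frames[start + n_in:start + n_in + n_out]
        pairs.append((inputs, targets))
    return pairs
```

Any frame prediction model can then be trained on the resulting pairs; larger values of `n_out` correspond to the longer prediction horizons noted above as particularly difficult.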

Let us consider this in the context of prior work, including related earlier competitions:

In addition, there have been competitions in which GPS-trajectory data have been used for prediction tasks. These, however, focused less on modelling traffic states and more on the prediction of journey destinations, travel times (‘ETA’), or taxi fares. Furthermore, a range of public GPS data sets is available, such as taxi trajectories collected in Rome, San Francisco, or Shanghai, and Microsoft’s GeoLife data set. In general, however, these data sets are collected from only a few hundred probe vehicles in a single metropolitan area, and typically cover only a few weeks or months. The GeoLife data set, for its part, follows only a small set of 182 individuals.

Our competition is thus innovative on several levels:

The scale of the data provided: In contrast to previous competitions and published data sets, we provide

Large-scale data that covers multiple full cities instead of individual road segments or single cities,

Real-world data reflecting actual observations collected by a large fleet of probe vehicles, rather than synthetic simulations

More densely sampled data through larger fleets, giving a better estimate of traffic properties throughout town

Multi-level periodic or seasonal effects, from intra-day and intra-week to longer term changes, e.g., summer vs. winter

This gives a uniquely comprehensive longitudinal view of traffic states and their evolution over shorter and longer time scales, across multiple metropolitan areas with markedly different cultural and social backgrounds that will be reflected in the prevailing traffic patterns. Overall, the data that we will share with the scientific community is based on an unprecedented number of over 100 billion (10^11) probe points.

The ambition and complexity of the prediction challenge: The competition task is to predict not only one but multiple attributes of traffic state simultaneously, specifically speed, volume, and heading. Moreover, cells are not only characterized by attribute averages but summarize data distributions in detail, while still preserving the privacy of individual traffic participants.

The detailed and privacy-preserving representation and encoding of traffic data: Introducing a novel approach to modelling traffic states, we provide traffic data in an aggregated, privacy-preserving form that was compiled from individual real-world high-resolution GPS trajectories. Recent societal and legal developments, as reflected for instance in the EU General Data Protection Regulation (GDPR), will increase the demand for analytic methods able to extract information from such aggregate data rather than always requiring precise and therefore highly sensitive movement data of private individuals.

In summary, while building on a tradition of traffic modelling, this proposal clearly goes above and beyond the state of the art in providing a much richer data basis. This both allows and requires more complex analyses of the high-resolution yet aggregated, privacy-preserving data for the modelling and prediction of traffic states and their evolution at different time scales and at multiple locations around the globe.

We challenge the community to explore and engage in a novel way of traffic forecasting, aggregating high-resolution trajectories in map grid cells and time bins that preserve much detail in characterizing traffic states while preserving individual privacy.
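To make the idea of aggregating trajectories in map grid cells and time bins concrete, the sketch below groups raw probe points by cell and time bin and keeps only per-cell summaries. It is a simplified illustration under our own assumptions, not the actual encoding used for the competition data: the function name `aggregate_probes`, the tuple layout of a probe point, the cell size `cell_deg`, and the bin length `bin_seconds` are all hypothetical, and only volume and mean speed are computed.

```python
import math
from collections import defaultdict
from statistics import mean


def aggregate_probes(probes, cell_deg=0.01, bin_seconds=300):
    """Aggregate probe points into (time bin, row, col) cells.

    probes: iterable of (timestamp_s, lat, lon, speed_kmh) tuples
            (assumed layout; no vehicle identifier is needed or kept).
    cell_deg, bin_seconds: hypothetical cell size and time bin length.
    Returns {(t_bin, row, col): {"volume": count, "mean_speed": average}}.
    """
    speeds = defaultdict(list)
    for ts, lat, lon, speed in probes:
        key = (
            math.floor(ts / bin_seconds),  # index of the time bin
            math.floor(lat / cell_deg),    # grid row
            math.floor(lon / cell_deg),    # grid column
        )
        speeds[key].append(speed)
    # Only per-cell summaries leave this function, not individual points.
    return {
        key: {"volume": len(vals), "mean_speed": mean(vals)}
        for key, vals in speeds.items()
    }
```

Because only counts and averages per cell are returned, individual trajectories cannot be read directly from the output. A richer summary, such as a histogram of speeds or headings per cell, could be substituted for the mean in the same way.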