Post-doc position in Stream Processing for Maritime Traffic Surveillance

Building and maintaining an up-to-date report of maritime traffic, in a zone of interest or at the global scale, is today an important strategic asset for the surveillance of maritime territories. This is especially true for a country such as France, whose maritime area is the second largest by surface area, in a context of growing traffic worldwide. These reports are crucial for border surveillance, intrusion detection, the detection of illegal fishing activities, and the detection of ongoing environmental crimes.

Due to the scale of the areas to consider and the rapidly increasing traffic density, keeping this monitoring efficient requires automating, at least partially, the detection of illegal activities and, more generally, of any unusual activity, so as to help operators raise alerts and trigger the associated procedures. The recent development of various satellite systems and of the AIS (Automatic Identification System) technology represents a significant step towards gathering information on vessel traffic at the global scale. The AIS streams provide tens of millions of vessel positions each day and are growing in volume. By contrast, the computing resources available to the analysts of maritime traffic monitoring centers allow them to process only a very limited part of these streams.

Mission:

The SESAME project, started in 2017, is a French research project funded by the Direction Générale de l’Armement and scientifically coordinated by the Institut Mines Telecom Atlantique. The IRISA Lab is one of the academic partners involved in the project. The main objective of SESAME is to develop new tools for the real-time monitoring and analysis of traffic. These tools are meant to make the monitoring process more efficient by combining several data sources (from both satellites and the AIS streams) and by being ported to virtualized computing platforms.

The stream processing engines proposed in recent years (such as Storm, Spark or Flink) [1] provide a general substrate for the real-time analysis, consolidation, and visualization of a rapid stream of data. They make such processing easier to program by offering a programming model in which the workflow to be applied to each incoming data element is described, commonly as a graph of operators. They also take care of deploying this workflow over a pool of computing resources, such as a cluster of processors.
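As an illustration of the "graph of operators" programming model, here is a minimal sketch in plain Python. It does not use the API of Storm, Spark or Flink: the operator names (keep_valid, tag_moving), the record fields (lat, lon, speed_knots) and the 0.5 knot threshold are assumptions made for the example, and the graph is reduced to a simple linear chain.

```python
from typing import Callable, Iterable, Iterator

Record = dict  # one incoming data element, e.g. a decoded position report
Operator = Callable[[Iterable[Record]], Iterator[Record]]


def keep_valid(stream: Iterable[Record]) -> Iterator[Record]:
    """Filter operator: drop records whose coordinates are out of range."""
    for record in stream:
        if -90 <= record["lat"] <= 90 and -180 <= record["lon"] <= 180:
            yield record


def tag_moving(stream: Iterable[Record]) -> Iterator[Record]:
    """Map operator: annotate each record with a hypothetical 'moving' flag."""
    for record in stream:
        yield {**record, "moving": record["speed_knots"] > 0.5}


def compose(*operators: Operator) -> Operator:
    """Chain operators so that the output of one feeds the next."""
    def run(stream: Iterable[Record]) -> Iterator[Record]:
        for operator in operators:
            stream = operator(stream)
        return iter(stream)
    return run


# The workflow is declared once, then applied lazily to each incoming element.
workflow = compose(keep_valid, tag_moving)

incoming = [
    {"mmsi": 1, "lat": 48.4, "lon": -4.5, "speed_knots": 12.0},
    {"mmsi": 2, "lat": 123.0, "lon": 0.0, "speed_knots": 3.0},  # invalid latitude
    {"mmsi": 3, "lat": 43.3, "lon": 5.3, "speed_knots": 0.0},
]
results = list(workflow(incoming))
```

A real engine would additionally distribute each operator over the available computing resources; the sketch only shows how the workflow is described separately from the data it processes.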

In the context of the SESAME project, the goal is to develop an architecture for data stream processing in a large-scale environment, so as to generate knowledge out of raw AIS data, under the following constraints:

The AIS stream is not homogeneous. Different asynchronous AIS streams gathered from different points of the globe have to be cleaned, merged and reordered in real time.

The velocity of the AIS streams varies over time, which may require dynamically recalibrating the underlying processing platform.

Different types of processing have to be considered. On the one hand, near real-time processing is needed to detect ongoing abnormal, dangerous, or illegal activity, for instance in coastal areas. On the other hand, batch processing may be required to keep an up-to-date view of the recent maritime activity in a given region.
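To make the first constraint above more concrete, the following sketch shows one possible way to merge and reorder asynchronous AIS feeds: a buffer releases messages in timestamp order once they are older than a bounded lateness. This is an illustration, not the project's design. The AisMessage fields, the max_delay parameter and the duplicate rule (same vessel and same timestamp) are assumptions introduced for the example, and the input is assumed to be the interleaved arrival of all feeds.

```python
import heapq
from dataclasses import dataclass, field
from typing import Iterable, Iterator


@dataclass(order=True)
class AisMessage:
    timestamp: float                  # emission time, in seconds
    mmsi: int = field(compare=False)  # vessel identifier
    lat: float = field(compare=False)
    lon: float = field(compare=False)


def reorder(messages: Iterable[AisMessage], max_delay: float) -> Iterator[AisMessage]:
    """Yield messages in timestamp order, assuming bounded lateness.

    A message is released once it is at least max_delay older than the
    newest timestamp seen so far. Messages arriving later than that, and
    duplicates of a message still waiting in the buffer, are dropped.
    """
    buffer: list[AisMessage] = []            # min-heap ordered by timestamp
    pending: set[tuple[int, float]] = set()  # keys of buffered messages
    newest = float("-inf")
    watermark = float("-inf")                # everything up to here is released
    for msg in messages:
        key = (msg.mmsi, msg.timestamp)
        if msg.timestamp <= watermark or key in pending:
            continue                         # too late, or duplicate
        heapq.heappush(buffer, msg)
        pending.add(key)
        newest = max(newest, msg.timestamp)
        watermark = newest - max_delay
        while buffer and buffer[0].timestamp <= watermark:
            released = heapq.heappop(buffer)
            pending.discard((released.mmsi, released.timestamp))
            yield released
    while buffer:                            # end of input: flush what remains
        yield heapq.heappop(buffer)


arrivals = [
    AisMessage(10, 1, 48.40, -4.50),
    AisMessage(12, 2, 43.30, 5.30),
    AisMessage(11, 1, 48.41, -4.51),
    AisMessage(11, 1, 48.41, -4.51),  # same report received by a second station
    AisMessage(20, 3, 50.90, 1.60),
    AisMessage(5, 1, 48.39, -4.49),   # arrives too late and is dropped
    AisMessage(19, 2, 43.31, 5.31),
]
ordered = list(reorder(arrivals, max_delay=5))
```

The choice of max_delay is a trade-off: a larger value tolerates slower feeds at the cost of added latency and memory, which ties in with the need to recalibrate the platform as the stream velocity changes.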

The targeted knowledge includes the formal identification of the navigational history of each vessel (the ports visited and a synthesis of its movements over time). More generally, and in the longer term, a model of the traffic should be generated and kept up to date by the platform, so that abnormal traffic can be detected more easily.

The envisioned tools to build such a platform are Stream Processing Systems [1], virtualized infrastructures and their scheduling tools [2,3], and the Grid’5000 platform [4] as the main test-bed to deploy and test the developments.