Application and additional details

Description

The field of complex data analysis (data science) is interested in the extraction of suitable indicators used for dimension reduction, data comparison or classification. Initially based on application-dependent, physics-based descriptors or features, novel methods employ more generic and potentially multiscale descriptors, that can be used for machine learning or classification. Examples are to be found in SIFT-like (scale-invariant feature transform) techniques (ORB, SURF), in unsupervised or deep learning. The present internship focuses on the framework of scattering transform (S. Mallat et al.) and the associated classification techniques. It yields signal, image or graph representations with invariance properties relative to data-modifying transformations: translation, rotation, scale… Its performances have been well-studied for classical data (audio signals, image databases, handwritten digit recognition).

This internship aims at dealing with lesser studied data: identification of the closest match to a template image inside a database of underground models, extraction of suitable fingerprints from 1D spectrometric signals from complex chemical compounds for macroscopic chemometric property learning.

The stake in the first case resides in the different scale and nature of template and model images, the latter being sketch- or cartoon-like versions of the templates. In the second case, signals are composed of a superposition of hundreds of (positive) peaks. Their nature differs from standard information processed by scattering transforms. A focus on one of the proposed applications can be considered, depending on success or difficulties met.