Abstract

This work considers the “Learning Using Privileged Information” (LUPI) paradigm. LUPI improves classification accuracy by incorporating additional information that is available at training time but not during testing. In this contribution, the LUPI paradigm is tested on a Wide Area Motion Imagery (WAMI) dataset and on images from the Caltech 101 dataset. In both cases a consistent improvement in classification accuracy is observed. The results are discussed and directions for future research are outlined.

Introduction

Data fusion (Mitchell, 2012; Hall & Llinas, 2001) refers to combining several sources of data in order to improve the quality of the extracted information. In the area of Intelligence, Surveillance, and Reconnaissance (ISR), data fusion is the mechanism for facilitating situational awareness by human analysts receiving inputs from multiple sensing platforms. The data fusion community uses standard fusion models, one of which is the Joint Directors of Laboratories (JDL) model (White, 1991), and another is Endsley’s situational awareness model (Endsley, 1995). These models have undergone multiple revisions, but their main principles remain the same. The data from multiple sources go through several levels that combine information into concepts of increasing complexity. The first level (Level 0 in the JDL model) aligns individual data elements in time and space. The next level (JDL Level 1) extracts objects from the data. The following levels (JDL Levels 2 and 3) deal with assessing the situation, which is a collection of objects and the relationships between them, and with projecting the situation into the future in order to estimate possible impacts and make decisions based on the predictions.

The data fusion pipeline does not specify whether processing is performed automatically or by human operators. In the current state of the art, only Level 0 and some of Level 1 processing can be effectively automated. Automatic detection and classification of objects (Level 1 of the JDL model) in sensor data, such as Electro-Optical (EO) imagery, Ground Moving Target Indicator (GMTI) radar, or Synthetic Aperture Radar (SAR) imagery, is still a challenging task, and the high false alarm rates of existing systems prevent full automation. Without reliable object classification, the automation of higher levels of the situational awareness model is problematic (Ilin & Perlovsky, 2011). At the same time, ISR analysts are overwhelmed by the amount of data (Porche, Wilson, Johnson, Tierney, & Saltzman, 2014), and the need for reliable automatic processing is only going to increase in the future.

This work makes a contribution to automatic object classification and is motivated by the ISR data fusion challenges. We focus on the processing of EO imagery. Nevertheless, the described methodologies are generic and applicable in many other fields. The focus of this work is on improved classification accuracy, which is a machine learning problem. Since the reason for the improved accuracy lies in the utilization of additional information, possibly coming from multiple sensors, this work falls under the data fusion paradigm.

Standard methodologies for classification of objects in optical imagery utilize the latest advances in the field of machine learning. Machine learning (Bishop, 2007) is a set of data exploitation techniques based on the idea of learning. In particular, the supervised learning paradigm consists of two-stage processing. In the training stage, computer algorithms process labelled data, referred to as the training dataset. The labels allow the algorithms to learn the structure of the dataset. In the testing stage, the algorithms, also called classifiers, assign labels to previously unseen data, referred to as the testing dataset. The regular operational ISR environment is the testing stage for classifiers. Training is usually done offline on previously collected data.
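
The two-stage structure described above can be illustrated with a minimal sketch. The nearest-centroid classifier used here is chosen only for brevity and is not the method studied in this work; the function names `train` and `classify`, the two-dimensional feature vectors, and the labels "background" and "vehicle" are all hypothetical.

```python
from collections import defaultdict


def train(samples, labels):
    """Training stage: learn one mean feature vector (centroid) per label."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def classify(model, x):
    """Testing stage: assign the label whose centroid is closest to x."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, x))
    return min(model, key=lambda y: sq_dist(model[y]))


# Hypothetical labelled training dataset (processed offline).
train_x = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
train_y = ["background", "background", "vehicle", "vehicle"]
model = train(train_x, train_y)

# Hypothetical previously unseen testing dataset (operational use).
test_x = [[0.1, 0.2], [0.8, 1.0]]
predictions = [classify(model, x) for x in test_x]
print(predictions)
```

In this sketch the labels are consumed only by `train`; `classify` sees feature vectors alone, which mirrors the separation between offline training and operational testing.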