Open-Box Training of Kernel Support Vector Machines: Opportunities and Limitations

Abstract

Kernel Support Vector Machines (SVMs) are widely used for supervised classification and have achieved state-of-the-art performance in numerous applications. We aim to further increase their efficacy by allowing a human operator to steer their training process. To this end, we identify several possible strategies for meaningful human intervention in their training, propose a corresponding visual analytics workflow, and implement it in a prototype system. Initial results from two users, on data from three different domains, suggest that, in addition to facilitating better insight into the data and into the classifier’s decision process, visual analytics can increase the efficacy of Support Vector Machines when the data available for training has a low number of samples, is unbalanced with respect to the different classes, or contains outliers, irrelevant features, or mislabeled samples. However, we also discuss some limitations of improving the efficacy of supervised classification with visual analytics.

Official version at https://doi.org/10.2312/vmv.20191319

Bibtex

@INPROCEEDINGS{Khatami:VMV2019,
author = {Khatami, Mohammad and Schultz, Thomas},
pages = {63--72},
title = {Open-Box Training of Kernel Support Vector Machines: Opportunities and Limitations},
booktitle = {Vision, Modeling {\&} Visualization},
year = {2019},
abstract = {Kernel Support Vector Machines (SVMs) are widely used for supervised classification, and have
achieved state-of-the-art performance in numerous applications. We aim to further increase their
efficacy by allowing a human operator to steer their training process. To this end, we identify
several possible strategies for meaningful human intervention in their training, propose a
corresponding visual analytics workflow, and implement it in a prototype system. Initial results
from two users, on data from three different domains, suggest that, in addition to facilitating
better insight into the data and into the classifier's decision process, visual analytics can
increase the efficacy of Support Vector Machines when the data available for training has a low
number of samples, is unbalanced with respect to the different classes, or contains outliers,
irrelevant features, or mislabeled samples. However, we also discuss some limitations of improving
the efficacy of supervised classification with visual analytics.}
}