Training Invariant Support Vector Machines using Selective Sampling

Abstract:
Bordes et al. (2005)
describe the efficient online LASVM algorithm using
selective sampling. On the other hand, Loosli et al. (2005) propose a
strategy for handling invariance in SVMs, also using selective sampling.
This paper combines the two approaches to build a very large SVM.
We present state-of-the-art results obtained on a handwritten
digit recognition problem with 8 million examples on a single processor.
This work also demonstrates that online SVMs can effectively
handle really large databases.

Implementation Details

In response to various inquiries regarding the experimental setup:

All experiments were carried out on a dual Opteron machine
running at 2.4GHz and equipped with 16GB of main memory.
The cache sizes were chosen to ensure that any experiment
would fit in 8GB, allowing us to run two simultaneous
experiments on this computer. The memory usage consists of roughly 700MB of data
to generate the training examples on-the-fly (MNIST digits, Lie derivatives,
precomputed random vector fields), 500MB to cache transformed digits,
and 6.5GB of kernel cache.
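
The budget above can be checked with a few lines of arithmetic. The sketch below only restates the figures quoted in the text. Counting 8GB as 8000MB and 16GB as 16000MB is an assumption made for simplicity; binary units would only add headroom.

```python
# Memory budget of one experiment, in MB, as quoted in the text.
budget_mb = {
    "example generation data": 700,   # MNIST digits, Lie derivatives, vector fields
    "transformed digit cache": 500,
    "kernel cache": 6500,
}

total_mb = sum(budget_mb.values())
limit_mb = 8000  # assumption: 8GB counted as 8000MB (decimal units)

print(f"total: {total_mb} MB, headroom: {limit_mb - total_mb} MB")

# Two experiments of this size must fit together in the 16GB machine.
fits_two = 2 * total_mb <= 16000
```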

The LASVM algorithm is implemented by reusing a few files from the distributed
LASVM source code (messages.c, kcache.c, and lasvm.c).
Documentation for these files is provided in the corresponding header files.
On-the-fly generation and caching of the training examples were implemented
inside a highly optimized kernel function (undocumented).
This kernel function is simply passed to the kernel cache constructor
'lasvm_kcache_create()'. The glue code was written in Lush using the
standard LASVM bindings.
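
The idea of a kernel function that generates and caches its own examples can be illustrated with a minimal sketch. This is not the undocumented kernel function used in the experiments, which was written for the C kernel cache. The class name, the RBF kernel, the least-recently-used eviction policy, and the seeded perturbation standing in for the real deformation are all assumptions made for illustration.

```python
import math
import random
from collections import OrderedDict


class OnTheFlyKernel:
    """Hypothetical kernel callback that builds examples on demand and caches them."""

    def __init__(self, base_examples, cache_size, gamma=0.5):
        self.base = base_examples      # original, untransformed vectors
        self.cache_size = cache_size   # maximum number of transformed examples kept
        self.gamma = gamma             # RBF width, arbitrary value for this sketch
        self.cache = OrderedDict()     # index -> transformed vector, in LRU order
        self.generated = 0             # how many times an example was (re)built

    def _generate(self, index):
        # Stand-in for the real deformation: a small perturbation seeded by the
        # index, so the same index always yields the same example.
        rng = random.Random(index)
        source = self.base[index % len(self.base)]
        self.generated += 1
        return [x + rng.uniform(-0.1, 0.1) for x in source]

    def _example(self, index):
        if index in self.cache:
            self.cache.move_to_end(index)    # mark as most recently used
            return self.cache[index]
        vector = self._generate(index)
        self.cache[index] = vector
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)   # evict the least recently used
        return vector

    def __call__(self, i, j):
        xi, xj = self._example(i), self._example(j)
        dist2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
        return math.exp(-self.gamma * dist2)


kernel = OnTheFlyKernel([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]], cache_size=2)
k_first = kernel(0, 4)   # generates examples 0 and 4
k_again = kernel(0, 4)   # both examples are served from the cache
k_other = kernel(7, 4)   # generates example 7, evicting example 0
```

Because each example is a deterministic function of its index, an evicted example can be rebuilt later and still produce the same kernel values, so the solver only ever needs to refer to examples by index.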

Datasets

The datasets used for these experiments were generated on the fly
by performing careful elastic deformation of the
original MNIST training set.
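
The basic operation behind such deformations is warping an image along a displacement field. The sketch below shows that operation only and is not the code used for the experiments. The function names, the bilinear sampling, the zero padding outside the image, and the unsmoothed random field are assumptions; a real elastic deformation would use smooth fields.

```python
import math
import random


def sample(image, y, x):
    """Bilinear sample at fractional (y, x); pixels outside the image read as 0."""
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    fy, fx = y - y0, x - x0

    def pixel(r, c):
        return image[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    top = (1 - fx) * pixel(y0, x0) + fx * pixel(y0, x0 + 1)
    bottom = (1 - fx) * pixel(y0 + 1, x0) + fx * pixel(y0 + 1, x0 + 1)
    return (1 - fy) * top + fy * bottom


def deform(image, field, strength):
    """Warp image by reading each output pixel from a displaced source position."""
    h, w = len(image), len(image[0])
    return [[sample(image,
                    r + strength * field[r][c][0],
                    c + strength * field[r][c][1])
             for c in range(w)] for r in range(h)]


def random_field(h, w, seed):
    """Hypothetical stand-in for a precomputed random vector field (unsmoothed)."""
    rng = random.Random(seed)
    return [[(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(w)]
            for _ in range(h)]


image = [[0.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 1.0, 0.0],
         [0.0, 1.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 0.0]]
field = random_field(4, 4, seed=0)
unchanged = deform(image, field, strength=0.0)   # zero strength leaves the image as is
warped = deform(image, field, strength=0.5)      # each pixel moves by at most half a pixel
```

Seeding the field makes the deformation reproducible, which is what allows examples to be regenerated on demand instead of stored.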

We used to provide two files containing the 8100000 examples
generated for our final experiment. Unfortunately these files
were accidentally deleted in 2014 from the NEC server that used
to host them. Instead of regenerating these examples, we found
it more useful to package the code that was used to generate
them in the first place. We call this the
infinite MNIST dataset.

Note that there is no point in trying to load such large files into the
distributed LASVM program. The distributed code uses a kernel representation
that was designed to perform like LIBSVM and is completely unsuitable
for this purpose. See the implementation details above.