Feature Representations

Shogun supports a wide range of feature representations. Among them are the so-called simple features (cf. CSimpleFeatures), which are standard 2-d matrices; strings (cf. CStringFeatures), which, in contrast to other meanings of the word, are just lists of vectors of arbitrary length; and sparse features (cf. CSparseFeatures), which efficiently represent sparse matrices.
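To make the three representations concrete, here is a minimal sketch in plain Python. It does not use Shogun: the variable names, the helper functions to_sparse and to_dense, and the one-list-per-example layout are assumptions made only for illustration.

```python
# Plain-Python stand-ins for the three basic representations.
# These are NOT Shogun objects; all names here are hypothetical.

# Simple (dense) features: a standard 2-d matrix.
# Assumed layout for this sketch: one inner list per example.
dense = [
    [1.0, 0.0, 2.0],
    [0.0, 0.0, 3.0],
]

# String features: a list of vectors whose lengths may differ.
strings = [
    [0, 1, 2, 3],
    [2, 2],
    [3, 1, 0, 0, 1],
]


def to_sparse(matrix):
    """Keep only the non-zero entries of each row as {feature_index: value}."""
    return [{j: v for j, v in enumerate(row) if v != 0.0} for row in matrix]


def to_dense(sparse_rows, num_features):
    """Expand {feature_index: value} rows back into full rows."""
    return [[row.get(j, 0.0) for j in range(num_features)] for row in sparse_rows]


# Sparse features: the same matrix as `dense`, storing non-zeros only.
sparse = to_sparse(dense)
```

The sparse form pays off when most entries are zero, since storage and dot products then scale with the number of non-zero entries rather than with the full dimensionality.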

Each of the following feature objects supports any of the standard types from bool to floats:

Simple Features (CSimpleFeatures)

Strings (CStringFeatures)

Sparse Features (CSparseFeatures)

Supported Types

bool

8bit char

8bit Byte

16bit Integer

16bit Word

32bit Integer

32bit Unsigned Integer

32bit Float matrix

64bit Float matrix

96bit Float matrix
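As a rough illustration of what the listed widths mean, the sketch below maps the types to format codes of Python's struct module. The mapping is an assumption for this example and is not taken from Shogun. The 96bit float has no struct counterpart and is left out.

```python
import struct

# Assumed mapping from the listed types to struct format codes.
# The "<" prefix selects standard sizes, independent of the platform.
FORMATS = {
    "bool": "<?",
    "8bit char": "<b",
    "8bit Byte": "<B",
    "16bit Integer": "<h",
    "16bit Word": "<H",
    "32bit Integer": "<i",
    "32bit Unsigned Integer": "<I",
    "32bit Float": "<f",
    "64bit Float": "<d",
}


def bits(name):
    """Storage width in bits of one value of the named type."""
    return 8 * struct.calcsize(FORMATS[name])


# The same two bytes read differently as an unsigned word and a signed integer.
packed = struct.pack("<H", 65535)
as_word = struct.unpack("<H", packed)[0]
as_integer = struct.unpack("<h", packed)[0]
```

Choosing the narrowest type that fits the data matters for large feature matrices, because the memory footprint grows linearly with the width of each element.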

Many other feature types are available. Some of them are based on the three basic feature types above, like CTOPFeatures (TOP Kernel features from CHMM), CFKFeatures (Fisher Kernel features from CHMM) and CRealFileFeatures (vectors fetched from a binary file). It should be noted that all feature objects are derived from CFeatures. More complex feature types include:

CAttributeFeatures - Features of attribute value pairs.

CCombinedDotFeatures - Features that allow stacking of dot features.

CCombinedFeatures - Features that allow stacking of arbitrary features.

CDotFeatures - Features that support a certain set of operations (like multiplication with a scalar + addition to a dense vector). Examples are sparse and dense features.

CDummyFeatures - Features without content; only the number of vectors is known.
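The idea behind dot features and their stacking can be sketched in plain Python as follows. This is not Shogun's API: the class names DenseDot, SparseDot and CombinedDot and the methods dot_dense and add_to_dense are hypothetical. They only illustrate the two operations mentioned above (a dot product with a dense vector, and adding a scaled example to a dense vector) and how stacked features can delegate both to their parts.

```python
class DenseDot:
    """Dense examples, one list of values per example (hypothetical class)."""

    def __init__(self, rows):
        self.rows = rows
        self.dim = len(rows[0])

    def dot_dense(self, i, w):
        # Dot product of example i with the dense vector w.
        return sum(x * wj for x, wj in zip(self.rows[i], w))

    def add_to_dense(self, alpha, i, w):
        # In place: w += alpha * example_i
        for j, x in enumerate(self.rows[i]):
            w[j] += alpha * x


class SparseDot:
    """Sparse examples stored as {index: value} (hypothetical class)."""

    def __init__(self, rows, dim):
        self.rows = rows
        self.dim = dim

    def dot_dense(self, i, w):
        # Only the non-zero entries contribute.
        return sum(v * w[j] for j, v in self.rows[i].items())

    def add_to_dense(self, alpha, i, w):
        for j, v in self.rows[i].items():
            w[j] += alpha * v


class CombinedDot:
    """Stacks several dot-feature objects into one longer feature vector."""

    def __init__(self, parts):
        self.parts = parts
        self.dim = sum(p.dim for p in parts)

    def dot_dense(self, i, w):
        total, offset = 0.0, 0
        for p in self.parts:
            # Each part sees only its own slice of w.
            total += p.dot_dense(i, w[offset:offset + p.dim])
            offset += p.dim
        return total

    def add_to_dense(self, alpha, i, w):
        offset = 0
        for p in self.parts:
            block = w[offset:offset + p.dim]
            p.add_to_dense(alpha, i, block)
            w[offset:offset + p.dim] = block  # write the updated slice back
            offset += p.dim


dense_part = DenseDot([[1.0, 2.0], [0.0, 1.0]])
sparse_part = SparseDot([{2: 4.0}, {0: 1.0}], dim=3)
combined = CombinedDot([dense_part, sparse_part])
```

A linear method that is written only against these two operations works unchanged on dense, sparse, or stacked inputs, which is the point of such an interface.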

Classifiers

A multitude of classifiers are implemented in Shogun. Among them are several standard 2-class, 1-class and multi-class classifiers. Several of them are linear classifiers and SVMs. Among the fastest linear SVM classifiers are CSGD, CSVMOcas and CLibLinear (capable of dealing with millions of examples and features).

Linear Classifiers

CPerceptron - standard online perceptron

CLDA - Fisher's linear discriminant

CLPM - linear programming machine (1-norm regularized SVM)

CLPBoost - linear programming machine using boosting on the features

CSVMPerf - a linear SVM with L2-regularized bias

CLibLinear - a linear SVM with L2-regularized bias

CSVMLin - a linear SVM with L2-regularized bias

CSVMOcas - a linear SVM with L2-regularized bias

CSubgradientSVM - SVM based on steepest subgradient descent

CSubgradientLPM - LPM based on steepest subgradient descent
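As an illustration of the simplest entry in this list, the following is a minimal sketch of a standard online perceptron in plain Python. It is not Shogun's CPerceptron implementation: the function names, the learn_rate and max_iter parameters, and the toy data are assumptions chosen for this example.

```python
def train_perceptron(examples, labels, learn_rate=1.0, max_iter=100):
    """Online perceptron for labels in {-1, +1}; returns (weights, bias)."""
    w = [0.0] * len(examples[0])
    b = 0.0
    for _ in range(max_iter):
        mistakes = 0
        for x, y in zip(examples, labels):
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            if y * score <= 0:
                # Misclassified (or on the boundary): move w and b towards y * x.
                for j, xj in enumerate(x):
                    w[j] += learn_rate * y * xj
                b += learn_rate * y
                mistakes += 1
        if mistakes == 0:
            # A full pass without errors: the training data is separated.
            break
    return w, b


def predict(w, b, x):
    """Sign of the linear decision function w . x + b."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1


# Toy, linearly separable 2-class problem (made up for this sketch).
X = [[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]]
y = [1, 1, -1, -1]
w, b = train_perceptron(X, y)
predictions = [predict(w, b, x) for x in X]
```

All linear classifiers in the list share the decision function w . x + b used in predict; they differ in the objective and in the optimization method used to find w and b.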

Support Vector Machines

CSVMLight - A variant of SVMlight using pr_loqo as its internal solver.