About:
We study robust feature extraction based on L21-regularized correntropy from both a theoretical and an algorithmic perspective. On the theoretical side, we point out that L21-norm minimization can be justified from the viewpoint of half-quadratic (HQ) optimization, which facilitates convergence analysis and algorithm development. In particular, we propose a general formulation that unifies L1-norm and L21-norm minimization within a common framework. On the algorithmic side, we propose an L21-regularized correntropy algorithm that extracts informative features while removing outliers from the training data. A new alternating minimization algorithm is also developed to optimize the non-convex correntropy objective. For face recognition, we apply the proposed method to obtain an appearance-based model called Sparse-Fisherfaces. Extensive experiments show that our method selects robust and sparse features and outperforms several state-of-the-art subspace methods on large-scale and open face recognition datasets.
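
To make the L21-norm and its half-quadratic reading concrete, here is a minimal sketch in plain Python. It relies on the standard identity ||w||_2 = min over p > 0 of (p * ||w||_2^2 + 1/(4p)), whose minimizer is p = 1/(2 * ||w||_2). The function names (`l21_norm`, `hq_row_weights`, `hq_surrogate`) and the `eps` safeguard are illustrative assumptions, not taken from the paper, and the paper's exact formulation may differ.

```python
import math


def row_norm(row):
    """Euclidean (L2) norm of one row."""
    return math.sqrt(sum(v * v for v in row))


def l21_norm(matrix):
    """L21-norm: the sum of the L2 norms of the rows of a matrix."""
    return sum(row_norm(row) for row in matrix)


def hq_row_weights(matrix, eps=1e-8):
    """Auxiliary weights p_i = 1 / (2 * ||row_i||_2).

    eps is a hypothetical safeguard against division by zero for all-zero rows.
    """
    return [1.0 / (2.0 * math.sqrt(sum(v * v for v in row) + eps)) for row in matrix]


def hq_surrogate(matrix, weights):
    """Quadratic surrogate sum_i (p_i * ||row_i||^2 + 1 / (4 * p_i))."""
    return sum(p * row_norm(row) ** 2 + 1.0 / (4.0 * p)
               for row, p in zip(matrix, weights))


W = [[3.0, 4.0], [1.0, 0.0]]
p = hq_row_weights(W)
print(l21_norm(W), hq_surrogate(W, p))
```

With the weights fixed, the surrogate is quadratic in the matrix entries, which is what makes alternating updates convenient. For a matrix with a single column, each row norm is an absolute value, so the L21-norm reduces to the L1-norm.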

About:
Oboe is a software tool for Chinese syntactic parsing. It can display syntactic trees graphically in two representations: phrase-structure trees and dependency trees. This makes it helpful for NLP researchers, especially those working on syntax-based methods.

About:
Locally Weighted Projection Regression (LWPR) is a recent algorithm that achieves nonlinear function approximation in high dimensional spaces with redundant and irrelevant input dimensions. At its [...]

About:
In this paper, we propose an improved principal component analysis based on maximum entropy (MaxEnt) preservation, called MaxEnt-PCA, which is derived from a Parzen window estimate of Rényi's quadratic entropy. Instead of minimizing the reconstruction error based on either the L2-norm or the L1-norm, MaxEnt-PCA attempts to preserve as much as possible of the uncertainty information of the data, as measured by entropy. The optimal solution of MaxEnt-PCA consists of the eigenvectors of a Laplacian probability matrix corresponding to the MaxEnt distribution. MaxEnt-PCA (1) is rotation invariant, (2) is free from any distribution assumption, and (3) is robust to outliers. Extensive experiments on real-world datasets demonstrate the effectiveness of the proposed linear method compared with other related robust PCA methods.
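
As a rough illustration of the quantity involved, the sketch below computes a Parzen window estimate of Rényi's quadratic entropy, H2 = -log of the integral of p(x)^2, using an isotropic Gaussian kernel. The function name `renyi_quadratic_entropy` and the bandwidth parameter `sigma` are assumptions made for this example. It shows only the entropy estimate, not the MaxEnt-PCA projection itself.

```python
import math


def renyi_quadratic_entropy(samples, sigma=1.0):
    """Parzen estimate of Renyi's quadratic entropy with a Gaussian kernel.

    The integral of the squared density estimate equals the average, over all
    sample pairs, of a Gaussian with covariance 2 * sigma^2 * I evaluated at
    the pairwise difference.
    """
    n = len(samples)
    dim = len(samples[0])
    norm = (4.0 * math.pi * sigma ** 2) ** (-dim / 2.0)
    total = 0.0
    for a in samples:
        for b in samples:
            sq_dist = sum((u - v) ** 2 for u, v in zip(a, b))
            total += norm * math.exp(-sq_dist / (4.0 * sigma ** 2))
    return -math.log(total / (n * n))


tight = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
spread = [[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]]
print(renyi_quadratic_entropy(tight), renyi_quadratic_entropy(spread))
```

Widely spread samples give a larger estimate than tightly clustered ones, which matches the intuition of entropy as a measure of uncertainty.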

About:
Urheen is a toolkit for Chinese word segmentation, Chinese POS tagging, English tokenization, and English POS tagging. The Chinese word segmentation and POS tagging modules are trained on the Chinese Treebank 7.0. The English POS tagging module is trained on the WSJ English treebank (02-23).

About:
OpenPR-NBEM is a C++ implementation of the Naive Bayes classifier, a well-known generative classification algorithm for applications such as text classification. The Naive Bayes algorithm requires the probability distribution to be discrete. OpenPR-NBEM uses the multinomial event model for representation. The maximum likelihood estimate is used for supervised learning, and the expectation-maximization estimate is used for semi-supervised and unsupervised learning.
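
The supervised case of the multinomial event model can be sketched in a few lines of Python. This is not the OpenPR-NBEM API: the function names `train_multinomial_nb` and `predict`, and the additive smoothing parameter `alpha`, are assumptions for illustration (setting `alpha` to a very small value approaches the plain maximum likelihood estimate). The EM-based semi-supervised and unsupervised modes are not covered here.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Estimate log priors and per-class log word probabilities from token lists."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    n_docs = len(labels)
    log_prior = {c: math.log(k / n_docs) for c, k in class_counts.items()}
    log_lik = {}
    for c in class_counts:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_lik[c] = {w: math.log((word_counts[c][w] + alpha) / total) for w in vocab}
    return log_prior, log_lik


def predict(model, tokens):
    """Return the class with the highest posterior log score."""
    log_prior, log_lik = model
    scores = {}
    for c in log_prior:
        # Words outside the training vocabulary are skipped.
        scores[c] = log_prior[c] + sum(log_lik[c][w] for w in tokens if w in log_lik[c])
    return max(scores, key=scores.get)


docs = [["goal", "match"], ["vote", "election"], ["goal", "team"]]
labels = ["sports", "politics", "sports"]
model = train_multinomial_nb(docs, labels)
print(predict(model, ["goal"]))
```

Working in log space avoids numerical underflow when documents contain many tokens.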

About:
This class computes the LBP (local binary patterns) histogram of an input image, the LBP-TOP (local binary patterns on three orthogonal planes) histograms of an image sequence, and the histogram of the rotation-invariant VLBP (volume local binary patterns) or the uniform rotation-invariant VLBP of an image sequence.
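
For readers unfamiliar with LBP, the following is a minimal sketch of the basic 8-neighbour, radius-1 operator on a single grayscale image given as a list of rows. It is not the interface of the class described above: the function name `lbp_histogram`, the clockwise bit ordering, and the `>=` comparison are assumptions chosen for this example. LBP-TOP and VLBP, which operate on image sequences, are not shown.

```python
def lbp_histogram(image):
    """256-bin histogram of basic LBP codes over the interior pixels of an image."""
    # Neighbour offsets, clockwise starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the center.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    return hist


image = [
    [9, 0, 0, 0],
    [0, 5, 5, 0],
    [0, 5, 5, 0],
    [0, 0, 0, 0],
]
hist = lbp_histogram(image)
print(sum(hist))
```

Border pixels are skipped because they lack a full 8-pixel neighbourhood, so the histogram counts sum to the number of interior pixels.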

About:
This program implements a novel robust sparse representation method, called two-stage sparse representation (TSR), for robust recognition on large-scale databases. Following a divide-and-conquer strategy, TSR splits robust recognition into an outlier detection stage and a recognition stage. Extensive numerical experiments on several public databases demonstrate that TSR generally obtains better classification accuracy than the state-of-the-art Sparse Representation Classification (SRC). At the same time, TSR reduces the computational cost by more than fifty times compared with SRC, which makes it better suited to large-scale datasets.