Simple Exponential Family PCA

Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, PMLR 9:453-460, 2010.

Abstract

Bayesian principal component analysis (BPCA), a probabilistic reformulation of PCA with Bayesian model selection, is a systematic approach to determining the number of essential principal components (PCs) for data representation. However, it assumes that the data are Gaussian distributed and therefore cannot handle other common types of observations, such as integers and binary values. In this paper, we propose simple exponential family PCA (SePCA), a generalised family of probabilistic principal component analysers. SePCA employs exponential family distributions to handle general types of observations and, through Bayesian inference, automatically discovers the number of essential PCs. We discuss techniques for fitting the model, develop the corresponding mixture model, and demonstrate the model's effectiveness experimentally.

Related Material

@InProceedings{pmlr-v9-li10b,
title = {Simple Exponential Family PCA},
author = {Jun Li and Dacheng Tao},
booktitle = {Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics},
pages = {453--460},
year = {2010},
editor = {Yee Whye Teh and Mike Titterington},
volume = {9},
series = {Proceedings of Machine Learning Research},
address = {Chia Laguna Resort, Sardinia, Italy},
month = {13--15 May},
publisher = {PMLR},
pdf = {http://proceedings.mlr.press/v9/li10b/li10b.pdf},
url = {http://proceedings.mlr.press/v9/li10b.html},
}

%0 Conference Paper
%T Simple Exponential Family PCA
%A Jun Li
%A Dacheng Tao
%B Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2010
%E Yee Whye Teh
%E Mike Titterington
%F pmlr-v9-li10b
%I PMLR
%J Proceedings of Machine Learning Research
%P 453--460
%U http://proceedings.mlr.press
%V 9
%W PMLR