Supervised Probabilistic Principal Component Analysis

2006

Conference Paper

Principal component analysis (PCA) has been extensively applied in data mining, pattern recognition and information retrieval for unsupervised dimensionality reduction. When labels of data are available, e.g., in a classification or regression task, PCA cannot make use of this information. The problem is more interesting if only part of the input data is labeled, i.e., in a semi-supervised setting. In this paper we propose a supervised PCA model called SPPCA and a semi-supervised PCA model called S$^2$PPCA, both of which are extensions of a probabilistic PCA model. The proposed models are able to incorporate the label information into the projection phase, and can naturally handle multiple outputs (i.e., in multi-task learning problems). We derive an efficient EM learning algorithm for both models, and also provide theoretical justifications of the model behaviors. SPPCA and S$^2$PPCA are compared with other supervised projection methods on various learning tasks, and show not only promising performance but also good scalability.
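The abstract does not state the model equations, so the following is only a minimal sketch of how a supervised extension of probabilistic PCA could be fit with EM. It assumes a linear-Gaussian model in which inputs x and outputs y share one latent vector z, each with its own isotropic noise variance; that model form, the function names `sppca_em` and `sppca_project`, and all parameter names are illustrative assumptions rather than the paper's published algorithm.

```python
import numpy as np

def sppca_em(X, Y, k, n_iter=100, seed=0):
    """EM sketch for an ASSUMED shared-latent model:
    x = Wx z + mu_x + eps_x,  y = Wy z + mu_y + eps_y,  z ~ N(0, I),
    with isotropic noise variances sx2 and sy2 (hypothetical names)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    l = Y.shape[1]
    mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mu_x, Y - mu_y
    Wx = rng.normal(size=(m, k))
    Wy = rng.normal(size=(l, k))
    sx2, sy2 = 1.0, 1.0
    for _ in range(n_iter):
        # E-step: Gaussian posterior of z given both x and y.
        A = Wx.T @ Wx / sx2 + Wy.T @ Wy / sy2 + np.eye(k)
        C = np.linalg.inv(A)                        # posterior covariance (k x k)
        Z = (Xc @ Wx / sx2 + Yc @ Wy / sy2) @ C     # posterior means (n x k)
        S = n * C + Z.T @ Z                         # sum over samples of E[z z^T]
        # M-step: closed-form updates for loadings and noise variances.
        Wx = np.linalg.solve(S, Z.T @ Xc).T
        Wy = np.linalg.solve(S, Z.T @ Yc).T
        sx2 = (np.sum(Xc**2) - 2 * np.sum((Xc @ Wx) * Z)
               + np.trace(S @ Wx.T @ Wx)) / (n * m)
        sy2 = (np.sum(Yc**2) - 2 * np.sum((Yc @ Wy) * Z)
               + np.trace(S @ Wy.T @ Wy)) / (n * l)
        # Floor the variances to keep the next E-step well defined.
        sx2, sy2 = max(sx2, 1e-8), max(sy2, 1e-8)
    return {"mu_x": mu_x, "mu_y": mu_y, "Wx": Wx, "Wy": Wy,
            "sx2": sx2, "sy2": sy2}

def sppca_project(X, mu_x, Wx, sx2):
    """Project inputs without labels: posterior mean of z given x only."""
    k = Wx.shape[1]
    M = Wx.T @ Wx + sx2 * np.eye(k)
    return np.linalg.solve(M, Wx.T @ (X - mu_x).T).T
```

Under these assumptions the labels influence the learned loading matrix `Wx` during training, while projecting unlabeled test points needs only `mu_x`, `Wx` and `sx2`.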

Author(s):

Yu, S. and Yu, K. and Tresp, V. and Kriegel, H-P. and Wu, M.

Book Title:

KDD 2006

Journal:

Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2006)
