Using Low Rank Models

By Madeleine Udell

[from the editor] A low rank model approximates a table as the (matrix) product of two numerical matrices X and Y. Every example (e.g., patient) is represented by a row of X; every feature (e.g., lab test) is represented by a column of Y. Learn more about low rank models and different types of low rank models in the article “Beyond Principal Components Analysis (PCA): Exploring Low Rank Models for Data Analysis”. In the short article below, the author describes how you can use low rank models to understand your data.

Every row of X represents an example; every column of Y represents a feature. Hence we can cluster the examples by clustering the rows of X, or cluster the features by clustering the columns of Y. We can plot the examples using the first two columns of X as coordinates, or the features using the first two rows of Y. The value predicted by a low rank model for the jth measurement of the ith example is the dot product of the ith row of X (the row for that example) and the jth column of Y (the column for that feature). Entries in the original data set that differ greatly from the value predicted by the low rank model may be noisy or corrupted measurements.
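As a rough illustration of these uses, here is a minimal sketch in Python with NumPy. It rests on several assumptions that are not part of the article: the table is synthetic and fully numeric, the low rank model is fit with a truncated SVD (the PCA-style special case, only one of many possible low rank models), and the sizes, the rank k, and the corrupted entry are made up for the demonstration.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 100 examples, 20 features,
# generated from a rank-2 structure plus a little noise.
rng = np.random.default_rng(0)
m, n, k = 100, 20, 2
A = rng.normal(size=(m, k)) @ rng.normal(size=(k, n))
A += 0.01 * rng.normal(size=(m, n))

# Corrupt one entry so there is something to detect.
bad_i, bad_j = 3, 5
A[bad_i, bad_j] += 8.0

# Fit a rank-k model A ~ X @ Y with a truncated SVD.
# This is one possible low rank model, swapped in here for simplicity.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
X = U[:, :k] * s[:k]   # one row per example, shape (m, k)
Y = Vt[:k, :]          # one column per feature, shape (k, n)

# Coordinates for plotting: first two columns of X for the examples,
# first two rows of Y for the features.
example_coords = X[:, :2]    # shape (m, 2)
feature_coords = Y[:2, :].T  # shape (n, 2)

# Predicted value for the j-th measurement of the i-th example:
# dot product of row i of X with column j of Y.
i, j = 0, 1
predicted_ij = X[i] @ Y[:, j]

# Entries far from their predicted value are candidates for noisy
# or corrupted measurements.
residual = np.abs(A - X @ Y)
flat_index = np.argmax(residual)
flagged = tuple(int(v) for v in np.unravel_index(flat_index, residual.shape))
print("most suspicious entry:", flagged)
```

The rows of X (or the columns of Y) produced this way can also be handed to any standard clustering routine to group examples (or features). How large a residual has to be before an entry counts as suspicious depends on the data and is left as a judgment call here.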