Learning From Data – A Short Course: Exercise 9.6

Try to build some intuition for what the rotation is doing by using the illustrations in Figure 9.1 to qualitatively answer these questions.

(a) If there is a large offset (or bias) in both measured variables, how will this affect the ‘natural axes’, the ones to which the data will be rotated? Should you perform input centering before doing PCA?

I think we don’t need input centering before PCA. However, input centering does simplify the variance calculation.

May 4, 2017 update: The answer is more complex than this, please check this.
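As a rough illustration of why the offset matters, here is a minimal sketch in plain Python. It assumes that "PCA without centering" means taking the top eigenvector of the raw second-moment matrix. The data, the offset of 100, and the helper name principal_angle_deg are made up for this example. In 2-D the principal axis has a closed form, so no library is needed.

```python
import math

def principal_angle_deg(points, center):
    """Angle (in degrees) of the top principal axis of 2-D points.

    With center=False the raw second-moment matrix is used instead of
    the covariance matrix, which mimics skipping input centering.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n if center else 0.0
    my = sum(y for _, y in points) / n if center else 0.0
    a = sum((x - mx) ** 2 for x, _ in points) / n           # entry (1,1)
    c = sum((y - my) ** 2 for _, y in points) / n           # entry (2,2)
    b = sum((x - mx) * (y - my) for x, y in points) / n     # entry (1,2)
    # Major axis of the symmetric matrix [[a, b], [b, c]].
    return math.degrees(0.5 * math.atan2(2 * b, a - c))

# Hypothetical data: spread mostly along the (1, -1) direction,
# with a large offset of 100 in both measured variables.
points = [(100 + t + s, 100 - t + s) for t in range(-5, 6) for s in (-1, 0, 1)]

angle_centered = principal_angle_deg(points, center=True)
angle_uncentered = principal_angle_deg(points, center=False)

print(f"centered:   {angle_centered:.1f} degrees")    # about -45: follows the spread
print(f"uncentered: {angle_uncentered:.1f} degrees")  # about +45: points at the mean
```

In this toy case the uncentered axis points from the origin toward the data mean rather than along the direction of largest spread, which suggests the offset does change the ‘natural axes’ unless the inputs are centered first.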

(b) If one dimension (say x1) is inflated disproportionately (e.g., income is measured in dollars instead of thousands of dollars). How will this affect the ‘natural axes’, the ones to which the data should be rotated? Should you perform input normalization before doing PCA?

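Yes, I think we should perform input normalization before doing PCA. The inflated dimension will have a very large variance because of its unit of measurement rather than the nature of the data, so the first ‘natural axis’ would line up almost entirely with that dimension.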
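The effect of the unit can be sketched in plain Python. The data below is made up (an "income" coordinate and a second variable with similar spread), and the helper names principal_angle_deg and normalize are hypothetical. Normalization here is assumed to mean centering each coordinate and dividing it by its standard deviation.

```python
import math

def principal_angle_deg(points):
    """Angle (in degrees) of the top principal axis of centered 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    a = sum((x - mx) ** 2 for x, _ in points) / n
    c = sum((y - my) ** 2 for _, y in points) / n
    b = sum((x - mx) * (y - my) for x, y in points) / n
    # Major axis of the covariance matrix [[a, b], [b, c]].
    return math.degrees(0.5 * math.atan2(2 * b, a - c))

def normalize(points):
    """Center each coordinate and divide it by its standard deviation."""
    n = len(points)
    cols = list(zip(*points))
    means = [sum(col) / n for col in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n)
            for col, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(p, means, stds))
            for p in points]

# Hypothetical data: first coordinate is income in thousands of dollars.
in_thousands = [(50 + t + s, 50 + t - s)
                for t in range(-5, 6) for s in (-1, 0, 1)]
# Same data, but income is now measured in dollars.
in_dollars = [(1000 * x1, x2) for x1, x2 in in_thousands]

angle_thousands = principal_angle_deg(in_thousands)
angle_dollars = principal_angle_deg(in_dollars)
angle_normalized = principal_angle_deg(normalize(in_dollars))

print(f"thousands of dollars: {angle_thousands:.2f} degrees")   # about 45
print(f"dollars:              {angle_dollars:.2f} degrees")     # close to 0
print(f"dollars, normalized:  {angle_normalized:.2f} degrees")  # about 45 again
```

In this example, switching the unit to dollars swings the first axis almost onto the income coordinate, and normalizing undoes the effect of the unit.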

(c) If you do input whitening, what will the ‘natural axes’ for the inputs be? Should you perform input whitening before doing PCA?

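If I whiten the inputs before doing PCA, the ‘natural axes’ will stay the same after PCA, as suggested by Figure 9.1: whitened data has identity covariance, so every direction carries the same variance and PCA has no preferred rotation. Further discussion is in the book.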
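To make the whitening case concrete, here is a small plain-Python sketch. It assumes whitening means centering the data and multiplying by the inverse square root of the covariance matrix, computed here through the closed-form 2-D eigen-decomposition. The data and the helper names whiten_2d and variance_along are made up for illustration.

```python
import math

def whiten_2d(points):
    """Center 2-D points and multiply by the inverse square root of their covariance."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    a = sum((x - mx) ** 2 for x, _ in points) / n
    c = sum((y - my) ** 2 for _, y in points) / n
    b = sum((x - mx) * (y - my) for x, y in points) / n
    # Principal axes e1 = (cs, sn), e2 = (-sn, cs) and their variances.
    theta = 0.5 * math.atan2(2 * b, a - c)
    cs, sn = math.cos(theta), math.sin(theta)
    lam1 = a * cs * cs + 2 * b * cs * sn + c * sn * sn
    lam2 = a * sn * sn - 2 * b * cs * sn + c * cs * cs
    whitened = []
    for x, y in points:
        # Coordinates along the principal axes, rescaled to unit variance.
        u = (cs * (x - mx) + sn * (y - my)) / math.sqrt(lam1)
        v = (-sn * (x - mx) + cs * (y - my)) / math.sqrt(lam2)
        # Rotate back to the original coordinate frame.
        whitened.append((cs * u - sn * v, sn * u + cs * v))
    return whitened

def variance_along(points, angle_deg):
    """Variance of the points projected onto the direction at angle_deg."""
    ang = math.radians(angle_deg)
    proj = [x * math.cos(ang) + y * math.sin(ang) for x, y in points]
    m = sum(proj) / len(proj)
    return sum((p - m) ** 2 for p in proj) / len(proj)

# Hypothetical correlated data with unequal spread in the two coordinates.
raw = [(3 * t + s, t - s) for t in range(-5, 6) for s in (-1, 0, 1)]
white = whiten_2d(raw)

for angle in (0, 30, 60, 90, 120, 150):
    print(f"{angle:3d} deg  raw: {variance_along(raw, angle):7.3f}"
          f"  whitened: {variance_along(white, angle):.3f}")
```

The raw variances change with the direction, while the whitened ones come out as 1 (up to rounding) in every direction tried, so there is no direction left for PCA to single out.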