Mathematics for the interested outsider

The Meaning of the SVD

where and are unitary transformations on and , respectively, and is a “diagonal” transformation, in the sense that its matrix looks like

where really is a nonsingular diagonal matrix.

So what’s it good for?

Well, it’s a very concrete representation of the first isomorphism theorem. Every transformation is decomposed into a projection, an isomorphism, and an inclusion. But here the projection and the inclusion are built up into unitary transformations (as we verified is always possible), and the isomorphism is the part of .

Incidentally, this means that we can read off the rank of from the number of rows in , while the nullity is the number of zero columns in .

More heuristically, this is saying we can break any transformation into three parts. First, picks out an orthonormal basis of “canonical input vectors”. Then handles the actual transformation, scaling the components in these directions, or killing them off entirely (for the zero columns). Finally, takes us out of the orthonormal basis of “canonical output vectors”. It tells us that if we’re allowed to pick the input and output bases separately, we kill off one subspace (the kernel) and can diagonalize the action on the remaining subspace.

So we can check ourselves by calculating . If this extends into the zero rows of there’s no possible way to satisfy the equation. That is, we can quickly see if the system is unsolvable. On the other hand, if lies completely within the nonzero rows of , it’s straightforward to solve this equation. We first write down the new transformation

where it’s not quite apparent from this block form, but we’ve also taken a transpose. That is, there are as many columns in as there are rows in , and vice versa. The upshot is that is a transformation which kills off the same kernel as does, but is otherwise the identity. Thus we can proceed

This “undoes” the scaling from . We can also replace the lower rows of with variables, since applying will kill them off anyway. Finally, we find

and, actually, a whole family of solutions for the variables we could introduce in the previous step. But this will at least give one solution, and then all the others differ from this one by a vector in , as usual.

About this weblog

This is mainly an expository blath, with occasional high-level excursions, humorous observations, rants, and musings. The main-line exposition should be accessible to the “Generally Interested Lay Audience”, as long as you trace the links back towards the basics. Check the sidebar for specific topics (under “Categories”).

I’m in the process of tweaking some aspects of the site to make it easier to refer back to older topics, so try to make the best of it for now.