The final exam is on Wednesday, May 5, from 10:00am to noon. It will cover the
material from the entire course, with a slight emphasis on the material from this sheet.

Chapter 4: Eigenvalues

§ 3:

Gram-Schmidt orthogonalization

We've seen how a basis consisting of vectors orthogonal to one another can prove useful;
this section is about how to build such a basis.

The starting point is our old formula for the projection of one vector onto another:

v - [ < w,v > / < w,w > ] w is perpendicular to w.

Gram-Schmidt orthogonalization consists of repeatedly using this formula to replace a
collection of vectors with ones that are orthogonal to one another, without changing
their span. Starting with a collection {v1,…,vn} of vectors in V,

let w1 = v1, then let w2 = v2 - [ < w1,v2 > / < w1,w1 > ] w1 .

Then w1 and w2 are orthogonal, and since w2 is a linear combination of
w1 = v1 and v2, while the above equation can also be rewritten to give
v2 as a linear combination of w1 and w2, the span is unchanged. Continuing,

let w3 = v3 - [ < w1,v3 > / < w1,w1 > ] w1 - [ < w2,v3 > / < w2,w2 > ] w2 ;
then since w1 and w2 are orthogonal, it is not hard to check that w3 is orthogonal
to both of them, and by the same argument, the span is unchanged (in this case,
span{w1,w2,w3} = span{w1,w2,v3} = span{v1,v2,v3}).

Doing this all the way to n will replace v1,…,vn with orthogonal vectors
w1,…,wn, without changing the span.
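The procedure above can be sketched in a few lines of numpy (the vectors here are made-up example inputs, not ones from the course):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthogonalize a list of vectors without changing their span.

    Step k: w_k = v_k minus the projection of v_k onto each earlier w_i,
    i.e.  w_k = v_k - sum_i [ <w_i, v_k> / <w_i, w_i> ] w_i.
    """
    ws = []
    for v in vectors:
        v = np.asarray(v, dtype=float)
        w = v.copy()
        for u in ws:
            # subtract the projection of v onto the already-built u
            w = w - (u @ v) / (u @ u) * u
        ws.append(w)
    return ws

vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
ws = gram_schmidt(vs)

# every pair of outputs is orthogonal, and the span is unchanged
for i in range(3):
    for j in range(i + 1, 3):
        assert abs(ws[i] @ ws[j]) < 1e-12
assert np.linalg.matrix_rank(np.vstack(ws)) == 3
```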

One thing worth noting is that if two vectors are orthogonal, then any scalar
multiples of them are, too. This means that if the coordinates of one of our
wk are not to our satisfaction (having an ugly denominator, perhaps),
we can scale it to change the coordinates to something more pleasant. It is interesting to
note that in so doing, the later vectors wk are unchanged, since our scalar can
be pulled out of both the top inner product and the bottom one in later calculations,
and cancelled.
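This cancellation is easy to check numerically; here is a small sketch (with made-up example vectors) showing that rescaling w1 leaves w2 unchanged:

```python
import numpy as np

v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, 0.0, 1.0])

w1 = v1
w2 = v2 - (w1 @ v2) / (w1 @ w1) * w1

# Scale w1 by any nonzero c: one factor of c from the top inner product
# and two from the bottom cancel against the c multiplying w1 itself,
# so the projection term, and hence w2, is unchanged.
w1_scaled = 3 * w1
w2_again = v2 - (w1_scaled @ v2) / (w1_scaled @ w1_scaled) * w1_scaled

assert np.allclose(w2, w2_again)
```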

Given a subspace W of V with an orthogonal basis {w1,…,wn}, we can use this same
formula to define

projW(v) = [ < w1,v > / < w1,w1 > ] w1 + … + [ < wn,v > / < wn,wn > ] wn ,

the orthogonal projection of v into W. This vector is in W, and by the Gram-Schmidt
argument, v - projW(v) is orthogonal to all of the wi, so it is orthogonal to every
linear combination, i.e., it is orthogonal to every vector in W. As a result:

||v - projW(v)|| ≤ ||v - w|| for every vector w in W. (**)

In the case that the wi are not just orthogonal but also orthonormal,
we can simplify this somewhat:

projW(v) = < w1,v > w1 + … + < wn,v > wn = (w1 w1^T + … + wn wn^T) v = Pv ,

where P = (w1 w1^T + … + wn wn^T) is the projection matrix giving us
orthogonal projection.

This projection matrix has three useful properties: (1) since it has
the property (**), the matrix you get will be the same no matter what orthonormal
basis you use to build it; (2) it is symmetric (P^T = P); and (3) it is
idempotent, meaning P^2 = P (this is because the orthogonal projection of a vector
already in W (e.g., Pv) is that same vector).
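All three properties can be seen concretely; here is a sketch with an assumed orthonormal basis for a plane W in R^3:

```python
import numpy as np

# An orthonormal basis for a plane W in R^3 (example vectors, assumed).
w1 = np.array([1.0, 0.0, 0.0])
w2 = np.array([0.0, 1.0, 1.0]) / np.sqrt(2)

# P = w1 w1^T + w2 w2^T
P = np.outer(w1, w1) + np.outer(w2, w2)

assert np.allclose(P.T, P)    # symmetric: P^T = P
assert np.allclose(P @ P, P)  # idempotent: projecting twice = projecting once

v = np.array([2.0, 3.0, 5.0])
p = P @ v
# v - Pv is orthogonal to both basis vectors, hence to all of W:
assert abs((v - p) @ w1) < 1e-12 and abs((v - p) @ w2) < 1e-12
```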

If we think of the vectors wi as the columns of a matrix A, then W = C(A),
the column space of A, and so the result (**) is talking about the least squares
solution to the equation Ax = v ! The closest vector Ax to v is then Pv, which,
looking at what we did before, means that P = A(A^T A)^(-1) A^T. This formula,
however, makes sense even if the columns of A are not orthogonal; if we picked
orthonormal ones and computed P, we would still get the least squares solution,
which this formula also gives!
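A quick numerical check of this last claim, with an assumed example matrix whose columns are independent but not orthogonal:

```python
import numpy as np

# Columns of A span W; they need not be orthogonal (example data, assumed).
A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 2.0]])
v = np.array([1.0, 2.0, 3.0])

# Projection onto C(A) via P = A (A^T A)^(-1) A^T
P = A @ np.linalg.inv(A.T @ A) @ A.T

# The least squares solution of Ax = v gives the same closest vector:
x, *_ = np.linalg.lstsq(A, v, rcond=None)
assert np.allclose(A @ x, P @ v)
```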

§ 4:

Orthogonal matrices

We've seen that having a basis consisting of orthonormal vectors can simplify some
of our previous calculations. Now we'll see where some of them come from.

An n×n matrix Q is called orthogonal if its columns form an
orthonormal basis for Rn. This means
< (ith column of Q), (jth column of Q) > = 1 if i = j, and 0 otherwise.
This in turn means that Q^T Q = I, which in turn means Q^T = Q^(-1) !
So an orthogonal matrix is one whose inverse is equal to its own transpose.
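For a concrete instance, a 2×2 rotation matrix (chosen here as an assumed example) is orthogonal, and inverting it really is just transposing:

```python
import numpy as np

theta = 0.7  # any angle works
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # columns are orthonormal

assert np.allclose(Q.T @ Q, np.eye(2))      # Q^T Q = I
assert np.allclose(np.linalg.inv(Q), Q.T)   # so Q^(-1) = Q^T
```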

A basic fact about a symmetric matrix A : if v1 and v2 are eigenvectors
for A with different eigenvalues λ1, λ2, then v1 and v2
are orthogonal. (Quick reason: λ1 < v1,v2 > = < Av1,v2 > = < v1,Av2 > = λ2 < v1,v2 >,
and since λ1 ≠ λ2, this forces < v1,v2 > = 0.)

This is a main ingredient needed to show: If A is a symmetric n×n matrix,
then A is always diagonalizable; in fact there is an orthonormal basis for
Rn consisting of eigenvectors of A. This means that the matrix P, with
AP = PD , whose columns are a basis of eigenvectors for A, can (when A is
symmetric) be chosen to be an orthogonal matrix.
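This is exactly what numpy's symmetric eigensolver hands back; a sketch with an assumed example matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # symmetric example matrix (assumed)

# eigh is numpy's routine for symmetric matrices; the columns of P are
# an orthonormal basis of eigenvectors, and D holds the eigenvalues.
eigenvalues, P = np.linalg.eigh(A)
D = np.diag(eigenvalues)

assert np.allclose(P.T @ P, np.eye(2))  # P is orthogonal
assert np.allclose(A @ P, P @ D)        # AP = PD, so A = P D P^T
```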

Wow, short section.

§ 5:

Orthogonal complements

This notion of orthogonal vectors can even be used to reinterpret some of our
dearly-held results about systems of linear equations, where all of this stuff began.

Starting with Ax = 0, this can be interpreted as saying that
< (every row of A), x > = 0, i.e., x is orthogonal to every row of
A. This in turn implies that x is orthogonal to every linear combination of
rows of A, i.e., x is orthogonal to every vector in the row space of A.

This leads us to introduce a new concept: the orthogonal complement
of a subspace W in a vector space V, denoted W⊥, is the collection
of vectors v with v ⊥ w for every vector w ∈ W. It is not hard to
see that these vectors form a subspace of V; the sum of two vectors orthogonal
to w, for example, is orthogonal to w, so the sum of two vectors
in W⊥ is also in W⊥. The same is true for scalar multiples.

Some basic facts:

For every subspace W, W ∩ W⊥ = {0} (since anything in both is
orthogonal to itself, and only the 0-vector has that property).

Any vector v ∈ V can be written, uniquely, as v = w + w⊥, for w ∈ W and
w⊥ ∈ W⊥ ; w in fact is projW(v).
v - projW(v) will be in W⊥, more or less by definition of
projW(v). The uniqueness comes from
the result above about intersections.

Even further, a basis for W and a basis for W⊥ together form a basis for
V; this implies that dim(W) + dim(W⊥) = dim(V).

Finally, (W⊥)⊥ = W ; this is because W is contained in (W⊥)⊥
(a vector in W is orthogonal to every vector that is orthogonal to things in W),
and the dimensions of the two spaces are the same.

The importance that this has to systems of equations stems from the following facts
(here N(A), R(A), and C(A) denote the null space, row space, and column space of A):

N(A) = R(A)⊥ (this is what we noted, actually, at the beginning
of this section!)

R(A) = N(A)⊥

C(A) = N(A^T)⊥

So, for example, to compute a basis for W⊥, start with a basis for W, writing
the basis vectors as the columns of a matrix A, so that W = C(A); then
W⊥ = C(A)⊥ = R(A^T)⊥ = N(A^T), which we know how
to compute a basis for!
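That recipe can be carried out numerically; here is a sketch (with an assumed example plane W in R^3) that reads the null space of A^T off the SVD:

```python
import numpy as np

# Basis of W as the columns of A (example vectors, assumed): a plane in R^3.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])

# W-perp = N(A^T): right singular vectors of A^T with zero singular value.
U, s, Vt = np.linalg.svd(A.T)
rank = int(np.sum(s > 1e-12))
basis_perp = Vt[rank:]  # rows form a basis of N(A^T)

# Each complement vector is orthogonal to every column of A:
assert np.allclose(basis_perp @ A, 0)
# dim(W) + dim(W-perp) = dim(V):
assert rank + len(basis_perp) == 3
```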