Section CB: Change of Basis

We have seen in Section MR that a linear transformation can be represented by
a matrix, once we pick bases for the domain and codomain. How does the matrix
representation change if we choose different bases? Which bases lead to especially
nice representations? From the infinite possibilities, what is the best possible
representation? This section will begin to answer these questions. But first we
need to define eigenvalues for linear transformations and the change-of-basis
matrix.

We now define the notion of an eigenvalue and eigenvector of a linear
transformation. It should not be too surprising, especially if you remind yourself
of the close relationship between matrices and linear transformations.
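
The definition itself holds no surprises: a nonzero vector v ∈ V is an eigenvector of the linear transformation T: V → V for the eigenvalue λ precisely when

T(v) = λv

mirroring the familiar condition for matrices.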

So w_1, w_2, w_3 are eigenvectors of R with eigenvalues (respectively) λ_1 = 3, λ_2 = 0, λ_3 = −1. Notice how the eigenvalue λ_2 = 0 indicates that the eigenvector w_2 is a non-trivial element of the kernel of R, and therefore R is not injective (Exercise CB.T15). ⊠

Of course, these examples are meant only to illustrate the definition of eigenvectors and eigenvalues for linear transformations, and therefore raise the question, “How would I find eigenvectors?” We’ll have an answer before we finish this section. We need one more construction first.

Subsection CBM: Change-of-Basis Matrix

Given a vector space, we know we can usually find many different bases for the
vector space, some nice, some nasty. If we choose a single vector from this vector
space, we can build many different representations of the vector by constructing
the representations relative to different bases. How are these different
representations related to each other? A change-of-basis matrix answers this
question.

Definition CBM: Change-of-Basis Matrix
Suppose that V is a vector space, and I_V: V → V is the identity linear transformation on V. Let B = {v_1, v_2, v_3, …, v_n} and C be two bases of V. Then the change-of-basis matrix from B to C is the matrix representation of I_V relative to B and C,

C_{B,C} = M^{I_V}_{B,C}

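As a minimal worked example (the bases here are chosen purely for illustration, not taken from the text), let V = ℂ^2, let B = {e_1, e_2} be the standard basis, and let C = {c_1, c_2} with c_1 = [1 1]^t and c_2 = [1 −1]^t. The columns of C_{B,C} are the representations of the vectors of B relative to C. Since e_1 = (1/2)c_1 + (1/2)c_2 and e_2 = (1/2)c_1 − (1/2)c_2,

C_{B,C} = [ 1/2    1/2 ]
          [ 1/2   −1/2 ]
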
Notice that this definition is primarily about a single vector space (V) and two bases of V (B, C). The linear transformation (I_V) is necessary but not critical. As you might expect, this matrix has something to do with changing bases. Here is the theorem that gives the matrix its name (not the other way around).

Theorem CB: Change-of-Basis
Suppose that v is a vector in the vector space V and B and C are bases of V. Then

ρ_C(v) = C_{B,C} ρ_B(v)

So the change-of-basis matrix can be used with matrix multiplication to convert a vector representation of a vector (v) relative to one basis (ρ_B(v)) to a representation of the same vector relative to a second basis (ρ_C(v)).
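
Continuing the small example following Definition CBM: with v = [3 1]^t we have ρ_B(v) = [3 1]^t, and

C_{B,C} ρ_B(v) = [ (1/2)(3) + (1/2)(1) ] = [ 2 ] = ρ_C(v)
                 [ (1/2)(3) − (1/2)(1) ]   [ 1 ]

which checks, since 2c_1 + 1c_2 = [3 1]^t.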

Theorem ICBM: Inverse of Change-of-Basis Matrix
Suppose that V is a vector space, and B and C are bases of V. Then the change-of-basis matrix C_{B,C} is nonsingular and

C_{B,C}^{−1} = C_{C,B}

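Checking Theorem ICBM against the same small example: the columns of C_{C,B} are the representations of c_1 and c_2 relative to the standard basis B, so

C_{C,B} = [ 1    1 ]
          [ 1   −1 ]

and a quick multiplication confirms that C_{B,C} C_{C,B} = I_2.
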
We formed two representations of the vector
u above, so
we can again provide a check on our computations by converting from the representation
of u relative
to C to the
representation of u
relative to B,

One more computation that is either a check on our work, or an illustration of a theorem. The two change-of-basis matrices, C_{B,C} and C_{C,B}, should be inverses of each other, according to Theorem ICBM. Here we go,
The computations of the previous example are not meant to present any labor-saving devices, but instead are meant to illustrate the utility of the change-of-basis matrix. However, you might have noticed that C_{C,B} was easier to compute than C_{B,C}. If you needed C_{B,C}, then you could first compute C_{C,B} and then compute its inverse, which, by Theorem ICBM, would equal C_{B,C}.
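
This trick is easy to mechanize. Here is a minimal SymPy sketch (the bases below are hypothetical stand-ins, not the ones from the example above): when B is a nice basis, each column of C_{C,B} is simply the representation ρ_B of a vector of C, and C_{B,C} then comes from Theorem ICBM as an inverse.

    import sympy as sp

    # Hypothetical bases of C^3: B is the standard basis (nice), C is not.
    c_vectors = [sp.Matrix([1, 1, 0]), sp.Matrix([0, 1, 1]), sp.Matrix([1, 0, 1])]

    # With B the standard basis, rho_B(v) = v, so the columns of C_{C,B}
    # are just the vectors of C themselves.
    C_CB = sp.Matrix.hstack(*c_vectors)

    # Theorem ICBM: C_{B,C} is the inverse of C_{C,B}.
    C_BC = C_CB.inv()

    # Check Theorem CB on a sample vector: rho_C(v) = C_{B,C} rho_B(v),
    # and reassembling v from its C-coordinates recovers v.
    v = sp.Matrix([2, 3, 4])
    rho_C_v = C_BC * v
    assert C_CB * rho_C_v == v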

Here’s another illustrative example. We have been concentrating on working with
abstract vector spaces, but all of our theorems and techniques apply just as well
to ℂ^m,
the vector space of column vectors. We only need to use more complicated
bases than the standard unit vectors (Theorem SUVB) to make things
interesting.

We could continue further with this example, perhaps by computing the representation
of y relative
to the basis C
directly as a check on our work (Exercise CB.C20). Or we could choose another vector to
play the role of y
and compute two different representations of this vector relative to the two bases
B and
C.
⊠

Subsection MRS: Matrix Representations and Similarity

Here is the main theorem of this section. It looks a bit involved at first glance,
but the proof should make you realize it is not all that complicated. In any event,
we are more interested in a special case.

The other change-of-basis matrix we’ll compute is C_{E,D}. However, since E is a nice basis (and D is not) we’ll turn it around and instead compute C_{D,E}, then apply Theorem ICBM and use an inverse to obtain C_{E,D}.

This is the third surprise of this chapter. Theorem SCB considers the special case where a linear transformation has the same vector space for the domain and codomain (V). We build a matrix representation of T using the basis B simultaneously for both the domain and codomain (M^T_{B,B}), and then we build a second matrix representation of T, now using the basis C for both the domain and codomain (M^T_{C,C}). Then these two representations are related via a similarity transformation (Definition SIM) using a change-of-basis matrix (C_{B,C})!
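
Chasing representations makes the similarity explicit: for any v in V,

M^T_{C,C} ρ_C(v) = ρ_C(T(v)) = C_{B,C} ρ_B(T(v)) = C_{B,C} M^T_{B,B} ρ_B(v) = C_{B,C} M^T_{B,B} C_{B,C}^{−1} ρ_C(v)

so M^T_{C,C} = C_{B,C} M^T_{B,B} C_{B,C}^{−1}, or equivalently M^T_{B,B} = C_{B,C}^{−1} M^T_{C,C} C_{B,C}.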

Check that B is a basis of M_{22} by first establishing the linear independence of B and then employing Theorem G to get the spanning property easily. Here is a second set of 2 × 2 matrices, which also forms a basis of M_{22} (Example BM),
We can build two matrix representations of
T, one relative
to B and one
relative to C.
Each is easy, but for wildly different reasons. In our computation of the matrix representation
relative to B
we borrow some of our work in Example ELTBM. Here are the representations,
then the explanation.

Not quite as pretty. The purpose of this example is to illustrate Theorem SCB. This theorem says that the two matrix representations, M^T_{B,B} and M^T_{C,C}, of the one linear transformation, T, are related by a similarity transformation using the change-of-basis matrix C_{B,C}. Let’s compute this change-of-basis matrix. Notice that since C is such a nice basis, this is fairly straightforward,
This should look and feel exactly like the process for diagonalizing
a matrix, as was described in Section SD. And it is.
⊠
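
To make the connection with Section SD concrete: the columns of C_{C,B} are the representations ρ_B of the vectors of C, so when C is a basis of eigenvectors, C_{C,B} plays precisely the role of the nonsingular matrix S of eigenvector columns in the diagonalization S^{−1} A S = D, with A = M^T_{B,B} and D = M^T_{C,C} diagonal.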

We can now return to the question of computing an eigenvalue or
eigenvector of a linear transformation. For a linear transformation of the form
T : V → V , we
know that representations relative to different bases are similar matrices. We
also know that similar matrices have equal characteristic polynomials by
Theorem SMEE. We will now show that eigenvalues of a linear transformation
T
are precisely the eigenvalues of any matrix representation of
T.
Since the choice of a different matrix representation leads to a similar matrix,
there will be no “new” eigenvalues obtained from this second representation.
Similarly, the change-of-basis matrix can be used to show that eigenvectors
obtained from one matrix representation will be precisely those obtained
from any other representation. So we can determine the eigenvalues and
eigenvectors of a linear transformation by forming one matrix representation,
using any basis we please, and analyzing the matrix in the manner of
Chapter E.

Theorem EER: Eigenvalues, Eigenvectors, Representations
Suppose that T: V → V is a linear transformation and B is a basis of V. Then v ∈ V is an eigenvector of T for the eigenvalue λ if and only if ρ_B(v) is an eigenvector of M^T_{B,B} for the eigenvalue λ. □

Proof (⇒) Assume that v ∈ V is an eigenvector of T for the eigenvalue λ. Then v ≠ 0, and since ρ_B is injective, ρ_B(v) ≠ 0. By Theorem FTMR,

M^T_{B,B} ρ_B(v) = ρ_B(T(v)) = ρ_B(λv) = λ ρ_B(v)

so ρ_B(v) is an eigenvector of the matrix M^T_{B,B} for the eigenvalue λ.

(⇐) Assume that ρ_B(v) is an eigenvector of M^T_{B,B} for the eigenvalue λ. Then ρ_B(v) ≠ 0, so v ≠ 0, and

ρ_B(T(v)) = M^T_{B,B} ρ_B(v) = λ ρ_B(v) = ρ_B(λv)

Since ρ_B is injective, T(v) = λv, which by Definition EELT says v is an eigenvector of T for the eigenvalue λ. ■

Subsection CELT: Computing Eigenvectors of Linear Transformations

Knowing that the eigenvalues of a linear transformation are the
eigenvalues of any representation, no matter what the choice of the basis
B might
be, we could now unambiguously define items such as the characteristic
polynomial of a linear transformation, rather than a matrix. We’ll say that again
— eigenvalues, eigenvectors, and characteristic polynomials are intrinsic
properties of a linear transformation, independent of the choice of a basis used to
construct a matrix representation.
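
For instance, it would now be safe to define the characteristic polynomial of a linear transformation T: V → V on an n-dimensional vector space as p_T(x) = det(M^T_{B,B} − x I_n) for any basis B of V, since Theorem SMEE guarantees the answer is the same no matter which basis is chosen.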

As a practical matter, how does one compute the eigenvalues and eigenvectors of a linear transformation of the form T: V → V? Choose a nice basis B for V, one where the vector representations of the values of the linear transformation necessary for the matrix representation are easy to compute. Construct the matrix representation relative to this basis, and find the eigenvalues and eigenvectors of this matrix using the techniques of Chapter E. The resulting eigenvalues of the matrix are precisely the eigenvalues of the linear transformation. The eigenvectors of the matrix are column vectors that need to be converted to vectors in V through application of ρ_B^{−1}.
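
As a small illustration of this recipe, here is a SymPy sketch (a stand-in example only, using the transpose map T(A) = A^t on M_{22}, not an example from this section): build M^T_{B,B} relative to the standard basis B = {E_{11}, E_{12}, E_{21}, E_{22}}, extract eigenpairs, and un-coordinatize with ρ_B^{−1}.

    import sympy as sp

    # T: M_22 -> M_22 by T(A) = A^t (transpose), B the standard basis.
    # Columns are rho_B(T(E11)), rho_B(T(E12)), rho_B(T(E21)), rho_B(T(E22)).
    M = sp.Matrix([[1, 0, 0, 0],
                   [0, 0, 1, 0],
                   [0, 1, 0, 0],
                   [0, 0, 0, 1]])

    # Eigenvalues of the representation are the eigenvalues of T (Theorem EER).
    for eigenvalue, multiplicity, vectors in M.eigenvects():
        for x in vectors:
            # rho_B^{-1}: turn the column vector back into a 2x2 matrix.
            A = sp.Matrix(2, 2, list(x))
            assert A.T == eigenvalue * A   # A really is an eigenvector of T
            print(eigenvalue, list(A))

Here λ = 1 yields the symmetric matrices and λ = −1 the skew-symmetric ones, exactly the eigenvectors of the transpose map.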

Now consider the case where the matrix representation of a linear transformation is diagonalizable. The n linearly independent eigenvectors that must exist for the matrix (Theorem DC) can be converted (via ρ_B^{−1}) into eigenvectors of the linear transformation. A matrix representation of the linear transformation relative to a basis of eigenvectors will be a diagonal matrix — an especially nice representation! Though we did not know it at the time, the diagonalizations of Section SD were really finding especially pleasing matrix representations of linear transformations.
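
In symbols: if {x_1, x_2, …, x_n} is a basis of ℂ^n composed of eigenvectors of M^T_{B,B}, with eigenvalues λ_1, λ_2, …, λ_n, then C = {ρ_B^{−1}(x_1), ρ_B^{−1}(x_2), …, ρ_B^{−1}(x_n)} is a basis of V composed of eigenvectors of T, and M^T_{C,C} is the diagonal matrix whose diagonal entries are λ_1, λ_2, …, λ_n.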

To find the eigenvalues and eigenvectors of
S
we will build a matrix representation and analyze the matrix. Since
Theorem EER places no restriction on the choice of the basis
B, we
may as well use a basis that is easy to work with. So set

According to Theorem EER the eigenvectors just listed as basis vectors for the eigenspaces of M are vector representations (relative to B) of eigenvectors for S. So the application of the inverse function ρ_B^{−1} will convert these column vectors into elements of the vector space M_{22} (2 × 2 matrices) that are eigenvectors of S. Since ρ_B is an isomorphism (Theorem VRILT), so is ρ_B^{−1}. Applying the inverse function will then preserve linear independence and spanning properties, so with a sweeping application of the Coordinatization Principle and some extensions of our previous notation for eigenspaces and geometric multiplicities, we can write,

At this point you should have computed enough matrix
representations to predict that the result of representing
S relative
to C
will be a diagonal matrix. Computing this representation is an example of how
Theorem SCB generalizes the diagonalizations from Section SD. For the record,
here is the diagonal representation,

Our interest in this example is not necessarily building nice representations,
but instead we want to demonstrate how eigenvalues and eigenvectors
are an intrinsic property of a linear transformation, independent of any
particular representation. To this end, we will repeat the foregoing, but replace
B by
another basis. We will make this basis different, but not extremely so,

Of course this is not news. We now know that M = M^S_{B,B} and N = M^S_{D,D} are similar matrices (Theorem SCB). But Theorem SMEE told us long ago that similar matrices have identical characteristic polynomials. Now compute eigenvectors for the matrix representation N; they will be different from what we found for M,
However, the eigenspace for λ = −2 would at first glance appear to be different. Here are the two eigenspaces for λ = −2, first the eigenspace obtained from M = M^S_{B,B}, then followed by the eigenspace obtained from N = M^S_{D,D}.
Subspaces generally have many bases, and that is the situation here. With a
careful proof of set equality, you can show that these two eigenspaces are equal
sets. The key observation to make such a proof go is that

which will establish that the second set is a subset of the first. With equal
dimensions, Theorem EDYES will finish the task. So the eigenvalues of a linear
transformation are independent of the matrix representation employed to compute
them! ⊠

C30 Contributed by Robert Beezer
With the domain and codomain being identical, we will build a matrix representation using the same basis for both the domain and codomain. The eigenvalues of the matrix representation will be the eigenvalues of the linear transformation, and we can obtain the eigenvectors of the linear transformation by un-coordinatizing (Theorem EER). Since the method does not depend on which basis we choose, we can choose a natural basis for ease of computation, say,
C40 Contributed by Robert Beezer
Begin with a matrix representation of R, any matrix representation, but use the same basis for both instances of S_{22}. We’ll choose a basis that makes it easy to compute vector representations in S_{22}.

The three vectors that occur as basis elements for these eigenspaces will together form a linearly independent set (check this!). So these column vectors may be employed in a matrix that will diagonalize the matrix representation. If we “un-coordinatize” these three column vectors relative to the basis B, we will find three linearly independent elements of S_{22} that are eigenvectors of the linear transformation R (Theorem EER). A matrix representation relative to this basis of eigenvectors will be diagonal, with the eigenvalues (λ = 2, 1) as the diagonal elements. Here we go,

Because the three eigenvalues are distinct, the three basis vectors from the three eigenspaces form a linearly independent set (Theorem EDELI). Theorem EER says we can un-coordinatize these eigenvectors to obtain eigenvectors of Q. By Theorem ILTLI the resulting set will remain linearly independent. Set

Since λ ≠ 0, we can apply T^{−1} to both sides of T(v) = λv to obtain v = T^{−1}(λv) = λ T^{−1}(v), and therefore

T^{−1}(v) = (1/λ) v

which says that 1/λ is an eigenvalue of T^{−1} with eigenvector v. Note that it is possible to prove that any eigenvalue of an invertible linear transformation is never zero. So the hypothesis that λ be nonzero is just a convenience for this problem.