Section IS: Invariant Subspaces

We have seen in Section NLT that nilpotent linear transformations are almost
never diagonalizable (Theorem DNLT), yet have matrix representations that
are very nearly diagonal (Theorem CFNLT). Our goal in this section,
and the next (Section JCF), is to obtain a matrix representation of any
linear transformation that is very nearly diagonal. A key step in reaching
this goal is an understanding of invariant subspaces, and a particular
type of invariant subspace that contains vectors known as “generalized
eigenvectors.”

Subsection IS: Invariant Subspaces

As is often the case, we start with a definition.

Definition IS: Invariant Subspace. Suppose that T : V → V is a linear
transformation and W is a subspace of V. Suppose further that T(w) ∈ W
for every w ∈ W. Then W is an invariant subspace of V relative to T.
△

We do not have any special notation for an invariant subspace, so it is important
to recognize that an invariant subspace is always relative to both a superspace
(V) and a linear transformation (T), which will sometimes not be mentioned, yet
will be clear from the context. Note also that the linear transformation involved
must have an equal domain and codomain — the definition would not make much sense
if our outputs were not of the same type as our inputs.

As usual, we begin with an example that demonstrates the existence of
invariant subspaces. We will return later to understand how this example was
constructed, but for now, just understand how we check the existence of the
invariant subspaces.
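
Such a check can be automated when T is given by a matrix: a span ⟨{w₁, w₂}⟩ is invariant exactly when the image of each spanning vector is again a linear combination of w₁ and w₂. Here is a minimal SymPy sketch of such a check; the matrix and vectors are hypothetical stand-ins, not those of the example below.

    from sympy import Matrix

    # Hypothetical transformation T(x) = A*x and candidate spanning vectors
    A = Matrix([[1, 1, 0],
                [0, 1, 0],
                [1, 0, 2]])
    w1 = Matrix([-1, 1, 0])
    w2 = Matrix([-1, 0, 1])

    P = Matrix.hstack(w1, w2)                # spanning vectors as columns
    for w in (w1, w2):
        # solve P*c = A*w; a ValueError here would mean A*w lies outside the span
        c, params = P.gauss_jordan_solve(A * w)
        print(c.T)                           # coefficients expressing A*w in w1, w2

Here both solves succeed, so the span is invariant; for a non-invariant subspace the solve would fail for some spanning vector.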

Example TIS: Two invariant subspaces. Consider the linear transformation T : ℂ⁴ → ℂ⁴ defined by T(x) = Ax where A is given by

and set W = ⟨{w₁, w₂}⟩. We verify that W is an invariant subspace of ℂ⁴ with respect to T. By the definition of W, any vector chosen from W can be written as a linear combination of w₁ and w₂. Suppose that w ∈ W, and then check the details of the following verification,

and set X = ⟨{x₁, x₂}⟩. We verify that X is an invariant subspace of ℂ⁴ with respect to T. By the definition of X, any vector chosen from X can be written as a linear combination of x₁ and x₂. Suppose that x ∈ X, and then check the details of the following verification,

So, by Definition IS, X is an invariant subspace of ℂ⁴ relative to T.

There is a bit of magic in each of these verifications where the two outputs of
T
happen to equal linear combinations of the two inputs. But this is the essential
nature of an invariant subspace. We’ll have a peek under the hood later, and it
won’t look so magical after all.

As a hint of things to come, verify that B = {w₁, w₂, x₁, x₂} is a basis of ℂ⁴. Splitting this basis in half, Theorem DSFB tells us that ℂ⁴ = W ⊕ X. To see why a decomposition of a vector space into a direct sum of invariant subspaces might be interesting, construct the matrix representation of T relative to B, M^T_{B,B}. Hmmmmmm.
⊠

Example TIS is a bit mysterious at this stage. Do we know any other
examples of invariant subspaces? Yes, as it turns out, we have already seen quite
a few. We’ll give some examples now, and in more general situations,
describe broad classes of invariant subspaces with theorems. First up are
eigenspaces.

Theorem EIS is general enough to determine that an entire eigenspace is an
invariant subspace, or that simply the span of a single eigenvector is an
invariant subspace. It is not always the case that any subspace of an invariant
subspace is again an invariant subspace, but eigenspaces do have this
property. Here is an example of the theorem, which also allows us to very
quickly build several invariant subspaces.

Build a matrix representation of S
relative to the standard basis (Definition MR, Example BM) and compute eigenvalues and
eigenspaces of S
with the computational techniques of Chapter E in concert with Theorem EER.
Then

So by Theorem EIS, both ℰ_S(1) and ℰ_S(2) are invariant subspaces of M₂₂ relative to S. However, Theorem EIS provides even more invariant subspaces. Since ℰ_S(1) has dimension 1, it has no interesting subspaces; however, ℰ_S(2) has dimension 2 and has a plethora of subspaces. For example, set

and define U = ⟨{u}⟩. Then since U is a subspace of ℰ_S(2), Theorem EIS says that U is an invariant subspace of M₂₂ (or we could check this claim directly, based simply on the fact that u is an eigenvector of S).
⊠

For every linear transformation there are some obvious, trivial invariant subspaces. Suppose that T : V → V is a linear transformation. Then simply because T is a function (Definition LT), the subspace V is an invariant subspace of T. In only a minor twist on this theme, the range of T, ℛ(T), is an invariant subspace of T by Definition RLT. Finally, Theorem LTTZZ provides the justification for claiming that {0} is an invariant subspace of T.

That the trivial subspace is always an invariant subspace is a special case of the next theorem. As an easy exercise before reading the next theorem, prove that the kernel of a linear transformation (Definition KLT), K(T), is an invariant subspace. We’ll wait.

Two interesting special cases of Theorem KPIS occur when we choose k = 0 and k = 1: these choices yield the trivial subspace, K(I_V) = {0}, and the kernel, K(T), respectively. Rather than give an example of this theorem, we will refer you back to Example KPNLT, where we work with null spaces of the first four powers of a nilpotent matrix. By Theorem KPIS each of these null spaces is an invariant subspace of the associated linear transformation.

Here’s one more example of invariant subspaces we have encountered
previously.

Example ISJB: Invariant subspaces and Jordan blocks. Refer back to Example CFNLT. We decomposed the vector space ℂ⁶ into a direct sum of the subspaces Z₁, Z₂, Z₃, Z₄. The union of the basis vectors for these subspaces is a basis of ℂ⁶, which we reordered prior to building a matrix representation of the linear transformation T. A principal reason for this reordering was to create invariant subspaces (though it was not obvious then).

Recall from the proof of Theorem CFNLT, or the computations in Example CFNLT, that the first elements of X₁ and X₂ are in the kernel of T, K(T), and each element of X₁ and X₂ is the output of T when evaluated with the subsequent element of the set. This was by design, and it is this feature of these basis vectors that leads to the nearly diagonal matrix representation with Jordan blocks. However, we also recognize now that this property of these basis vectors allows us to conclude easily that X₁ and X₂ are invariant subspaces of ℂ⁶ relative to T.

Furthermore, ℂ⁶ = X₁ ⊕ X₂ (Theorem DSFB). So the domain of T is the direct sum of invariant subspaces and the resulting matrix representation has a block diagonal form. Hmmmmm.
⊠

Subsection GEE: Generalized Eigenvectors and Eigenspaces

We now define a new type of invariant subspace and explore its key
properties. This generalization of eigenvalues and eigenspaces will allow us to
move from diagonal matrix representations of diagonalizable matrices to
nearly diagonal matrix representations of arbitrary matrices. Here are the
definitions.

So the generalized eigenspace is composed of generalized eigenvectors, plus the zero vector. As the name implies, the generalized eigenspace is a subspace of V. But more topically, it is an invariant subspace of V relative to T.
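
In symbols, Definition GES describes the set

    G_T(λ) = { x ∈ V | (T − λI_V)^k(x) = 0 for some integer k ≥ 0 }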

Now we show that G_T(λ) is invariant relative to T. Suppose that x ∈ G_T(λ). Then by Definition GES there is an integer k such that (T − λI_V)^k(x) = 0. The following argument is due to Zoltan Toth.
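
In outline, the verification rests on the fact that T commutes with powers of T − λI_V (compare Exercise T10),

    (T − λI_V)^k(T(x)) = T((T − λI_V)^k(x)) = T(0) = 0

so that T(x) ∈ G_T(λ) by Definition GES.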

Before we compute some generalized eigenspaces, we state and prove one theorem that will make it much easier to create a generalized eigenspace, since it will allow us to use tools we already know well, and will remove some of the ambiguity of the clause “for some k” in the definition.
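
Theorem GEK: Generalized Eigenspace as a Kernel. Suppose that T : V → V is a linear transformation with eigenvalue λ, and dim(V) = n. Then G_T(λ) = K((T − λI_V)ⁿ).
□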

Proof. The conclusion of this theorem is a set equality, so we will apply Definition SE by establishing two set inclusions. First, suppose that x ∈ G_T(λ). Then there is an integer k such that (T − λI_V)^k(x) = 0. This is equivalent to the statement that x ∈ K((T − λI_V)^k). No matter what the value of k is, Theorem KPLT gives

Theorem GEK allows us to compute generalized eigenspaces as a single kernel (or the null space of a matrix representation) with tools like Theorem KNSI and Theorem BNS. Also, we do not need to consider all possible powers k and can simply consider the case where k = n. It is worth noting that the “regular” eigenspace is a subspace of the generalized eigenspace, since
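
    ℰ_T(λ) = K(T − λI_V) ⊆ K((T − λI_V)ⁿ) = G_T(λ)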

where the subset inclusion is a consequence of Theorem KPLT.
Also, there is no such thing as a “generalized eigenvalue.” If λ is not an eigenvalue of T, then the kernel of T − λI_V is trivial and therefore subsequent powers of T − λI_V also have trivial kernels (Theorem KPLT). So the generalized eigenspace of a scalar that is not already an eigenvalue would be trivial. Alright, we know enough now to compute some generalized eigenspaces. We will record some information about algebraic and geometric multiplicities of eigenvalues (Definition AME, Definition GME) as we go, since these observations will be of interest in light of some future theorems.
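
When T is given by a square matrix A, Theorem GEK turns this into a single null space computation: form (A − λI)ⁿ and find its null space. Here is a minimal SymPy sketch; the matrix and eigenvalue are hypothetical stand-ins, not those of the examples below.

    from sympy import Matrix, eye

    # Hypothetical matrix with eigenvalues 2 (algebraic multiplicity 2) and 3
    A = Matrix([[2, 1, 0],
                [0, 2, 0],
                [0, 0, 3]])
    n = A.rows
    lam = 2

    # Theorem GEK: G(lam) is the null space of (A - lam*I)^n, n the dimension of the domain
    eigenspace = (A - lam * eye(n)).nullspace()
    gen_eigenspace = ((A - lam * eye(n)) ** n).nullspace()
    print(len(eigenspace), len(gen_eigenspace))     # 1 2

Note how the eigenspace sits properly inside the generalized eigenspace here, and how the dimension of the generalized eigenspace matches the algebraic multiplicity.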

Example GE4: Generalized eigenspaces, dimension 4 domain. In Example TIS we presented two invariant subspaces of ℂ⁴. There was some mystery about just how these were constructed, but we can now reveal that they are generalized eigenspaces. Example TIS featured T : ℂ⁴ → ℂ⁴ defined by T(x) = Ax with A given by

A matrix representation of T relative to the standard basis (Definition SUV) will equal A. So we can analyze A with the techniques of Chapter E. Doing so, we find two eigenvalues, λ = 1, −2, with multiplicities,

To apply Theorem GEK we subtract each eigenvalue from the diagonal entries of A, raise the result to the power dim(ℂ⁴) = 4, and compute a basis for the null space.

This is no accident. Notice that the dimension of each of these invariant
subspaces is equal to the algebraic multiplicity of the associated
eigenvalue. Not an accident either. (See the upcoming Theorem GESD.)
⊠

To find the generalized eigenspaces of S we need to subtract an eigenvalue from the diagonal elements of B, raise the result to the power dim(ℂ⁶) = 6, and compute the null space. Here are the results for the two eigenvalues of S,

You can check that this set is linearly independent (right now we have no guarantee this will happen). Once this is verified, we have a linearly independent set of size 6 inside a vector space of dimension 6, so by Theorem G, the set C is a basis for ℂ⁶. This is enough to apply Theorem DSFB and conclude that
This is no accident. Notice that the dimension of each of these invariant
subspaces is equal to the algebraic multiplicity of the associated
eigenvalue. Not an accident either. (See the upcoming Theorem GESD.)
⊠

Subsection RLT: Restrictions of Linear Transformations

Generalized eigenspaces will prove to be an important type of invariant subspace. A second reason for our interest in invariant subspaces is that they provide us with another method for creating new linear transformations from old ones.

Definition LTR: Linear Transformation Restriction. Suppose that T : V → V is a linear transformation, and U is an invariant subspace of V relative to T. Define the restriction of T to U by
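
    T|_U : U → U,    T|_U(u) = T(u)
△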

It might appear that this definition has not accomplished anything, as T|_U would appear to take on exactly the same values as T. And this is true. However, T|_U differs from T in the choice of domain and codomain. We tend to give little attention to the domain and codomain of functions, while their defining rules get the spotlight. But the restriction of a linear transformation is all about the choice of domain and codomain. We are restricting the rule of the function to a smaller subspace. Notice the importance of only using this construction with an invariant subspace, since otherwise we cannot be assured that the outputs of the function are even contained in the codomain. Maybe this observation should be the key step in the proof of a theorem saying that T|_U is also a linear transformation, but we won’t bother.

Example LTRGE: Linear transformation restriction on generalized eigenspace. In order to gain some experience with restrictions of linear transformations, we construct one and then also construct a matrix representation for the restriction. Furthermore, we will use a generalized eigenspace as the invariant subspace for the construction of the restriction.

One of the eigenvalues of A is λ = 2, with geometric multiplicity γ_T(2) = 1 and algebraic multiplicity α_T(2) = 3. We get the generalized eigenspace in the usual manner,

By Theorem GESIS, we know W is invariant relative to T, so we can employ Definition LTR to form the restriction, T|_W : W → W.

To better understand exactly what a restriction is (and isn’t), we’ll form a matrix representation of T|_W. This will also be a skill we will use in subsequent examples. For a basis of W we will use C = {w₁, w₂, w₃}. Notice that dim(W) = 3, so our matrix representation will be a square matrix of size 3. Applying Definition MR, we compute

It was necessary to verify that w ∈ G_T(2), and if we trust our work so far, then this output will also be an element of W, but it would be wise to check this anyway (using either of the methods we used for w). We’ll wait.

Now we will repeat this sample computation, but instead using the matrix representation of T|_W relative to C.

which matches the previous computation. Notice how the “action” of T|_W is accomplished by a 3 × 3 matrix multiplying a column vector of size 3. If you would like more practice with these sorts of computations, mimic the above using the other eigenvalue of T, which is λ = −2. The generalized eigenspace has dimension 2, so the matrix representation of the restriction to the generalized eigenspace will be a 2 × 2 matrix.
⊠
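
As an aside on mechanics: when T is given by a matrix A, the representation of a restriction can be assembled by applying A to each basis vector of the invariant subspace and solving for the coordinates relative to that same basis. A minimal SymPy sketch, with a hypothetical matrix and subspace (not those of the example above):

    from sympy import Matrix

    # Hypothetical matrix; W = <{e1, e2}> is an invariant subspace (here W = G(2))
    A = Matrix([[2, 1, 0],
                [0, 2, 0],
                [0, 0, 3]])
    C = [Matrix([1, 0, 0]), Matrix([0, 1, 0])]        # basis of W

    P = Matrix.hstack(*C)                             # basis vectors as columns
    cols = []
    for w in C:
        coords, params = P.gauss_jordan_solve(A * w)  # coordinates of A*w relative to C
        cols.append(coords)
    M = Matrix.hstack(*cols)                          # representation of T|_W relative to C
    print(M)                                          # Matrix([[2, 1], [0, 2]])

The solvability of each system is exactly the invariance of W; the 2 × 2 output is the restriction acting on coordinate vectors, just as in the example above.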

Suppose that T : V → V is a linear transformation and we can find a decomposition of V as a direct sum, say V = U₁ ⊕ U₂ ⊕ U₃ ⊕ ⋯ ⊕ U_m, where each U_i is an invariant subspace of V relative to T. Then, for any v ∈ V there is a unique decomposition v = u₁ + u₂ + u₃ + ⋯ + u_m with u_i ∈ U_i, 1 ≤ i ≤ m, and furthermore

So in a very real sense, we obtain a decomposition of the linear transformation T into the restrictions T|_{U_i}, 1 ≤ i ≤ m. If we wanted to be more careful, we could extend each restriction to a linear transformation defined on V by setting the output of T|_{U_i} to be the zero vector for inputs outside of U_i. Then T would be exactly equal to the sum (Definition LTA) of these extended restrictions. However, the irony of extending our restrictions is more than we could handle right now.

Our real interest is in the matrix representation of a linear transformation when the domain decomposes as a direct sum of invariant subspaces. Consider forming a basis B of V as the union of bases B_i from the individual U_i, i.e. B = ∪_{i=1}^{m} B_i. Now form the matrix representation of T relative to B. The result will be block diagonal, where each block is the matrix representation of a restriction T|_{U_i} relative to a basis B_i, M^{T|_{U_i}}_{B_i,B_i}. Though we did not have the definitions to describe it then, this is exactly what was going on in the latter portion of the proof of Theorem CFNLT. Two examples should help to clarify these ideas.
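
The block diagonal pattern is easy to see computationally: stack the union of the bases as the columns of a nonsingular matrix P, and then P⁻¹AP is the representation of T relative to B. A SymPy sketch with a hypothetical matrix, whose generalized eigenspaces supply the invariant subspaces:

    from sympy import Matrix, eye

    # Hypothetical matrix with eigenvalues 1 (algebraic multiplicity 2) and 2
    A = Matrix([[1, 1, 0],
                [0, 1, 0],
                [1, 0, 2]])
    n = A.rows

    # Bases of the generalized eigenspaces G(1) and G(2), via Theorem GEK
    B1 = ((A - 1 * eye(n)) ** n).nullspace()
    B2 = ((A - 2 * eye(n)) ** n).nullspace()

    P = Matrix.hstack(*(B1 + B2))      # union of the bases, as columns
    print(P.inv() * A * P)             # block diagonal: a 2x2 block, then the 1x1 block [2]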

The interesting feature of this representation is the two 2 × 2 blocks on the diagonal that arise from the decomposition of ℂ⁴ into a direct sum (of generalized eigenspaces). Or maybe the interesting feature of this matrix is the two 2 × 2 submatrices in the “other” corners that are all zero. You decide.
⊠

of ℂ⁶ where {v₁, v₂} is a basis of G_S(3) and {v₃, v₄, v₅, v₆} is a basis of G_S(−1). We can employ C in the construction of a matrix representation of S (Definition MR). Here are the computations,

As before, the key feature of this representation is the 2 × 2 and 4 × 4 blocks on the diagonal. We will discover in the final theorem of this section (Theorem RGEN) that we already understand these blocks fairly well. For now, we recognize them as arising from generalized eigenspaces and suspect that their sizes are equal to the algebraic multiplicities of the eigenvalues.
⊠

The paragraph prior to these last two examples is worth repeating. A basis
derived from a direct sum decomposition into invariant subspaces will provide a
matrix representation of a linear transformation with a block diagonal
form.

Diagonalizing a linear transformation is the most extreme example
of decomposing a vector space into invariant subspaces. When a linear
transformation is diagonalizable, then there is a basis composed of eigenvectors
(Theorem DC). Each of these basis vectors can be used individually as the lone
element of a spanning set for an invariant subspace (Theorem EIS). So
the domain decomposes into a direct sum of one-dimensional invariant
subspaces (Theorem DSFB). The corresponding matrix representation is then
block diagonal with all the blocks of size 1, i.e. the matrix is diagonal.
Section NLT, Section IS and Section JCF are all devoted to generalizing this
extreme situation when there are not enough eigenvectors available to
make such a complete decomposition and arrive at such an elegant matrix
representation.

One last theorem will roll up much of this section and Section NLT into one
nice, neat package.

The proof of Theorem RGEN indicates that the index of the nilpotent linear transformation is less than or equal to the dimension of V. In practice, it will be less than or equal to the dimension of the domain of the nilpotent linear transformation, which is G_T(λ). In any event, the exact value of this index will be of some interest, so we define it now. Notice that this is a property of the eigenvalue λ, similar to the algebraic and geometric multiplicities (Definition AME, Definition GME).

Definition IE: Index of an Eigenvalue. Suppose T : V → V is a linear transformation with eigenvalue λ. Then the index of λ, ι_T(λ), is the index of the nilpotent linear transformation T|_{G_T(λ)} − λI_{G_T(λ)}.
△
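
Computationally, the index is the smallest power k for which the k-th power of the nilpotent matrix is zero. A minimal SymPy sketch, with a hypothetical matrix representation N of a restriction to a generalized eigenspace:

    from sympy import Matrix, eye, zeros

    # Hypothetical representation of a restriction T|_G(2), eigenvalue lam = 2
    N = Matrix([[2, 1, 0],
                [0, 2, 1],
                [0, 0, 2]])
    lam = 2

    M = N - lam * eye(N.rows)          # nilpotent, by Theorem RGEN
    k = 1
    while (M ** k) != zeros(N.rows, N.cols):
        k += 1
    print(k)                           # the index: 3 for this N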

The generalized eigenspace, G_S(3), has dimension 2, while G_S(−1) has dimension 4. We’ll investigate each thoroughly in turn, with the intent being to illustrate Theorem RGEN. Many of our computations will be repeats of those done in Example ISMR6.

For U = G_S(3) we compute a matrix representation of S|_U using the basis found in Example GE6,

So N − (−1)I₄ is a nilpotent matrix of index 3 (meaning that S|_W − (−1)I_W is a nilpotent linear transformation of index 3) and according to Definition IE we say ι_S(−1) = 3.

Notice that if we were to take the union of the two bases of the generalized eigenspaces, we would have a basis for ℂ⁶. Then a matrix representation of S relative to this basis would be the same block diagonal matrix we found in Example ISMR6, only we now understand each of these blocks as being very close to being a nilpotent matrix.
⊠

Invariant subspaces, and restrictions of linear transformations, are topics you will see again and again if you continue with further study of linear algebra. Our reason for discussing them now is to arrive at a nice matrix representation of the restriction of a linear transformation to one of its generalized eigenspaces. Here’s the theorem.

Theorem MRRGE: Matrix Representation of a Restriction to a Generalized Eigenspace. Suppose that T : V → V is a linear transformation with eigenvalue λ. Then there is a basis of the generalized eigenspace G_T(λ) such that the restriction T|_{G_T(λ)} : G_T(λ) → G_T(λ) has a matrix representation that is block diagonal where each block is a Jordan block of the form J_n(λ).
□
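
For reference, a Jordan block (Definition JB) has the eigenvalue repeated on the diagonal and ones on the superdiagonal; for instance,

\[
J_3(\lambda) = \begin{bmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{bmatrix} = J_3(0) + \lambda I_3
\]

and the decomposition on the right is exactly the one exploited in the proof below.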

Proof. Theorem RGEN tells us that T|_{G_T(λ)} − λI_{G_T(λ)} is a nilpotent linear transformation. Theorem CFNLT tells us that a nilpotent linear transformation has a basis for its domain that yields a matrix representation that is block diagonal where the blocks are Jordan blocks of the form J_n(0). Let B be a basis of G_T(λ) that yields such a matrix representation for T|_{G_T(λ)} − λI_{G_T(λ)}.

The matrix representation of λI_{G_T(λ)} relative to the basis B is then simply the diagonal matrix λI_m, where m = dim(G_T(λ)). By Theorem MRSLT we have the rather unwieldy expression,

The first of these matrix representations has Jordan blocks with zero in every diagonal entry, while the second matrix representation has λ in every diagonal entry. The result of adding the two representations is to convert the Jordan blocks from the form J_n(0) to the form J_n(λ).
■

Of course, Theorem CFNLT provides some extra information on the sizes of
the Jordan blocks in a representation and we could carry over this information to
Theorem MRRGE, but will save that for a subsequent application of this
result.

Subsection EXC: Exercises

T10 Suppose that T : V → V is a linear transformation, and p(x) is a polynomial. Then define the new linear transformation p(T) : V → V by interpreting the coefficients of the terms of the polynomial as scalar multiples of linear transformations (Definition LTSM), addition of terms as the sum of linear transformations (Definition LTA), and powers as repeated composition of linear transformations (Definition LTC). Prove that T ∘ p(T) = p(T) ∘ T.

Use this observation to give a shorter argument for the proof of the invariance
of the generalized eigenspace in Theorem GESIS. Contributed by Robert Beezer
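
As a sanity check of the identity (not a substitute for the proof), one can evaluate a polynomial in a concrete matrix and compare the two compositions. A SymPy sketch with an arbitrary matrix and the arbitrary choice p(x) = x² + 3x + 5:

    from sympy import Matrix, eye

    A = Matrix([[1, 2], [3, 4]])       # arbitrary matrix standing in for T
    pA = A ** 2 + 3 * A + 5 * eye(2)   # p(T), realized as a matrix

    print(A * pA == pA * A)            # True: T commutes with p(T)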