Section EE Eigenvalues and Eigenvectors

Subsection EEM: Eigenvalues and Eigenvectors of a Matrix

Definition EEM Eigenvalues and Eigenvectors of a Matrix. Suppose that A is a square matrix of size n, x ≠ 0 is a vector in ℂ^n, and λ is a scalar in ℂ. Then we say x is an eigenvector of A with eigenvalue λ if

Ax = λx

△

Before going any further, perhaps we should convince you that such things
ever happen at all. Understand the next example, but do not concern yourself
with where the pieces come from. We will have methods soon enough to be able to
discover these eigenvectors ourselves.
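For instance, here is a minimal numerical check of Definition EEM in Python with NumPy, using a small matrix of our own choosing (a stand-in, not the matrix of the example that follows):

import numpy as np

# A 2x2 matrix with a known eigenpair: lambda = 2 and x = (1, 1).
A = np.array([[3.0, -1.0],
              [-1.0, 3.0]])
x = np.array([1.0, 1.0])
lam = 2.0

# The defining equation of Definition EEM: Ax = lambda x.
print(A @ x)                        # [2. 2.]
print(np.allclose(A @ x, lam * x))  # True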

The vectors z
and w are both
eigenvectors of A for
the same eigenvalue λ = 2,
yet this is not as simple as the two vectors just being scalar multiples of each
other (they aren’t). Look what happens when we add them together, to form
v = z + w, and
multiply by A,

so that v is also an
eigenvector of A for
the eigenvalue λ = 2.
So it would appear that the set of eigenvectors that are associated
with a fixed eigenvalue is closed under the vector space operations of
{ℂ}^{n}.
Hmmm.
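This closure is easy to experiment with numerically. Below is a sketch in the same spirit as z, w and v = z + w above, but with a hypothetical 3 × 3 matrix of our own, chosen so that the eigenvalue λ = 2 has a two-dimensional eigenspace:

import numpy as np

# Hypothetical matrix whose eigenvalue 2 admits eigenvectors that are
# not scalar multiples of one another.
A = np.array([[3.0, 1.0, 0.0],
              [1.0, 3.0, 0.0],
              [0.0, 0.0, 2.0]])
z = np.array([1.0, -1.0, 0.0])    # eigenvector for lambda = 2
w = np.array([0.0, 0.0, 1.0])     # another one, not a multiple of z
v = z + w

print(np.allclose(A @ z, 2 * z))  # True
print(np.allclose(A @ w, 2 * w))  # True
print(np.allclose(A @ v, 2 * v))  # True: the sum is again an eigenvector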

The vector y is an
eigenvector of A for the
eigenvalue λ = 0, so we can use
Theorem ZSSM to write Ay = 0y = 0.
But this also means that y ∈ N(A).
There would appear to be a connection here also.
⊠

Example SEE hints at a number of intriguing properties, and there are many
more. We will explore the general properties of eigenvalues and eigenvectors in
Section PEE, but in this section we will concern ourselves with the question of
actually computing eigenvalues and eigenvectors. First we need a bit of
background material on polynomials and matrices.

Subsection PM: Polynomials and Matrices

A polynomial is a combination of powers, multiplication by scalar coefficients,
and addition (with subtraction just being the inverse of addition). We never have
occasion to divide when computing the value of a polynomial. So it is with
matrices. We can add and subtract matrices, we can multiply matrices by scalars,
and we can form powers of square matrices by repeated applications of matrix
multiplication. We do not normally divide matrices (though sometimes
we can multiply by an inverse). If a matrix is square, all the operations
constituting a polynomial will preserve the size of the matrix. So it is natural
to consider evaluating a polynomial with a matrix, effectively replacing
the variable of the polynomial by a matrix. We’ll demonstrate with an
example,

Because D commutes
with itself (DD = DD), we
can use distributivity of matrix multiplication across matrix addition (Theorem MMDAA)
without being careful with any of the matrix products, and just as easily evaluate
p(D) using the
factored form of p(x),

This example is not meant to be too profound. It is meant to show you that it is
natural to evaluate a polynomial with a matrix, and that the factored form of the
polynomial is as good as (or maybe better than) the expanded form. And do
not forget that constant terms in polynomials are really multiples of the
identity matrix when we are evaluating the polynomial with a matrix.
⊠
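It is also easy to check the agreement of the expanded and factored forms with software. The sketch below (Python with NumPy) uses a hypothetical polynomial p(x) = 2 − 3x + x^2 = (x − 1)(x − 2) and a hypothetical matrix D, not those of the example above:

import numpy as np

D = np.array([[1.0, 2.0],
              [3.0, 4.0]])
I = np.eye(2)

# Expanded form: the constant term 2 becomes 2I.
p_expanded = D @ D - 3 * D + 2 * I

# Factored form: (D - 1I)(D - 2I); the order of the factors is
# irrelevant because D commutes with itself and with I.
p_factored = (D - 1 * I) @ (D - 2 * I)

print(np.allclose(p_expanded, p_factored))  # True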

Subsection EEE: Existence of Eigenvalues and Eigenvectors

Before we embark on computing eigenvalues and eigenvectors, we will prove
that every matrix has at least one eigenvalue (and an eigenvector to go with it).
Later, in Theorem MNEM, we will determine the maximum number of
eigenvalues a matrix may have.

The determinant (Definition D) will be a powerful tool in Subsection EE.CEE
when it comes time to compute eigenvalues. However, it is possible, with some
more advanced machinery, to compute eigenvalues without ever making use of the
determinant. Sheldon Axler does just that in his book, Linear Algebra Done Right. Here and now, we give Axler’s “determinant-free” proof that every matrix
has an eigenvalue. The result is not too startling, but the proof is most
enjoyable.

This is a set of n + 1 vectors from ℂ^n, so by Theorem MVSLD, S is linearly dependent. Let a_0, a_1, a_2, …, a_n be a collection of n + 1 scalars from ℂ, not all zero, that provide a relation of linear dependence on S. In other words,

Some of the a_i are nonzero. Suppose that just a_0 ≠ 0, and a_1 = a_2 = a_3 = ⋯ = a_n = 0. Then a_0 x = 0 and by Theorem SMEZV, either a_0 = 0 or x = 0, which are both contradictions. So a_i ≠ 0 for some i ≥ 1. Let m be the largest integer such that a_m ≠ 0. From this discussion we know that m ≥ 1. We can also assume that a_m = 1, for if not, replace each a_i by a_i/a_m to obtain scalars that serve equally well in providing a relation of linear dependence on S.

Because we have consistently used ℂ as our set of scalars (rather than ℝ), we know that we can factor p(x) into linear factors of the form (x − b_i), where b_i ∈ ℂ. So there are scalars b_1, b_2, b_3, …, b_m from ℂ so that,

Since z ≠ 0, this equation says that z is an eigenvector of A for the eigenvalue λ = b_k (Definition EEM), so we have shown that any square matrix A does have at least one eigenvalue. ■

The proof of Theorem EMHE is constructive (it contains an unambiguous
procedure that leads to an eigenvalue), but it is not meant to be practical. We will
illustrate the theorem with an example, the purpose being to provide a companion
for studying the proof and not to suggest this is the best procedure for computing
an eigenvalue.

Example CAEHW Computing an eigenvalue the hard way. This example illustrates the proof of Theorem EMHE, so will employ
the same notation as the proof — look there for full explanations. It is
not meant to be an example of a reasonable computational approach
to finding eigenvalues and eigenvectors. OK, warnings in place, here we
go.

It is important to notice that the choice of
x
could be anything, so long as it is not the zero vector. We have not chosen
x
totally at random, but so as to make our illustration of the theorem as
general as possible. You could replicate this example with your own choice
and the computations are guaranteed to be reasonable, provided you
have a computational tool that will factor a fifth degree polynomial for
you.

is guaranteed to be linearly dependent, as it has six vectors from ℂ^5 (Theorem MVSLD). We will search for a nontrivial relation of linear dependence by using row operations to solve a homogeneous system of equations whose coefficient matrix has the vectors of S as columns,

There are four free variables for describing solutions to this homogeneous system, so we have our pick of solutions. The most expedient choice would be to set x_3 = 1 and x_4 = x_5 = x_6 = 0. However, we will again opt to maximize the generality of our illustration of Theorem EMHE and choose x_3 = −8, x_4 = −3, x_5 = 1 and x_6 = 0. This leads to a solution with x_1 = 16 and x_2 = 12.

So we define p(x) = 16 + 12x − 8x^2 − 3x^3 + x^4, and as advertised in the proof of Theorem EMHE, we have a polynomial of degree m = 4 > 1 such that p(A)x = 0. Now we need to factor p(x) over ℂ. If you made your own choice of x at the start, this is where you might have a fifth degree polynomial, and where you might need to use a computational tool to find roots and factors. We have

is an eigenvector of A
for the eigenvalue λ = −2,
as you can check by doing the computation
Az.
If you work through this example with your own choice of the vector
x (strongly
recommended) then the eigenvalue you will find may be different, but will be in the
set {3, 0, 1, −1, −2}.
See Exercise EE.M60 for a suggested starting vector.
⊠
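For study purposes, the entire procedure of Theorem EMHE can be scripted. Here is a sketch in Python with SymPy, run on a hypothetical 2 × 2 matrix rather than the 5 × 5 matrix of this example:

from sympy import Matrix, symbols, roots

A = Matrix([[2, 1],
            [1, 2]])
x = Matrix([1, 0])    # any nonzero starting vector will do
n = A.shape[0]

# The set {x, Ax, A^2 x, ..., A^n x} has n + 1 vectors in C^n, so it
# is linearly dependent (Theorem MVSLD); collect them as columns.
S = Matrix.hstack(*[A**k * x for k in range(n + 1)])

# A nonzero null space vector of S supplies coefficients a_0, ..., a_m
# of a polynomial p with p(A)x = 0.
a = S.nullspace()[0]
t = symbols('t')
p = sum(a[k] * t**k for k in range(len(a)))

print(p.factor())   # (t - 1)*(t - 3)
print(roots(p, t))  # {1: 1, 3: 1}; here both roots are eigenvalues of A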

Subsection CEE: Computing Eigenvalues and Eigenvectors

Fortunately, we need not rely on the procedure of Theorem EMHE each time
we need an eigenvalue. It is the determinant, and specifically Theorem SMZD,
that provides the main tool for computing eigenvalues. Here is an informal
sequence of equivalences that is the key to determining the eigenvalues and
eigenvectors of a matrix,

So, for an eigenvalue λ and associated eigenvector x ≠ 0, the vector x will be a nonzero element of the null space of A − λI_n, while the matrix A − λI_n will be singular and therefore have zero determinant. These ideas are made
precise in Theorem EMRCP and Theorem EMNS, but for now this brief
discussion should suffice as motivation for the following definition and
example.

Definition CP Characteristic Polynomial. Suppose that A is a square matrix of size n. Then the characteristic polynomial of A is the polynomial p_A(x) defined by

p_A(x) = det(A − xI_n)
to be p_F(x) = −(x − 3)(x + 1)^2. Factored, we can find all of its roots easily; they are x = 3 and x = −1. By
Theorem EMRCP, λ = 3 and
λ = −1 are both eigenvalues
of F, and these are the
only eigenvalues of F.
We’ve found them all. ⊠
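If you would like to check a characteristic polynomial computed by hand, a computer algebra system will produce it directly from Definition CP. A sketch with a hypothetical 2 × 2 matrix (not the F of this example):

from sympy import Matrix, symbols, eye, factor

x = symbols('x')
A = Matrix([[4, 1],
            [2, 3]])

# Characteristic polynomial p_A(x) = det(A - x I_n), then factor it.
p = (A - x * eye(2)).det()
print(factor(p))  # (x - 2)*(x - 5), so the eigenvalues are 2 and 5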

Let us now turn our attention to the computation of eigenvectors.

Definition EM Eigenspace of a Matrix. Suppose that A is a square matrix and λ is an eigenvalue of A. Then the eigenspace of A for λ, ℰ_A(λ), is the set of all the eigenvectors of A for λ, together with the inclusion of the zero vector.
△

Example SEE hinted that the set of eigenvectors for a single eigenvalue might
have some closure properties, and with the addition of the non-eigenvector,
0, we
indeed get a whole subspace.

Theorem EMS Eigenspace for a Matrix is a Subspace. Suppose A is a square matrix of size n and λ is an eigenvalue of A. Then the eigenspace ℰ_A(λ) is a subspace of the vector space ℂ^n.
□

Proof We will check the three conditions of Theorem TSS.
First, Definition EM explicitly includes the zero vector in ℰ_A(λ), so the set is non-empty.

Suppose that x, y ∈ ℰ_A(λ), that is, x and y are two eigenvectors of A for λ. Then

Proof The conclusion of this theorem is an equality of sets, so
normally we would follow the advice of Definition SE. However, in
this case we can construct a sequence of equivalences which will
together provide the two subset inclusions we need. First, notice that 0 ∈ ℰ_A(λ) by Definition EM and 0 ∈ N(A − λI_n) by Theorem HSC. Now consider any nonzero vector x ∈ ℂ^n,

We will now take each eigenvalue in turn and compute its eigenspace. To do this, we row-reduce the matrix F − λI_3 in order to determine solutions to the homogeneous system ℒS(F − λI_3, 0) and then express the eigenspace as the null space of F − λI_3 (Theorem EMNS). Theorem BNS then tells us how to write the null space as the span of a basis.

Eigenspaces in hand, we can easily compute eigenvectors by forming nontrivial
linear combinations of the basis vectors describing each eigenspace. In particular,
notice that we can “pretty up” our basis vectors by using scalar multiples to clear
out fractions. More powerful scientific calculators, and most every mathematical
software package, will compute eigenvalues of a matrix along with basis
vectors of the eigenspaces. Be sure to understand how your device outputs
complex numbers, since they are likely to occur. Also, the basis vectors will
not necessarily look like the results of an application of Theorem BNS.
Duplicating the results of the next section (Subsection EE.ECEE) with
your device would be very good practice. See: Computation E.SAGE ⊠
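As one concrete illustration of the preceding paragraph, here is a sketch in Python with SymPy, using a hypothetical matrix that has λ = 2 as an eigenvalue:

from sympy import Matrix, eye

A = Matrix([[3, 1, 0],
            [1, 3, 0],
            [0, 0, 2]])

# The eigenspace E_A(2) is the null space of A - 2I (Theorem EMNS).
print((A - 2 * eye(3)).nullspace())

# The built-in routine returns (eigenvalue, algebraic multiplicity,
# eigenspace basis) triples; its basis vectors may be scaled
# differently than a hand computation via Theorem BNS.
print(A.eigenvects())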

Subsection ECEE: Examples of Computing Eigenvalues and Eigenvectors

No theorems in this section, just a selection of examples meant to illustrate
the range of possibilities for the eigenvalues and eigenvectors of a matrix. These
examples can all be done by hand, though the computation of the characteristic
polynomial would be very time-consuming and error-prone. It can also be difficult
to factor an arbitrary polynomial, though if we were to suggest that most of our
eigenvalues are going to be integers, then it can be easier to hunt for
roots. These examples are meant to look similar to a concatenation of
Example CPMS3, Example EMS3 and Example ESMS3. First, we will sneak in
a pair of definitions so we can illustrate them throughout this sequence of
examples.

Definition AME Algebraic Multiplicity of an Eigenvalue. Suppose that A is a square matrix and λ is an eigenvalue of A. Then the algebraic multiplicity of λ, α_A(λ), is the highest power of (x − λ) that divides the characteristic polynomial, p_A(x).

Since an eigenvalue λ is a root of the characteristic polynomial, there is always a factor of (x − λ), and the algebraic multiplicity is just the power of this factor in a factorization of p_A(x). So in particular, α_A(λ) ≥ 1.
Compare the definition of algebraic multiplicity with the next definition.

Definition GME Geometric Multiplicity of an Eigenvalue. Suppose that A is a square matrix and λ is an eigenvalue of A. Then the geometric multiplicity of λ, γ_A(λ), is the dimension of the eigenspace ℰ_A(λ).
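Both multiplicities are mechanical to compute with software. The sketch below (Python with SymPy) uses a hypothetical 2 × 2 matrix whose lone eigenvalue λ = 2 has algebraic multiplicity 2 but geometric multiplicity 1, previewing the discrepancies discussed in the examples that follow:

from sympy import Matrix, symbols, eye, factor_list

x = symbols('x')
A = Matrix([[2, 1],
            [0, 2]])

# Algebraic multiplicity: the power of (x - 2) in p_A(x).
p = (A - x * eye(2)).det()
print(factor_list(p))  # (1, [(x - 2, 2)]): alpha_A(2) = 2

# Geometric multiplicity: the dimension of the eigenspace E_A(2).
gamma = len((A - 2 * eye(2)).nullspace())
print(gamma)           # 1: gamma_A(2) = 1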

So each eigenspace has dimension 1 and so γ_B(1) = 1 and γ_B(2) = 1. This
example is of interest because of the discrepancy between the two multiplicities for
λ = 2. In many of our
examples the algebraic and geometric multiplicities will be equal for all of the eigenvalues
(as it was for λ = 1
in this example), so keep this example in mind. We will have some
explanations for this phenomenon later (see Example NDMS4).
⊠

So the eigenspace dimensions yield geometric multiplicities γ_C(3) = 1, γ_C(1) = 2 and γ_C(−1) = 1, the same as for the algebraic multiplicities. This example is of interest because C is a symmetric matrix, and will be the subject of Theorem HMRE.
⊠

So the eigenspace dimensions yield geometric multiplicities γ_E(2) = 2 and γ_E(−1) = 1. This example is
of interest because λ = 2
has such a large algebraic multiplicity, which is also not equal to its geometric
multiplicity. ⊠

So the eigenspace dimensions yield geometric multiplicities γ_F(2) = 1, γ_F(−1) = 1, γ_F(2 + i) = 1 and γ_F(2 − i) = 1. This
example demonstrates some of the possibilities for the appearance of complex eigenvalues,
even when all the entries of the matrix are real. Notice how all the numbers in the
analysis of λ = 2 − i
are conjugates of the corresponding number in the analysis of
λ = 2 + i.
This is the content of the upcoming Theorem ERMCP.
⊠

C19 Find the eigenvalues, eigenspaces, algebraic multiplicities and geometric
multiplicities for the matrix below. It is possible to do all these computations by
hand, and it would be instructive to do so.

C20 Find the eigenvalues, eigenspaces, algebraic multiplicities and geometric
multiplicities for the matrix below. It is possible to do all these computations by
hand, and it would be instructive to do so.

T10 A matrix A is idempotent if A^2 = A. Show that the only possible eigenvalues of an idempotent matrix are λ = 0 and λ = 1. Then give an example of a matrix that is idempotent and has both of these two values as eigenvalues.
Contributed by Robert Beezer. Solution [1285]

T15 The characteristic polynomial of the square matrix A is usually defined as r_A(x) = det(xI_n − A). Find a specific relationship between our characteristic polynomial, p_A(x), and r_A(x), give a proof of your relationship, and use this to explain why Theorem EMRCP can remain essentially unchanged with either definition. Explain the advantages of each definition over the other. (Computing with both definitions, for a 2 × 2 and a 3 × 3 matrix, might be a good way to start.)
Contributed by Robert Beezer. Solution [1287]

T20 Suppose that λ and ρ are two different eigenvalues of the square matrix A. Prove that the intersection of the eigenspaces for these two eigenvalues is trivial. That is, ℰ_A(λ) ∩ ℰ_A(ρ) = {0}.
Contributed by Robert Beezer. Solution [1288]

C21 Contributed by Robert Beezer. Statement [1273]
If λ = 2 is an eigenvalue of A, the matrix A − 2I_4 will be singular, and its null space will be the eigenspace of A. So we form this matrix and row-reduce,

With two free variables, we know a basis of the null space (Theorem BNS) will contain two vectors. Thus the null space of A − 2I_4 has dimension two, and so the eigenspace of λ = 2 has dimension two also (Theorem EMNS), γ_A(2) = 2.

where the factorization can be obtained by finding the roots of p_B(x) = 0 with the quadratic equation. By Theorem EMRCP the eigenvalues of B are the complex numbers λ_1 = (3 + √3 i)/2 and λ_2 = (3 − √3 i)/2.

C26 Contributed by Chris Black. Statement [1274]
Since we are given that the characteristic polynomial of A is p_A(x) = (4 − x)(1 − x)^2, we see that the eigenvalues are λ = 4 with algebraic multiplicity α_A(4) = 1 and λ = 1 with algebraic multiplicity α_A(1) = 2. The corresponding eigenspaces are

C27 Contributed by Chris Black. Statement [1274]
Since we are given that the characteristic polynomial of A is p_A(x) = (x + 2)(x − 2)^2(x − 4), we see that the eigenvalues are λ = −2, λ = 2 and λ = 4. The eigenspaces are

The simplest possible relation of linear dependence on the columns of C comes from using scalars α_4 = 1 and α_5 = α_6 = 0 for the free variables in a solution to ℒS(C, 0). The remainder of this solution is α_1 = 3, α_2 = −1, α_3 = −3. This solution gives rise to the polynomial

p(x) = 3 − x − 3x^2 + x^3 = (x − 3)(x − 1)(x + 1)

which then has the property that p(A)x = 0.

No matter how you choose to order the factors of
p(x), the
value of k
(in the language of Theorem EMHE and Example CAEHW) is
k = 2. For
each of the three possibilities, we list the resulting eigenvector and the associated
eigenvalue:

Since x is an eigenvector, it is nonzero, and Theorem SMEZV leaves us with the conclusion that λ^2 − λ = 0, and the solutions to this quadratic polynomial equation in λ are λ = 0 and λ = 1.

The matrix

[ 1 0 ]
[ 0 0 ]

is idempotent (check this!) and since it is a diagonal matrix, its eigenvalues are the
diagonal entries, λ = 0
and λ = 1,
so each of these possible values for an eigenvalue of an idempotent matrix actually
occurs as an eigenvalue of some idempotent matrix. So we cannot state any
stronger conclusion about the eigenvalues of an idempotent matrix, and we can
say that this theorem is the “best possible.”
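For readers with software handy, a quick numerical confirmation of this example (a sketch assuming NumPy):

import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 0.0]])

print(np.allclose(A @ A, A))  # True: A is idempotent
print(np.linalg.eigvals(A))   # eigenvalues 1 and 0, as T10 requires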

T15 Contributed by Robert Beezer. Statement [1275]
Note in the following that the scalar multiple of a matrix is equivalent to multiplying each of the rows by that scalar, so we actually apply Theorem DRCM multiple times below (and are passing up an opportunity to do a proof by induction in the process, which maybe you’d like to do yourself?).

Since the polynomials are scalar multiples of each other, their roots will be
identical, so either polynomial could be used in Theorem EMRCP.

Computing by hand, our definition of the characteristic polynomial is easier to use, as you only need to subtract x down the diagonal of the matrix before computing the determinant. However, the price to be paid is that for odd values of n, the coefficient of x^n is −1, while r_A(x) always has the coefficient 1 for x^n (we say r_A(x) is “monic”).
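The relationship in question, r_A(x) = (−1)^n p_A(x), is easy to spot-check with SymPy; the sketch below uses a hypothetical 3 × 3 matrix (n odd, so the sign flips):

from sympy import Matrix, symbols, eye, expand

x = symbols('x')
A = Matrix([[1, 2, 0],
            [0, 3, 1],
            [1, 0, 1]])
n = A.shape[0]

p = (A - x * eye(n)).det()      # our definition, p_A(x)
r = (x * eye(n) - A).det()      # the usual definition, r_A(x)
print(expand(r - (-1)**n * p))  # 0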

To show that ℰ_A(λ) ∩ ℰ_A(ρ) ⊆ {0}, suppose that x ∈ ℰ_A(λ) ∩ ℰ_A(ρ). Then x is an eigenvector of A for both λ and ρ (Definition SI) and so