Inverses and Elementary Matrices

Matrix inversion gives a method for solving some systems of
equations. Suppose

$$a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n = b_1,$$
$$a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n = b_2,$$
$$\vdots$$
$$a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n = b_n,$$

is a system of n linear equations in n variables. Write

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}, \quad
x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \quad
b = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}.$$

The system can then be written in matrix form:

$$A x = b.$$

(One reason for using matrix notation is that it saves writing!) If A
has an inverse $A^{-1}$, I can multiply both sides by $A^{-1}$:

$$A^{-1} A x = A^{-1} b, \quad\text{so}\quad x = A^{-1} b.$$

I've solved for the vector x of unknowns.
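The computation $x = A^{-1} b$ can be sketched numerically. A minimal NumPy illustration, with a made-up invertible $2 \times 2$ system:

```python
import numpy as np

# A made-up invertible system A x = b for illustration.
A = np.array([[2.0, 1.0],
              [5.0, 3.0]])
b = np.array([1.0, 2.0])

A_inv = np.linalg.inv(A)  # only works when A is invertible
x = A_inv @ b             # x = A^(-1) b

# Substituting back, A x should reproduce b.
residual = A @ x - b
```

(In numerical practice `np.linalg.solve(A, b)` is preferred over forming the inverse explicitly, but the formula above mirrors the derivation.)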

Since not every matrix has an inverse, it's important to know:

When a matrix has an inverse.

How to find the inverse, if there is one.

I'll discuss these questions in this section.

Definition. An elementary
matrix is a matrix which represents an elementary row operation.
("Represents" means that multiplying on the left by the
elementary matrix performs the row operation.)

In the pictures below, the elements that are not shown are the same
as those in the identity matrix.

$$\begin{pmatrix} \ddots & & & & \\ & 0 & \cdots & 1 & \\ & \vdots & & \vdots & \\ & 1 & \cdots & 0 & \\ & & & & \ddots \end{pmatrix}
\quad\text{(1's in positions $(i, j)$ and $(j, i)$, 0's in positions $(i, i)$ and $(j, j)$)}$$

interchanges rows i and j.

$$\begin{pmatrix} \ddots & & \\ & a & \\ & & \ddots \end{pmatrix}
\quad\text{($a$ in position $(i, i)$)}$$

multiplies row i by a.

$$\begin{pmatrix} \ddots & & & \\ & 1 & \cdots & a & \\ & & \ddots & \vdots & \\ & & & 1 & \\ & & & & \ddots \end{pmatrix}
\quad\text{($a$ in position $(i, j)$)}$$

replaces row i with row i plus a times row j.

Their inverses are the elementary matrices built the same way, but
with $\dfrac{1}{a}$ in place of $a$ in the second case and $-a$ in
place of $a$ in the third case (the first matrix is its own inverse).
Multiplication by the first matrix swaps rows i and j.
Multiplication by the second matrix divides row i by a.
Multiplication by the third matrix subtracts a times row j from row
i. These operations are the inverses of the operations implemented by
the original matrices.
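These facts are easy to experiment with. A NumPy sketch (the size $3 \times 3$ and the constants 5 and 4 are arbitrary choices for illustration):

```python
import numpy as np

I = np.eye(3)

# Elementary matrix swapping rows 0 and 1.
E_swap = I.copy()
E_swap[[0, 1]] = E_swap[[1, 0]]

# Elementary matrix multiplying row 2 by a = 5.
E_scale = I.copy()
E_scale[2, 2] = 5.0

# Elementary matrix replacing row 1 with row 1 + 4 * row 0.
E_add = I.copy()
E_add[1, 0] = 4.0

M = np.arange(9.0).reshape(3, 3)
swapped = E_swap @ M   # multiplying on the left performs the row operation
```

Inverting each matrix with `np.linalg.inv` recovers exactly the opposite operation: the swap is its own inverse, the scale becomes $\frac{1}{5}$, and the addition becomes $-4$.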

Example. Multiplying on the left by

$$\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

swaps rows 1 and 2 of a matrix with 3 rows. The inverse

$$\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

is the same matrix, since swapping rows 1 and 2 twice restores them.

Multiplying on the left by

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

multiplies row 2 by 7. The inverse

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{7} & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

divides row 2 by 7, undoing the original operation.

Multiplying on the left by

$$\begin{pmatrix} 1 & 0 & 5 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

adds 5 times row 3 to row 1. The inverse

$$\begin{pmatrix} 1 & 0 & -5 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

subtracts 5 times row 3 from row 1.

Definition. Matrices A and B are row equivalent if A can be transformed to B by a
finite sequence of elementary row operations.

Remark. Since row operations may be
implemented by multiplying by elementary matrices, A and B are row
equivalent if and only if there are elementary matrices $E_1$,
..., $E_k$ such that

$$B = E_k E_{k-1} \cdots E_1 A.$$

Lemma. Row equivalence is an equivalence relation.

Proof. I have to show three things:

(a) (Reflexivity) Every matrix is row equivalent to itself.

(b) (Symmetry) If A row reduces to B, then B row reduces to A.

(c) (Transitivity) If A row reduces to B and B row reduces to C, then A
row reduces to C.

(a) is obvious, since I can row reduce a matrix to itself by
performing the identity row operation.

For (b), suppose A row reduces to B. Write

$$B = E_k \cdots E_1 A,$$

where $E_1$, ..., $E_k$ are elementary matrices. Then

$$A = E_1^{-1} \cdots E_k^{-1} B.$$

Since the inverse of an elementary matrix is an elementary matrix, it
follows that B row reduces to A.

It remains to prove (c). If A row reduces to B and B row reduces to
C, then there are elementary matrices $E_1$, ..., $E_k$,
$F_1$, ..., $F_m$ such that

$$B = E_k \cdots E_1 A \quad\text{and}\quad C = F_m \cdots F_1 B.$$

Then

$$C = F_m \cdots F_1 E_k \cdots E_1 A,$$

so A row reduces to C.

Therefore, row equivalence is an equivalence relation.

Definition. An $n \times n$ matrix A is invertible if there is an $n \times n$ matrix B such
that $A B = B A = I$, where I is the $n \times n$ identity matrix.

Notation. If A is a square matrix, then

$$A^n = \underbrace{A \cdot A \cdots A}_{n \text{ factors}} \quad\text{for } n > 0, \qquad A^0 = I, \qquad A^{-n} = \left(A^{-1}\right)^n \quad\text{for } n > 0$$

(where $A^{-n}$ for $n > 0$ only makes sense if A is invertible).

The usual rules for powers hold:

(a) $A^m A^n = A^{m+n}$.

(b) $\left(A^m\right)^n = A^{mn}$.
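The power rules can be checked numerically. A sketch with an arbitrarily chosen invertible matrix (NumPy's `matrix_power` handles negative exponents by inverting first):

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])  # invertible, so negative powers make sense

mp = np.linalg.matrix_power
lhs = mp(A, 3) @ mp(A, -2)   # A^3 A^(-2)
rhs = mp(A, 3 + (-2))        # A^1: the rule A^m A^n = A^(m+n)
```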

Example. Let

$$A = \begin{pmatrix} 2 & 5 \\ 1 & 3 \end{pmatrix}.$$

Compute $A^{-1}$ and $A^{-2}$.

Using the formula for the inverse of a $2 \times 2$ matrix,

$$A^{-1} = \frac{1}{(2)(3) - (5)(1)} \begin{pmatrix} 3 & -5 \\ -1 & 2 \end{pmatrix} = \begin{pmatrix} 3 & -5 \\ -1 & 2 \end{pmatrix}.$$

Therefore,

$$A^{-2} = \left(A^{-1}\right)^2 = \begin{pmatrix} 3 & -5 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} 3 & -5 \\ -1 & 2 \end{pmatrix} = \begin{pmatrix} 14 & -25 \\ -5 & 9 \end{pmatrix}.$$

Proposition.

(a) If A and B are invertible $n \times n$ matrices, then $A B$ is invertible and $(A B)^{-1} = B^{-1} A^{-1}$.

(b) If A is invertible, then $\left(A^{-1}\right)^{-1} = A$.

Proof. (a) The inverse of $A B$ is
the thing which, when multiplied by $A B$, gives the identity I.
Now

$$(A B)\left(B^{-1} A^{-1}\right) = A \left(B B^{-1}\right) A^{-1} = A I A^{-1} = A A^{-1} = I,$$

and similarly $\left(B^{-1} A^{-1}\right)(A B) = I$.

Since $B^{-1} A^{-1}$ gives the identity when multiplied by $A B$,
$B^{-1} A^{-1}$ must be the inverse of $A B$ --- that is, $(A B)^{-1} = B^{-1} A^{-1}$.

(b) The inverse of $A^{-1}$ is the thing which, when
multiplied by $A^{-1}$, gives the identity I. Now

$$A^{-1} A = I = A A^{-1}.$$

Since A gives the identity when multiplied by $A^{-1}$,
A must be the inverse of $A^{-1}$ --- that is, $\left(A^{-1}\right)^{-1} = A$.
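A quick numerical spot check of both parts, with arbitrarily chosen invertible matrices:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 1.0]])
B = np.array([[1.0, 3.0], [0.0, 1.0]])

inv = np.linalg.inv
lhs = inv(A @ B)        # (AB)^(-1)
rhs = inv(B) @ inv(A)   # B^(-1) A^(-1) -- note the reversed order
double_inv = inv(inv(A))  # should recover A
```

Note that the order matters: `inv(A) @ inv(B)` gives a different matrix here, because matrix multiplication is not commutative.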

Remark. Look over the proofs of the two
parts of the last proposition and be sure you understand why the
computations proved the things that were to be proved. The
idea is that the inverse of a matrix is defined by a
property, not by appearance. By analogy, it is like
the difference between the set of mathematicians (a set defined by a
property) and the set of people with purple hair (a set
defined by appearance).

A matrix B is the inverse of a matrix A if it has the
property that multiplying B by A (in both orders) gives the
identity I. So to check whether a matrix B really is the
inverse of A, you multiply B by A (in both orders) and see whether
you get I.

Example. Solve the following matrix equation
for X, assuming that A and B are invertible:

$$A X B = C.$$

Multiply both sides on the left by $A^{-1}$ and on the right by $B^{-1}$:

$$A^{-1} (A X B) B^{-1} = A^{-1} C B^{-1}, \quad\text{so}\quad X = A^{-1} C B^{-1}.$$

Notice that I can multiply both sides of a matrix equation by the
same thing, but I must multiply on the same side of both
sides. The reason I have to be careful is that in general, $M N \ne N M$ --- matrix multiplication is not commutative.

Example. Note that $(A + B)^{-1} \ne A^{-1} + B^{-1}$ in general. In fact, if A and B are invertible,
$A + B$ need not be invertible. For example, if

$$A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix},$$

then A and B are invertible --- each is its own inverse.

But

$$A + B = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},$$

which is not invertible.
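This non-rule is easy to confirm numerically. A sketch using the identity matrix and its negative, each of which is its own inverse:

```python
import numpy as np

A = np.eye(2)
B = -np.eye(2)

# Each matrix is its own inverse...
A_check = A @ A
B_check = B @ B

# ...but their sum is the zero matrix, whose determinant is 0,
# so A + B has no inverse.
det_sum = np.linalg.det(A + B)
```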

Theorem. Let A be an $n \times n$ matrix. The following are equivalent:

(a) A is row equivalent to I.

(b) A is a product of elementary matrices.

(c) A is invertible.

(d) The system

$$A x = 0$$

has only the trivial solution $x = 0$.

(e) For any n-dimensional vector b, the system

$$A x = b$$

has a unique solution.

Proof. When you are trying to prove several
statements are equivalent, you must prove that if you assume
any one of the statements, you can prove any of the others. I can do
this here by proving that (a) implies (b), (b) implies (c), (c)
implies (d), (d) implies (e), and (e) implies (a).

(a) $\Rightarrow$ (b): Let $E_1, \ldots, E_k$ be elementary matrices which
row reduce A to I:

$$E_k \cdots E_1 A = I.$$

Then

$$A = E_1^{-1} \cdots E_k^{-1}.$$

Since the inverse of an elementary matrix is an elementary matrix, A
is a product of elementary matrices.

(b) $\Rightarrow$ (c): Write A as a product of elementary matrices:

$$A = E_1 E_2 \cdots E_k.$$

Now each elementary matrix is invertible, and (by the earlier
proposition, applied repeatedly) a product of invertible matrices is
invertible.

Hence, A is invertible, with

$$A^{-1} = E_k^{-1} \cdots E_2^{-1} E_1^{-1}.$$

(c) $\Rightarrow$ (d): Suppose A is invertible. The system $A x = 0$
has at least one solution, namely $x = 0$.

Moreover, if y is any other solution, then

$$y = I y = \left(A^{-1} A\right) y = A^{-1} (A y) = A^{-1} \cdot 0 = 0.$$

That is, 0 is the one and only solution to the system.

(d) $\Rightarrow$ (e): Suppose the only solution to $A x = 0$
is $x = 0$. This means that row reducing the
augmented matrix

$$\left(\begin{array}{c|c} A & 0 \end{array}\right) \quad\text{produces}\quad \left(\begin{array}{c|c} I & 0 \end{array}\right).$$

Ignoring the last column (which never changes), this means there is a
sequence of row operations $E_1$, ..., $E_k$ which reduces A to the
identity I --- that is, A is row equivalent to I. (I've actually
proved (d) $\Rightarrow$ (a) at this point.)

Let $b$ be an arbitrary
n-dimensional vector. Since $E_k \cdots E_1 A = I$ and the $E$'s are
invertible, A is invertible with $A^{-1} = E_k \cdots E_1$. Then

$$A \left(E_k \cdots E_1 b\right) = A A^{-1} b = b.$$

Thus, $x = E_k \cdots E_1 b$ is a solution.

Suppose y is another solution to $A x = b$. Then

$$A\left(E_k \cdots E_1 b - y\right) = A E_k \cdots E_1 b - A y = b - b = 0.$$

Therefore, $E_k \cdots E_1 b - y$ is a solution to $A x = 0$. But the only solution to
$A x = 0$ is 0, so $E_k \cdots E_1 b - y = 0$, or $y = E_k \cdots E_1 b$. Thus, $x = E_k \cdots E_1 b$ is the unique solution to $A x = b$.

(e) $\Rightarrow$ (a): Suppose $A x = b$ has a unique solution for every b.
As a special case, $A x = 0$ has a unique solution (namely
$x = 0$). But arguing as I did in (d) $\Rightarrow$ (e), I can show
that A row reduces to I, and that is (a).

Example. (Writing an
invertible matrix as a product of elementary matrices) If A is
invertible, the theorem implies that A can be written as a product of
elementary matrices. To do this, row reduce A to the identity,
keeping track of the row operations you're using. Write each row
operation as an elementary matrix, and express the row reduction as a
matrix multiplication. Finally, solve the resulting equation for A.

For example, suppose

$$A = \begin{pmatrix} 1 & 2 \\ 3 & 8 \end{pmatrix}.$$

Row reduce A to I:

$$\begin{pmatrix} 1 & 2 \\ 3 & 8 \end{pmatrix}
\xrightarrow{r_2 \to r_2 - 3 r_1}
\begin{pmatrix} 1 & 2 \\ 0 & 2 \end{pmatrix}
\xrightarrow{r_2 \to \frac{1}{2} r_2}
\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}
\xrightarrow{r_1 \to r_1 - 2 r_2}
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

Represent each row operation as an elementary matrix:

$$E_1 = \begin{pmatrix} 1 & 0 \\ -3 & 1 \end{pmatrix}, \qquad
E_2 = \begin{pmatrix} 1 & 0 \\ 0 & \frac{1}{2} \end{pmatrix}, \qquad
E_3 = \begin{pmatrix} 1 & -2 \\ 0 & 1 \end{pmatrix}.$$

Write the row reduction as a matrix multiplication. A must be
multiplied on the left by the elementary matrices in the order in
which the operations were performed:

$$E_3 E_2 E_1 A = I.$$

Now solve for A, being careful to get the inverses in the right
order:

$$A = E_1^{-1} E_2^{-1} E_3^{-1}.$$

Finally, write each inverse as an elementary matrix:

$$A = \begin{pmatrix} 1 & 0 \\ 3 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}
\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}.$$
Corollary. If A and B are $n \times n$ matrices and $A B = I$, then $A = B^{-1}$ and $B = A^{-1}$.

Proof. Suppose A and B are $n \times n$ matrices and $A B = I$. The system

$$B x = 0$$

certainly has $x = 0$ as a solution. I'll show it's the only
solution.

Suppose y is another solution, so

$$B y = 0.$$

Multiply both sides by A and simplify:

$$A B y = A \cdot 0, \quad I y = 0, \quad y = 0.$$

Thus, 0 is a solution, and it's the only solution.

Thus, B satisfies condition (d) of the Theorem. Since the five
conditions are equivalent, B also satisfies condition (c), so B is
invertible. Let $B^{-1}$ be the inverse of B. Then

$$A B = I, \quad A B B^{-1} = I B^{-1}, \quad A = B^{-1}.$$

This proves the first part of the Corollary. Finally,

$$B A = B B^{-1} = I, \quad\text{so}\quad B = A^{-1}.$$

This finishes the proof.

Remark. This result shows that if you're
checking that two square matrices A and B are inverses by multiplying
to get the identity, you only need to check that $A B = I$ --- $B A = I$ is then automatic.

Algorithm. The proof provides an algorithm
for inverting a matrix A.

If $E_1, \ldots, E_k$ are elementary matrices which row reduce A
to I, then

$$E_k \cdots E_1 A = I.$$

Then

$$A^{-1} = E_k \cdots E_1 = E_k \cdots E_1 I.$$

That is, the row operations which reduce A to the identity also
transform the identity into $A^{-1}$.

Example. To invert

$$A = \begin{pmatrix} 1 & 2 \\ 3 & 8 \end{pmatrix},$$

form the augmented matrix

$$\left(\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 3 & 8 & 0 & 1 \end{array}\right).$$

Next, row reduce the augmented matrix. The row operations are
entirely determined by the block on the left, which is the original
matrix. The row operations turn the left block into the identity,
while simultaneously turning the identity on the right into the
inverse:

$$\left(\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 3 & 8 & 0 & 1 \end{array}\right)
\xrightarrow{r_2 \to r_2 - 3 r_1}
\left(\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 0 & 2 & -3 & 1 \end{array}\right)
\xrightarrow{r_2 \to \frac{1}{2} r_2}
\left(\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 0 & 1 & -\frac{3}{2} & \frac{1}{2} \end{array}\right)
\xrightarrow{r_1 \to r_1 - 2 r_2}
\left(\begin{array}{cc|cc} 1 & 0 & 4 & -1 \\ 0 & 1 & -\frac{3}{2} & \frac{1}{2} \end{array}\right)$$

Thus,

$$A^{-1} = \begin{pmatrix} 4 & -1 \\ -\frac{3}{2} & \frac{1}{2} \end{pmatrix}.$$
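The procedure is mechanical enough to code directly. A sketch of Gauss-Jordan inversion over the reals (the function name is ours, and partial pivoting is added for numerical stability):

```python
import numpy as np

def invert(A):
    """Row reduce the augmented matrix (A | I) to (I | A^(-1))."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])            # form the augmented matrix
    for i in range(n):
        p = i + np.argmax(np.abs(M[i:, i]))  # pick the largest pivot in column i
        if M[p, i] == 0.0:
            raise ValueError("matrix is not invertible")
        M[[i, p]] = M[[p, i]]                # swap rows i and p
        M[i] /= M[i, i]                      # scale row i so the pivot is 1
        for r in range(n):
            if r != i:
                M[r] -= M[r, i] * M[i]       # clear the rest of column i
    return M[:, n:]                          # right block is now A^(-1)
```

For instance, `invert([[1, 2], [3, 8]])` returns (up to floating-point rounding) the matrix with rows `[4, -1]` and `[-1.5, 0.5]`.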

Example. (Inverting a
matrix over $\mathbb{Z}_3$) Find the inverse of the following
matrix over $\mathbb{Z}_3$:

$$A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}.$$

Form the augmented matrix and row reduce, doing all arithmetic mod 3.
"Dividing" row 1 by 2 means multiplying it by 2, since $2 \cdot 2 = 4 = 1$ in $\mathbb{Z}_3$:

$$\left(\begin{array}{cc|cc} 2 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{array}\right)
\xrightarrow{r_1 \to 2 r_1}
\left(\begin{array}{cc|cc} 1 & 2 & 2 & 0 \\ 1 & 1 & 0 & 1 \end{array}\right)
\xrightarrow{r_2 \to r_2 - r_1}
\left(\begin{array}{cc|cc} 1 & 2 & 2 & 0 \\ 0 & 2 & 1 & 1 \end{array}\right)
\xrightarrow{r_2 \to 2 r_2}
\left(\begin{array}{cc|cc} 1 & 2 & 2 & 0 \\ 0 & 1 & 2 & 2 \end{array}\right)
\xrightarrow{r_1 \to r_1 - 2 r_2}
\left(\begin{array}{cc|cc} 1 & 0 & 1 & 2 \\ 0 & 1 & 2 & 2 \end{array}\right)$$

Therefore,

$$A^{-1} = \begin{pmatrix} 1 & 2 \\ 2 & 2 \end{pmatrix}.$$
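The same row reduction works over $\mathbb{Z}_p$, with division replaced by multiplication by the inverse mod p. A Python sketch (the helper name is ours; `pow(a, -1, p)` computes inverses mod p in Python 3.8+):

```python
def invert_mod_p(A, p):
    """Invert a square matrix over Z_p by reducing (A | I) to (I | A^(-1))."""
    n = len(A)
    # Augmented matrix (A | I), with all entries reduced mod p.
    M = [[A[r][c] % p for c in range(n)] + [int(r == c) for c in range(n)]
         for r in range(n)]
    for i in range(n):
        piv = next((r for r in range(i, n) if M[r][i] != 0), None)
        if piv is None:
            raise ValueError("matrix is not invertible over Z_p")
        M[i], M[piv] = M[piv], M[i]           # bring a nonzero pivot into row i
        inv = pow(M[i][i], -1, p)             # multiplicative inverse mod p
        M[i] = [x * inv % p for x in M[i]]    # make the pivot 1
        for r in range(n):
            if r != i:
                f = M[r][i]
                M[r] = [(x - f * y) % p for x, y in zip(M[r], M[i])]
    return [row[n:] for row in M]
```

For example, `invert_mod_p([[2, 1], [1, 1]], 3)` gives `[[1, 2], [2, 2]]`.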

Proposition. Let F be a field, and let $A x = b$ be a system of linear equations over F. Then:

(a) If F is infinite, then the system has either no solutions, exactly
one solution, or infinitely many solutions.

(b) If F is a finite field with $p^n$ elements, where p is prime and
$n \geq 1$, then the system has either no solutions, exactly
one solution, or at least $p^n$ solutions.

Proof. Suppose the system has more than one
solution. I must show that there are infinitely many solutions if F
is infinite, or at least $p^n$ solutions if F is a finite field
with $p^n$ elements.

Suppose then that there is more than one solution. Let
$u$ and $v$ be distinct solutions to $A x = b$, so

$$A u = b \quad\text{and}\quad A v = b.$$

Note that

$$A(u - v) = A u - A v = b - b = 0.$$

Since $u \ne v$, $u - v$ is a nontrivial solution to
the system $A x = 0$. Now if $t \in F$,

$$A(u + t(u - v)) = A u + t A(u - v) = b + t \cdot 0 = b.$$

Thus, $u + t(u - v)$ is a solution to $A x = b$. Moreover, the
only way two solutions of the form $u + t(u - v)$ can be
the same is if they have the same t. For

$$u + t_1(u - v) = u + t_2(u - v) \quad\text{gives}\quad (t_1 - t_2)(u - v) = 0.$$

Now $t_1 - t_2 \ne 0$ implies $u - v = 0$ (multiply both sides by
$(t_1 - t_2)^{-1}$, which exists since F is a field), a contradiction. Therefore, $t_1 - t_2 = 0$, so $t_1 = t_2$.

Thus, different t's give different $u + t(u - v)$'s,
each of which is a solution to the system.

If F has infinitely many elements, there are infinitely many
possibilities for t, so there are infinitely many solutions.

If F has $p^n$ elements, there are $p^n$ possibilities for t,
so there are at least $p^n$ solutions. (Note that there may be
solutions which are not of the form $u + t(u - v)$, so
there may be more than $p^n$ solutions. In fact, I'll be able to show
later that the number of solutions will be some power of
$p$.)

Example. Since $\mathbb{R}$ is an infinite
field, a system of linear equations over $\mathbb{R}$ has no solutions,
exactly one solution, or infinitely many solutions.

Since $\mathbb{Z}_3$ is a field with 3 elements, a system of linear
equations over $\mathbb{Z}_3$ has no solutions, exactly one solution, or
at least 3 solutions. (And I'll see later that if there's more than
one solution, then there might be 3 solutions, 9 solutions, 27
solutions, ....)
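The solution counts over a small finite field can be confirmed by brute force. A sketch enumerating a made-up system over $\mathbb{Z}_3$ whose second equation is a multiple of the first, so that its solution is not unique:

```python
from itertools import product

p = 3

# Made-up system over Z_3:  x + y = 1  and  2x + 2y = 2
# (the second equation is twice the first, so solutions are not unique).
def is_solution(x, y):
    return (x + y) % p == 1 and (2 * x + 2 * y) % p == 2

solutions = [(x, y) for x, y in product(range(p), repeat=2) if is_solution(x, y)]
```

Enumerating all $3^2 = 9$ candidate pairs finds exactly $3 = p$ solutions, consistent with the proposition's "at least $p^n$ solutions."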