4.8 Ill-conditioned Matrices

Introduction

In attempting to solve the system Mx = b, we must
be able to account for errors in both the matrix M and
the vector b. Unfortunately, the error propagation is not linear:
a 10% relative error in either M or b does not translate
to a 10% relative error in the solution. We will show how the error
propagation involves a property of the matrix called the condition
number defined using the norms of the matrix.

Please note that |x|, ||v||, and |||M||| represent
the absolute value of a real number, the norm of a vector, and the norm
of a matrix, respectively. Numerous texts use ||M|| to represent
the norm of a matrix; however, the author uses three bars for clarity.

References

Theory

In some cases, the solution to a system of linear equations
Mx = b may be very sensitive to small changes in either
the matrix M or the vector b—a relatively change
in either can result in a significant change in the solution x.
This can cause problems in engineering as any physical system is made
of components which are not exact: a silver-banded 120 Ω resistor
could have an actual resistance anywhere on the range [108 Ω, 132 Ω].

A Well Conditioned Example

In order to motivate this discussion, we will look at two matrices: the
first is well conditioned—small changes in either M or b
result in correspondingly small changes in x.

Consider this first matrix M:

To determine the behaviour of this matrix, we will look at the image
of the unit circle, as is shown in Figure 1.

Figure 1. The image of the unit circle under M.

Solving Mx = b means that we want to find that point x
which maps to b, as is shown in Figure 2 which shows the
pre-image of b = (1, 1)T. The solution to this system
of linear equations is approximately
the point x = (0.1765, 0.2353)T.

Figure 2. The pre-image of the point b = (1, 1)T.

Suppose the vector b is shifted slightly. Figure 3 shows a circle
of points around b which have a 10% relative error when compared
to b. The pre-image of each of these points forms an ellipse around
the point x.

Figure 3. The pre-image of the point b = (1, 1)T.

For this matrix, a small error in b translates to a small error
in x, but we have more information: the length of the major axes
of the ellipse yeilds the maximum error while the length of the minor
axes yeilds also the minimum error. Suppose, however, that there is an error in the
matrix M. Consider this perturbed matrix:

Unlike 2-dimensional vectors, it is not possible to represent all matrices
which have a particular relative error. The above matrix has a relative
error of approximately 11.9%. If you look at Figure 4, you will see
the image of the unit circle under this perturbed matrix.

Figure 4. The image of the unit circle under a perturbed M.

The pre-image of b = (1, 1)T under the perturbed
M is close to the original, too. As we can see in Figure 5,
the pre-image is x = (0.1493, 0.2239)T which has
a relative error of only 11% as compared to x = (0.1764, 0.2353)T.

Figure 5. The image of the unit circle under a perturbed M.

Again, a small change in M results in a small change in x.

Figure 6 demonstrates how the images of the unit circle under both
M and the perturbed M are close.

Figure 6. The images of the unit circle under M and a perturbed M.

An Ill-conditioned Example

The
second is ill conditioned—small changes in either M or b
result in very significant changes in x.

Consider this second matrix M:

To determine the behaviour of this second matrix, we will look at the image
of the unit circle, as is shown in Figure 7.

Figure 7. The image of the unit circle under M.

This is very different from the image shown in Figure 1: the unit circle
is stretched into a ellipse which almost approximates a straight line.
Recall that a singular matrix would map the unit circle into either a
line or a point; however, the determinant of this matrix M is
1—the same determinant as the identity matrix. (What is critical
here is that the determinant cannot be used to determine the
conditioning of a matrix.)

Solving Mx = b means that we want to find that point x
which maps to b, as is shown in Figure 2 which shows the
pre-image of b = (1, 1)T. The solution to this system
of linear equations is the point x = (1, -1)T.

Figure 8. The pre-image of the point b = (1, 1)T.

Suppose the vector b is shifted slightly. Figure 9 shows a circle
of points around b which have a 10% relative error when compared
to b. The pre-image of each of these points forms an
elongated ellipse around
the point x.

Figure 9. The pre-image of the point b = (1, 1)T.

What is significantly different here is that a small error could translate
into a relative error in x which is almost 100%. This is not
the worst case, either—there are other vectors where a 10% relative
error in b can translate into a relative error in
x almost as large as 120%.

For this matrix, a small error in b translates to a significantly
larger error in x. Suppose, similarly, that there is an error in the
matrix M. Consider this perturbed matrix:

The relative error of this perturbed matrix is only 4.2% compared to the
relative error of the previous perturbed matrix (11.9%). Thus, it is,
in a relative sense, a smaller perturbation.

If you look at Figure 10, you will see
the image of the unit circle under this perturbed matrix.

Figure 10. The image of the unit circle under a perturbed M.

The pre-image of b = (1, 1)T under the perturbed
M is significatly different: Figure 11 shows the
pre-image is x = (-1.735, 2.233)T which has
a relative error close to 301% as compared to x = (1, -1)T.

Figure 11. The image of the unit circle under a perturbed M.

Again, a small change in M results in a massive change in x.

Figure 12 demonstrates how the images of the unit circle under both
M and the perturbed M are significantly different as
compared to the previous well conditioned example.

Figure 12. The images of the unit circle under M and a perturbed M.

The Condition Number

To quantify this effect, it is necessary to consider the image of the
unit circle. In the first case (Figure 1), most vectors are stretched
by a similar amount: the most a vector is streched (|||M|||2)
is 5.390 is very close to the least amount that a vector is stretched,
approximately 3.154. This second value may be described
as 1/|||M-1|||2. These two values in Figure 1 are approximately 5.390 and 3.154, respectively. In Figure 7, however, these
two values are signifantly different: 12.08 versus 0.08276. The
ratio between these two values is defiend to be the condition
number of a matrix, and in the first case, it is
cond2(M) = 5.390/3.154 = 1.709 and in the second
cond2(M) = 12.08/0.08276 = 146.0.

Results

If Δb is the error in b then
||Δb||2/||b||2 is the
relative error in b and with this value and the condition
number cond2(M) = |||M|||2|||M-1|||2, we may bound the relative error of x:

As the reader may have noticed, we are focusing on the 2-norm; however,
these bounds are equally valid with both the 1- and ∞-norms so long
as the same norm is used in all cases. The appropriate condition numbers
are given as cond1(M) and cond∞(M),
respectively.

The condition number of a non-invertable matrix is defined as
∞.

HOWTO

Given a system of linear equations Mx = b, the
condition number of M, cond(M) = |||M|||·|||M-1|||, will give bounds on the
error propagation: The relative error of the solution
x is bounded above by the relative error of b multiplied by the condition number:

At the cost of introducing a more complex formula, we can also
find a bound which accounts for errors in the matrix M.
From Bernstein, we have the formula:

Error Analysis

Unlike most sections on error analysis, as this entire section already
deals with error analysis, the author will devote this section to a different
type of error propagation.

The formula above which relates the relative error of both
b and M relate to the relative error of the solution
is some times misinterpreted as the simpler

This statement is incorrect as the following counter example
demonstrates:

which relates the relative error of the erroneous result x + Δx
to the relative error of M.

Examples

1. The condition number of the matrix [5 1; -2 10] has
a condition number of 2. When b = (0.0054784, 0.9009365)T,
x = (-0.016272, 0.086839)T while a perturbed
b = (0.095572, 0.901546)T has a relative error of 10%
and the corresponding solution
x = (0.0010417, 0.0903629)T has a corresponding relative error of almost 20%.

2. If M is a digaonal matrix, then the condition number
is the ratio of the largest absolute diagonal entry over
the smallest absolute diagonal entry. Therefore, the condition
number of M = [5 0; 0 4] is 1.25.

As a consequence, scaling a matrix or multiplying it by a rotation matrix should not affect the condition number.

Questions

1. What is the condition number of the matrix [5 0; 0 2]?

2. What is the condition number of a rotation matrix
[cos(θ) -sin(θ); sin(θ) cos(θ)]?

3. Given the matrix M = [2 3; 1 2] and given

>> inverse( M )
ans =
2 -3
-1 2

what are the 1- and ∞-condition numbers of the matrix? (Both are 25)

4. What are the 1-, 2-, and ∞-condition numbers of the elementary
matrices Ei,j (swap rows i and j)
and Ei; c (multiply row i by c); and what are
the 1- and ∞-condition numbers of the elementary matrix
Ei,j; c (add c times row j onto row i)?

Applications to Engineering

Each circuit element, be it a resistor, capacitor, inductor, or transistor,
has some amount of error associated with it. If each entry of a
conductance matrix G is off by 10%, the relative
error of the |||ΔG|||/|||G|||| could be as large
as 10%. With dense matrices and random errors, this bound appears
less likely to be achieved; however, for diagonally dominant matrices,
perturbations on the diagonal entries will directly affect the
relative error.

Similarly, errors in any currents will similarly affect the result.
Thus, having a conductance matrix which has as small a condition
number as possible will minimize the effects of these errors.

Matlab

The condition number of a matrix M is found in Matlab with
the cond(M) command which uses the 2 norm by default. The condition number using the 1- or
&infty;-norm may be found using cond(M, 1) or cond(M, Inf),
respectively.

Matlab also provides a function condest(M) which provides
an approximation to the condition number using the 1-norm.