7 Gaussian Elimination and LU Factorization

Transcription

1 7 Gaussian Elimination and LU Factorization In this final section on matrix factorization methods for solving Ax = b we want to take a closer look at Gaussian elimination (probably the best known method for solving systems of linear equations) The basic idea is to use left-multiplication of A C m m by (elementary) lower triangular matrices, L, L,, L m to convert A to upper triangular form, ie, L m L m L L } {{ } A = U = e L Note that the product of lower triangular matrices is a lower triangular matrix, and the inverse of a lower triangular matrix is also lower triangular Therefore, LA = U A = LU, where L = L This approach can be viewed as triangular triangularization 7 Why Would We Want to Do This? Consider the system Ax = b with LU factorization A = LU Then we have L }{{} Ux = b =y Therefore we can perform (a now familiar) -step solution procedure: Solve the lower triangular system Ly = b for y by forward substitution Solve the upper triangular system Ux = y for x by back substitution Moreover, consider the problem AX = B (ie, many different right-hand sides that are associated with the same system matrix) In this case we need to compute the factorization A = LU only once, and then and we proceed as before: AX = B LUX = B, Solve LY = B by many forward substitutions (in parallel) Solve UX = Y by many back substitutions (in parallel) In order to appreciate the usefulness of this approach note that the operations count for the matrix factorization is O( 3 m3 ), while that for forward and back substitution is O(m ) Example Take the matrix A =

2 and compute its LU factorization by applying elementary lower triangular transformation matrices We choose L such that left-multiplication corresponds to subtracting multiples of row from the rows below such that the entries in the first column of A are zeroed out (cf the first homework assignment) Thus L A = = Next, we repeat this operation analogously for L (in order to zero what is left in column of the matrix on the right-hand side above): 0 0 L (L A) = = 0 3 = U Now L = (L L ) = L so that L = L with L = and L = Remark Note that L always is a unit lower triangular matrix, ie, it has ones on the diagonal Moreover, L is always obtained as above, ie, the multipliers are accumulated into the lower triangular part with a change of sign The claims made above can be verified as follows First, we note that the multipliers in L k are of the form l jk = a jk, j = k +,, m, a kk so that Now, let L k = l k+,k l m,k l k = 0 0 l k+,k 60 l m,k,

4 end L(j, k) = U(j, k)/u(k, k) U(j, k : m) = U(j, k : m) L(j, k)u(k, k : m) end Remark In practice one can actually store both L and U in the original matrix A since it is known that the diagonal of L consists of all ones The LU factorization is the cheapest factorization algorithm Its operations count can be verified to be O( 3 m3 ) However, LU factorization cannot be guaranteed to be stable The following examples illustrate this fact Example A fundamental problem is given if we encounter a zero pivot as in A = 5 = L A = Now the (,) position contains a zero and the algorithm will break down since it will attempt to divide by zero Example A more subtle example is the following backward instability Take A = + ε 5 with small ε If ε = then we have the initial example in this chapter, and for ε = 0 we get the previous example LU factorization will result in and The multipliers were L A = L L A = L = 0 ε ε ε ε Now we assume that a right-hand side b is given as b = 0 0 and we attempt to solve Ax = b via 6 = U

5 Solve Ly = b Solve Ux = y If ε is on the order of machine accuracy, then the 4 in the entry 4 6 ε insignificant Therefore, we have Ũ = 0 ε 3 and L = L, ε in U is which leads to LŨ = + ε A In fact, the product is significantly different from A Thus, using L and Ũ we are not able to solve a nearby problem, and thus the LU factorization method is not backward stable If we use the factorization based on L and Ũ with the above right-hand side b, then we obtain 3 ε x = 3 ε 3 3 Whereas if we were to use the exact factorization A = LU, then we get the exact answer 7 3 x = 3 3 4ε 7 ε 3 ε 3 ε ε 3 Remark Even though L and Ũ are close to L and U, the product LŨ is not close to LU = A and the computed solution x is worthless 7 Pivoting Example The breakdown of the algorithm in our earlier example with L A = can be prevented by simply swapping rows, ie, instead of trying to apply L to L A we first create 0 0 P L A = 0 0 L A = and are done 63

6 More generally, stability problems can be avoided by swapping rows before applying L k, ie, we perform L m P m L P L P A = U The strategy we use for swapping rows in step k is to find the largest element in column k below (and including) the diagonal the so-called pivot element and swap its row with row k This process is referred to as partial (row) pivoting Partial column pivoting and complete (row and column) pivoting are also possible, but not very popular Example Consider again the matrix A = + ε 5 The largest element in the first column is the 4 in the (3, ) position This is our first pivot, and we swap rows and 3 Therefore 0 0 P A = ε 5 = + ε 5, 0 0 and then L P A = ε 5 = 0 ε 0 Now we need to pick the second pivot element For sufficiently small ε (in fact, unless < ε < 3 ), we pick ε as the largest element in the second column below the first row Therefore, the second permutation matrix is just the identity, and we have 0 0 P L P A = ε = 0 ε To complete the elimination phase, we need to perform the elimination in the second column: 0 0 L P L P A = ε = 0 ε = U 0 (ε ) 0 3 ε 0 0 (ε ) The lower triangular matrix L is given by L = L L = and assuming that ε we get 0 0 L = 0 and Ũ = 4 64 (ε ),

7 If we now check the computed factorization LŨ, then we see LŨ = 5 = P Ã, which is just a permuted version of the original matrix A with permutation matrix 0 0 P = P P = Thus, this approach was backward stable Finally, since we have the factorization P A = LU, we can solve the linear system Ax = b as P Ax = P b LUx = P b, and apply the usual two-step procedure Solve the lower triangular system Ly = P b for y Solve the upper triangular system Ux = y for x This yields x = 7+4ε 3+ε ε 3 ε ε 3 If we use the rounded factors L and Ũ instead, then the computed solution is 7 3 x = 3, 3 which is the exact answer to the problem (see also the Maple worksheet 473 LUmws) In general, LU factorization with pivoting results in P A = LU, where P = P m P m P P, and L = (L m L m L L ) with L k = P m P k+ L k P k+ P m, ie, L k is the same as L k except that the entries below the diagonal are appropriately permuted In particular, L k is still lower triangular Remark Since the permutation matrices used here involve only a single row swap each we have P k = P k (while in general, of course, P = P T ) In the example above L = L, and L = P L P = L since P = I 65

8 Remark Due to the pivoting strategy the multipliers will always satisfy l ij A possible interpretation of the pivoting strategy is that the matrix P is determined so that it would yield a permuted matrix A whose standard LU factorization is backward stable Of course, we do not know how to do this in advance, and so the P is determined as the algorithm progresses An algorithm for the factorization of an m m matrix A is given by Algorithm (LU Factorization with Partial Pivoting) Initialize U = A, L = I, P = I for k = : m end find i k to maximize U(i, k) U(k, k : m) U(i, k : m) L(k, : k ) L(i, : k ) P (k, :) P (i, :) for j = k + : m end L(j, k) = U(j, k)/u(k, k) U(j, k : m) = U(j, k : m) L(j, k)u(k, k : m) The operations count for this algorithm is also O( 3 m ) However, while the swaps for partial pivoting require O(m ) operations, they would require O(m 3 ) operations in the case of complete pivoting Remark The algorithm above is not really practical since one would usually not physically swap rows Instead one would use pointers to the swapped rows and store the permutation operations instead 73 Stability We saw earlier that Gaussian elimination without pivoting is can be unstable According to our previous example the algorithm with pivoting seems to be stable What can be proven theoretically? Since the entries of L are at most in absolute value, the LU factorization becomes unstable if the entries of U are unbounded relative to those of A (we need L U = O( P A ) Therefore we define a growth factor One can show ρ = max i,j U(i, j) max i,j A(i, j) Theorem 7 Let A C m m Then LU factorization with partial pivoting guarantees that ρ m 66

9 This bound is unacceptably high, and indicates that the algorithm (with pivoting) can be unstable The following (contrived) example illustrates this Example Take A = Then LU factorization produces the following sequence of matrices: and we see that the largest element in U is 6 = m = U, Remark BUT in practice it turns out that matrices which such large growth factors almost never arise For most practical cases a realistic bound on ρ seems to be m This is still an active area of research and some more details can be found in the [Trefethen/Bau] book In summary we can say that for most practical purposes LU factorization with partial pivoting is considered to be a stable algorithm 74 Cholesky Factorization In the case when the matrix A is Hermitian (or symmetric) positive definite we can devise a faster (and more stable) algorithm Recall that A C m m is Hermitian if A = A This implies x Ay = y Ax for any x, y C m (see homework for details) If we let y = x then we see that x Ax is real Now, if x Ax > 0 for any x 0 then A is called Hermitian positive definite Remark Symmetric positive definite matrices are defined analogously with replaced by T 74 Some Properties of Positive Definite Matrices Assume A C m m is positive definite If X C m n has full rank, then X AX C n n is positive definite 67

10 Any principal submatrix (ie, the intersection of a set of rows and the corresponding columns) of A is positive definite 3 All diagonal entries of A are positive and real 4 The maximum element of A is on the diagonal 5 All eigenvalues of A are positive 74 How Does Cholesky Factorization Work? Consider the Hermitian positive definite A C m m in block form [ ] w A = w K and apply the first step of the LU factorization algorithm Then [ ] [ ] 0 T w A = w I 0 K ww The main idea of the Cholesky factorization algorithms is to take advantage of the fact that A is Hermitian and perform operations symmetrically, ie, also zero the first row Thus [ ] [ ] [ ] 0 T 0 T w A = w I 0 K ww 0 I Note that the three matrices are lower triangular, block-diagonal, and upper triangular (in fact the adjoint of the lower triangular factor) Now we continue iteratively However, in general A will not have a in the upper left-hand corner We will have to be able to deal with an arbitrary (albeit positive) entry in the (,) position Therefore we reconsider the first step with slightly more general notation, ie, [ a w A = ] w K so that A = [ a 0 T a w I ] [ 0 T 0 K a ww ] [ a a w 0 I ] = R A R Now we can iterate on the inner matrix A Note that the (,) entry of K ww /a has to be positive since A was positive definite and R is nonsingular This guarantees that A = R AR is positive definite and therefore has positive diagonal entries The iteration yields A = RA R so that and eventually A = R R A R R, A = RR Rm R } {{ } m R R, } {{ } =R =R 68

12 Remark The simplest (and cheapest) way to test whether a matrix is positive definite is to run Cholesky factorization it If the algorithm works it is, if not, it isn t The solution of Ax = b with Hermitian positive definite A is given by: Compute Cholesky factorization A = R R Solve the lower triangular system R y = b for y 3 Solve the upper triangular system Rx = y for x 70

SOLVING LINEAR SYSTEMS Linear systems Ax = b occur widely in applied mathematics They occur as direct formulations of real world problems; but more often, they occur as a part of the numerical analysis

Chapter 7 Factorization Theorems This chapter highlights a few of the many factorization theorems for matrices While some factorization results are relatively direct, others are iterative While some factorization

MATRIX ALGEBRA AND SYSTEMS OF EQUATIONS 1. SYSTEMS OF EQUATIONS AND MATRICES 1.1. Representation of a linear system. The general system of m equations in n unknowns can be written a 11 x 1 + a 12 x 2 +

LU Decompositions We seek a factorization of a square matrix A into the product of two matrices which yields an efficient method for solving the system where A is the coefficient matrix, x is our variable

.5 Elementary Matrices and a Method for Finding the Inverse Definition A n n matrix is called an elementary matrix if it can be obtained from I n by performing a single elementary row operation Reminder:

, Continued and The of a Matrix Calculus III Summer 2013, Session II Monday, July 15, 2013 Agenda 1. The rank of a matrix 2. The inverse of a square matrix Gaussian Gaussian solves a linear system by reducing

lementary Matrices and The LU Factorization Definition: ny matrix obtained by performing a single elementary row operation (RO) on the identity (unit) matrix is called an elementary matrix. There are three

Section Notes 3 The Simplex Algorithm Applied Math 121 Week of February 14, 2011 Goals for the week understand how to get from an LP to a simplex tableau. be familiar with reduced costs, optimal solutions,

ENGG2012B Advanced Engineering Mathematics Notes on Determinant Lecturer: Kenneth Shum Lecture 9-18/02/2013 The determinant of a system of linear equations determines whether the solution is unique, without

Solving LPs: The Simplex Algorithm of George Dantzig. Simplex Pivoting: Dictionary Format We illustrate a general solution procedure, called the simplex algorithm, by implementing it on a very simple example.

MAT 171 8.5 Solving Linear Systems Using Matrices and Row Operations A. Introduction to Matrices Identifying the Size and Entries of a Matrix B. The Augmented Matrix of a System of Equations Forming Augmented

LECTURE 5 Solving Systems of Linear Equations Recall that we introduced the notion of matrices as a way of standardizing the expression of systems of linear equations In today s lecture I shall show how

Lecture 6 Inverse of Matrix Recall that any linear system can be written as a matrix equation In one dimension case, ie, A is 1 1, then can be easily solved as A x b Ax b x b A 1 A b A 1 b provided that

55 CHAPTER NUMERICAL METHODS. POWER METHOD FOR APPROXIMATING EIGENVALUES In Chapter 7 we saw that the eigenvalues of an n n matrix A are obtained by solving its characteristic equation n c n n c n n...

Solving Linear Systems of Equations Gerald Recktenwald Portland State University Mechanical Engineering Department gerry@me.pdx.edu These slides are a supplement to the book Numerical Methods with Matlab:

Determinants Dr. Doreen De Leon Math 52, Fall 205 Determinant of a Matrix Elementary Matrices We will first discuss matrices that can be used to produce an elementary row operation on a given matrix A.

Linear Programming Linear programming refers to problems stated as maximization or minimization of a linear function subject to constraints that are linear equalities and inequalities. Although the study

MATH 23: SYSTEMS OF LINEAR EQUATIONS Systems of Linear Equations In the plane R 2 the general form of the equation of a line is ax + by = c and that the general equation of a plane in R 3 will be we call

Appendix A Introduction to Matrix Algebra I Today we will begin the course with a discussion of matrix algebra. Why are we studying this? We will use matrix algebra to derive the linear regression model

A FIRST COURSE IN LINEAR ALGEBRA An Open Text by Ken Kuttler Systems of Linear Equations Lecture Notes by Karen Seyffarth Adapted by LYRYX SERVICE COURSE SOLUTION Attribution-NonCommercial-ShareAlike (CC

NUMERICALLY EFFICIENT METHODS FOR SOLVING LEAST SQUARES PROBLEMS DO Q LEE Abstract. Computing the solution to Least Squares Problems is of great importance in a wide range of fields ranging from numerical

3 MATH FACTS 0 3 MATH FACTS 3. Vectors 3.. Definition We use the overhead arrow to denote a column vector, i.e., a linear segment with a direction. For example, in three-space, we write a vector in terms

Numerical Analysis Lecture Notes Peter J. Olver 4. Gaussian Elimination In this part, our focus will be on the most basic method for solving linear algebraic systems, known as Gaussian Elimination in honor

4. MATRICES 170 4. Matrices 4.1. Definitions. Definition 4.1.1. A matrix is a rectangular array of numbers. A matrix with m rows and n columns is said to have dimension m n and may be represented as follows:

Matrices define matrix We will use matrices to help us solve systems of equations. A matrix is a rectangular array of numbers enclosed in parentheses or brackets. In linear algebra, matrices are important

2.1: Determinants by Cofactor Expansion Math 214 Chapter 2 Notes and Homework Determinants The minor M ij of the entry a ij is the determinant of the submatrix obtained from deleting the i th row and the

Gaussian Elimination We list the basic steps of Gaussian Elimination, a method to solve a system of linear equations. Except for certain special cases, Gaussian Elimination is still state of the art. After

Chapter 8 Matrices II: inverses We have learnt how to add subtract and multiply matrices but we have not defined division. The reason is that in general it cannot always be defined. In this chapter, we

1. LINEAR EQUATIONS A linear equation in n unknowns x 1, x 2,, x n is an equation of the form a 1 x 1 + a 2 x 2 + + a n x n = b, where a 1, a 2,..., a n, b are given real numbers. For example, with x and

Similar matrices and Jordan form We ve nearly covered the entire heart of linear algebra once we ve finished singular value decompositions we ll have seen all the most central topics. A T A is positive

Harvey Mudd College Math Tutorial: Solving Systems of Linear Equations; Row Reduction Systems of linear equations arise in all sorts of applications in many different fields of study The method reviewed

Lecture 10: Invertible matrices. Finding the inverse of a matrix Danny W. Crytser April 11, 2014 Today s lecture Today we will Today s lecture Today we will 1 Single out a class of especially nice matrices

Solving Systems of Linear Equations There are two basic methods we will use to solve systems of linear equations: Substitution Elimination We will describe each for a system of two equations in two unknowns,

Lecture 1: Schur s Unitary Triangularization Theorem This lecture introduces the notion of unitary equivalence and presents Schur s theorem and some of its consequences It roughly corresponds to Sections

1.. SOLVING A SYSTEM OF LINEAR EQUATIONS 1. Solving a System of Linear Equations 1..1 Simple Systems - Basic De nitions As noticed above, the general form of a linear system of m equations in n variables

The Inverse of a Matrix 7.4 Introduction In number arithmetic every number a ( 0) has a reciprocal b written as a or such that a ba = ab =. Some, but not all, square matrices have inverses. If a square

Chapter 7 Eigenvalues and Eigenvectors In this last chapter of our exploration of Linear Algebra we will revisit eigenvalues and eigenvectors of matrices, concepts that were already introduced in Geometry

1 VECTOR SPACES AND SUBSPACES What is a vector? Many are familiar with the concept of a vector as: Something which has magnitude and direction. an ordered pair or triple. a description for quantities such

LECTURE I. Inverse matrices We return now to the problem of solving linear equations. Recall that we are trying to find such that A = y. Recall: there is a matri I such that for all R n. It follows that

MATRICES WITH DISPLACEMENT STRUCTURE A SURVEY PLAMEN KOEV Abstract In the following survey we look at structured matrices with what is referred to as low displacement rank Matrices like Cauchy Vandermonde

Helpsheet Giblin Eunson Library ATRIX ALGEBRA Use this sheet to help you: Understand the basic concepts and definitions of matrix algebra Express a set of linear equations in matrix notation Evaluate determinants