Gaussian Elimination

Let's say we have a system of equations,

and we want to solve for $x$, $y$, and $z$.
Well, one way to do this is with Gaussian Elimination, which you may have encountered before in a math class or two.

The first step is to transform the system of equations into a matrix by using the coefficients in front of each variable, where each row corresponds to one equation and each column corresponds to an independent variable like $x$, $y$, or $z$.
For the previous system of equations, this might look like this:

Or more simply:

At first, translating the set of equations into a matrix like this doesn't seem to help with anything, so let's think of this in another way.
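To make the translation concrete, here is a small worked illustration. The coefficients are assumed purely for this sketch; any system of three equations in three unknowns is handled the same way. Each row of the augmented matrix holds one equation's coefficients, and the column after the bar holds the right-hand sides:

```latex
\begin{aligned}
2x + 3y + 4z &= 6 \\
 x + 2y + 3z &= 4 \\
3x - 4y      &= 10
\end{aligned}
\qquad\Longrightarrow\qquad
\left[
\begin{array}{ccc|c}
2 & 3 & 4 & 6 \\
1 & 2 & 3 & 4 \\
3 & -4 & 0 & 10
\end{array}
\right]
```

Note that the missing $z$ term in the third equation becomes an explicit 0 in the matrix.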

Row Echelon Form

Instead of the complicated mess of equations shown above, imagine if the system looked like this:

Then we could just solve for $z$ and plug that value into the top two equations to solve for $y$ and $x$ through a process known as back-substitution.
In matrix form, this set of equations would look like this:

This matrix form has a particular name: Row Echelon Form.
Basically, any matrix can be considered in row echelon form if the leading coefficient or pivot (the first non-zero element in every row when reading from left to right) is strictly to the right of the pivot of the row above it, with any all-zero rows at the bottom.
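The definition above can be turned into a small check. The following is a minimal sketch, not part of the chapter's own code: the names `Matrix`, `pivotColumn`, and `isRowEchelon` are hypothetical, and the exact comparison against zero is an assumption that only suits hand-written matrices (floating-point results would need a tolerance).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // hypothetical alias for this sketch

// Returns the column of the first non-zero entry in `row`,
// or row.size() if the row is entirely zero.
std::size_t pivotColumn(const std::vector<double>& row) {
  for (std::size_t c = 0; c < row.size(); ++c) {
    if (row[c] != 0.0) return c;
  }
  return row.size();
}

// True if every pivot lies strictly to the right of the pivot in the row
// above it, and any all-zero rows sit at the bottom.
bool isRowEchelon(const Matrix& m) {
  bool seenZeroRow = false;
  bool havePrevious = false;
  std::size_t previousPivot = 0;

  for (const auto& row : m) {
    assert(row.size() == m.front().size());  // expect a rectangular matrix

    const std::size_t p = pivotColumn(row);
    if (p == row.size()) {  // zero row: no non-zero row may follow it
      seenZeroRow = true;
      continue;
    }
    if (seenZeroRow) return false;
    if (havePrevious && p <= previousPivot) return false;

    previousPivot = p;
    havePrevious = true;
  }
  return true;
}
```

The check works on whatever matrix it is given, so for an augmented matrix a row such as `0 0 0 | 1` counts as having its pivot in the last column.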

This creates a matrix that sometimes resembles an upper-triangular matrix; however, that doesn't mean that all row-echelon matrices are upper-triangular.
For example, all of the following matrices are in row echelon form:

The first two of these have the right dimensions to find a solution to a system of equations; however, the last two matrices are respectively under- and over-constrained, meaning they do not provide an appropriate solution to a system of equations.
That said, this doesn't mean that every matrix in the correct form can be solved either.
For example, if you translate the second matrix into a system of equations again, the last row translates into an equation of the form $0 = c$ with $c \neq 0$, which is a contradiction.
This is due to the fact that the matrix is singular, and there are no solutions to this particular system.
Nevertheless, all of these matrices are in row echelon form.

Reduced Row Echelon Form

Row echelon form is nice, but wouldn't it be even better if our system of equations looked simply like this:

Then we would know exactly what $x$, $y$, and $z$ are without any fuss! In matrix form, it looks like this:

This introduces yet another matrix configuration: Reduced Row Echelon Form.
A matrix is in reduced row echelon form if it satisfies the following conditions:

It is in row echelon form.

Every pivot is 1 and is the only nonzero entry in its column.
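The two conditions above can be checked mechanically. This is a minimal sketch under stated assumptions: the names `Matrix` and `isReducedRowEchelon` are hypothetical, and entries are compared exactly against 0 and 1, which is only reasonable for hand-written matrices (computed results would need a tolerance).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // hypothetical alias for this sketch

bool isReducedRowEchelon(const Matrix& m) {
  for (const auto& row : m) {
    assert(row.size() == m.front().size());  // expect a rectangular matrix
  }

  bool seenZeroRow = false;
  bool havePrevious = false;
  std::size_t previousPivot = 0;

  for (std::size_t r = 0; r < m.size(); ++r) {
    // Locate the pivot: the first non-zero entry of this row.
    std::size_t p = 0;
    while (p < m[r].size() && m[r][p] == 0.0) ++p;

    if (p == m[r].size()) {  // all-zero row
      seenZeroRow = true;
      continue;
    }

    // Condition 1: the matrix is in row echelon form.
    if (seenZeroRow) return false;
    if (havePrevious && p <= previousPivot) return false;

    // Condition 2: the pivot is 1 and is the only non-zero entry in its column.
    if (m[r][p] != 1.0) return false;
    for (std::size_t other = 0; other < m.size(); ++other) {
      if (other != r && m[other][p] != 0.0) return false;
    }

    previousPivot = p;
    havePrevious = true;
  }
  return true;
}
```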

All the following examples are in the reduced row echelon form:

Again, only the first of these (the one that looks like an identity matrix) is desirable in the context of solving a system of equations, but transforming any matrix into this form gives us an immediate and definitive answer to the question: can I solve my system of equations?

Beyond solving a system of equations, reshaping a matrix into this form makes it very easy to deduce other properties of the matrix, such as its rank, which is the maximum number of linearly independent columns.
In reduced row echelon form, the rank is simply the number of pivots.
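Counting pivots is short enough to sketch directly. The function name `rankFromEchelon` and the `Matrix` alias are hypothetical, and the sketch assumes the input is already in (reduced) row echelon form, where each non-zero row contributes exactly one pivot. To get the rank of a coefficient matrix, pass it without the right-hand-side column.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // hypothetical alias for this sketch

// Assumes `m` is already in row echelon form: every non-zero row then
// holds exactly one pivot, so the rank is the number of non-zero rows.
std::size_t rankFromEchelon(const Matrix& m) {
  std::size_t pivots = 0;
  for (const auto& row : m) {
    assert(row.size() == m.front().size());  // expect a rectangular matrix
    for (double value : row) {
      if (value != 0.0) {
        ++pivots;
        break;  // only the first non-zero entry is the pivot
      }
    }
  }
  return pivots;
}
```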

For now, I hope the motivation is clear: we want to convert a matrix into row echelon and then reduced row echelon form to make large systems of equations trivial to solve, so we need some method to do that.
In general, the term Gaussian Elimination refers to the process of transforming a matrix into row echelon form, and the process of transforming a row echelon matrix into reduced row echelon form is called Gauss-Jordan Elimination.
That said, the terminology here is sometimes inconsistent.
Several authors use the term Gaussian Elimination to include Gauss-Jordan elimination as well.
In addition, the process of Gauss-Jordan elimination is sometimes called Back-substitution, which is confusing because that term can also mean solving a system of equations directly from row echelon form, without simplifying to reduced row echelon form.
For this reason, we will be using the following definitions in this chapter:

Gaussian Elimination: The process of transforming a matrix into row echelon form

Gauss-Jordan Elimination: The process of transforming a row echelon matrix into reduced row echelon form

Back-substitution: The process of directly solving a row echelon matrix, without transforming into reduced row echelon form

The Analytical Method

Gaussian elimination is inherently analytical and can be done by hand for small systems of equations; however, for large systems, this (of course) becomes tedious and we will need to find an appropriate numerical solution.
For this reason, I have split this section into two parts. One will cover the analytical framework, and the other will cover an algorithm you can write in your favorite programming language.

In the end, reducing large systems of equations boils down to a game you play on a seemingly random matrix with 3 possible moves. You can:

1. Swap any two rows.

2. Multiply any row by a non-zero scale value.

3. Add a multiple of any row to any other row.

That's it.
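The three moves can be written as three small helper functions. This is a minimal sketch: the names `swapRows`, `scaleRow`, and `addMultipleOfRow` and the `Matrix` alias are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // hypothetical alias for this sketch

// Move 1: swap rows a and b.
void swapRows(Matrix& m, std::size_t a, std::size_t b) {
  assert(a < m.size() && b < m.size());
  std::swap(m[a], m[b]);
}

// Move 2: multiply row r by a non-zero scale value.
void scaleRow(Matrix& m, std::size_t r, double scale) {
  assert(r < m.size() && scale != 0.0);
  for (double& value : m[r]) value *= scale;
}

// Move 3: add `factor` times row `source` to row `target`.
void addMultipleOfRow(Matrix& m, std::size_t target, std::size_t source,
                      double factor) {
  assert(target < m.size() && source < m.size() && target != source);
  assert(m[target].size() == m[source].size());
  for (std::size_t c = 0; c < m[target].size(); ++c) {
    m[target][c] += factor * m[source][c];
  }
}
```

Subtracting a multiple of a row, as in the strategies discussed in this section, is the third move with a negative `factor`.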
Before continuing, I suggest you try to recreate the row echelon matrix we made above.
That is, do the following:

There are plenty of different strategies you could use to do this, and no one strategy is better than the rest.
One method is to subtract a multiple of the top row from subsequent rows below it such that all values beneath the pivot value are zero.
This process might be easier if you swap some rows around first and can be performed for each pivot.

After you get a row echelon matrix, the next step is to find the reduced row echelon form. In other words, we do the following:

Here, the idea is similar to above and the same rules apply.
In this case, we might start from the right-most column and subtract upwards instead of downwards.

The Computational Method

The analytical method for Gaussian Elimination may seem straightforward, but the computational method does not obviously follow from the "game" we were playing before.
Ultimately, the computational method boils down to two separate steps and has a complexity of $\mathcal{O}(n^3)$.

As a note, this process iterates through all the rows in the provided matrix.
When we say "current row" (curr_row), we mean the specific row iteration number we are on at that time, and as before, the "pivot" corresponds to the first non-zero element in that row.

Step 1

For each element in the pivot column at or below the current row, find the one with the largest absolute value and swap its row with the current row.
The pivot is then the element in the pivot column of the newly swapped-in current row.

For example, in this case the highest value is :

After finding this value, we simply switch the row with the to the current row:

In this case, the new pivot is .
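As a worked illustration of Step 1, consider an augmented matrix with assumed entries (the numbers are chosen for this sketch only). The first column holds 2, 1, and 3, so the largest absolute value is 3, found in the third row, and that row is swapped with the first:

```latex
\left[
\begin{array}{ccc|c}
2 & 3 & 4 & 6 \\
1 & 2 & 3 & 4 \\
3 & -4 & 0 & 10
\end{array}
\right]
\quad\rightarrow\quad
\left[
\begin{array}{ccc|c}
3 & -4 & 0 & 10 \\
1 & 2 & 3 & 4 \\
2 & 3 & 4 & 6
\end{array}
\right]
```

After the swap, the pivot of the current (first) row is 3.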

In code, this process might look like this:

# Find the row (at or below `row`) with the largest absolute value in this column
max_index = argmax(abs.(A[row:end, col])) + row - 1

# Check to make sure matrix is good!
if A[max_index, col] == 0
    println("matrix is singular!")
    continue
end

# Swap the row with the highest value for that column to the top
temp_vector = A[max_index, :]
A[max_index, :] = A[row, :]
A[row, :] = temp_vector

As a note, if the highest value is $0$, the matrix is singular and the system has no unique solution.
This makes sense because if the largest absolute value in a column is 0, the entire column (from the current row down) must be 0, so there can be no unique solution when we read the matrix as a set of equations.
That said, Gaussian elimination is more general and allows us to continue, even if the matrix is not necessarily solvable as a set of equations.
Feel free to exit after finding a $0$ if your end-goal is to solve a system of equations.

Step 2

For each row beneath the current pivot row, find a fraction that corresponds to the ratio of that row's value in the pivot column to the pivot itself.
After this, subtract the current pivot row multiplied by the fraction from that row, element by element.
This process essentially subtracts an optimal multiple of the current row from each row underneath (similar to the third move in the game above).
Ideally, this should always create a 0 under the current row's pivot value.

For example, in this matrix, the next row is and the pivot value is , so the fraction is .

After finding the fraction, we simply subtract , like so:

After this, repeat the process for all other rows.
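As a worked illustration of Step 2, take an assumed augmented matrix (numbers chosen for this sketch only) whose first row already holds the pivot 3. The second row has 1 in the pivot column, so its fraction is $\frac{1}{3}$; the third row has 2, so its fraction is $\frac{2}{3}$. Subtracting the pivot row times each fraction gives:

```latex
\left[
\begin{array}{ccc|c}
3 & -4 & 0 & 10 \\
1 & 2 & 3 & 4 \\
2 & 3 & 4 & 6
\end{array}
\right]
\quad
\begin{aligned}
R_2 &\leftarrow R_2 - \tfrac{1}{3} R_1 \\
R_3 &\leftarrow R_3 - \tfrac{2}{3} R_1
\end{aligned}
\quad
\left[
\begin{array}{ccc|c}
3 & -4 & 0 & 10 \\
0 & \frac{10}{3} & 3 & \frac{2}{3} \\
0 & \frac{17}{3} & 4 & -\frac{2}{3}
\end{array}
\right]
```

Both entries beneath the pivot are now 0, and the same two steps are then applied to the second column.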

Here is what it might look like in code:

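# Loop for all remaining rows
for i = (row+1):rows
    # finding fraction
    fraction = A[i, col] / A[row, col]
    # loop through all columns for that row
    for j = (col+1):cols
        # re-evaluate each element
        A[i, j] -= A[row, j] * fraction
    end
    # the inner loop starts at col+1, so set the element under the pivot to 0 directly
    A[i, col] = 0
end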

To be clear: if the matrix is found to be singular during this process, the system of equations has either no solution or infinitely many, so no unique solution exists.
For this reason, many implementations of this method will stop the moment the matrix is found to have no unique solutions.
In this implementation, we allowed for the more general case and opted to simply output when the matrix is singular instead.
If you intend to solve a system of equations, then it makes sense to stop the method the moment you know there is no unique solution, so some small modification of this code might be necessary!
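Putting Step 1 and Step 2 together, the whole reduction to row echelon form might be sketched as follows. This is a minimal C++ sketch under stated assumptions, not the chapter's reference implementation: the name `gaussianElimination`, the `Matrix` alias, and the boolean return value are hypothetical, and the last column is assumed to hold the right-hand sides, so it is never used as a pivot column.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // hypothetical alias for this sketch

// Reduces the augmented matrix `A` to row echelon form in place.
// Returns false if some examined column had no usable pivot (the case
// described in the text as "matrix is singular"), true otherwise.
bool gaussianElimination(Matrix& A) {
  assert(!A.empty());
  const std::size_t rows = A.size();
  const std::size_t cols = A[0].size();
  bool allPivotsFound = true;
  std::size_t row = 0;

  for (std::size_t col = 0; col + 1 < cols && row < rows; ++col) {
    // Step 1: find the row at or below `row` with the largest |value|.
    std::size_t maxIndex = row;
    for (std::size_t i = row + 1; i < rows; ++i) {
      if (std::fabs(A[i][col]) > std::fabs(A[maxIndex][col])) maxIndex = i;
    }
    if (A[maxIndex][col] == 0.0) {
      allPivotsFound = false;  // whole column is 0 from `row` down
      continue;                // keep going with the next column
    }
    std::swap(A[row], A[maxIndex]);

    // Step 2: subtract a multiple of the pivot row from every row below.
    for (std::size_t i = row + 1; i < rows; ++i) {
      const double fraction = A[i][col] / A[row][col];
      for (std::size_t j = col + 1; j < cols; ++j) {
        A[i][j] -= A[row][j] * fraction;
      }
      A[i][col] = 0.0;  // the entry under the pivot is 0 by construction
    }
    ++row;
  }
  return allPivotsFound;
}
```

Returning a flag instead of stopping keeps the more general behavior described above; a solver that only cares about unique solutions could return as soon as the zero column is found.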

So what do we do from here?
Well, we continue reducing the matrix; however, there are two ways to do this:

Reduce the matrix further into reduced row echelon form with Gauss-Jordan elimination

Solve the system directly with back-substitution if the matrix allows for such solutions

Let's start with Gauss-Jordan Elimination and then move on to back-substitution.

Gauss-Jordan Elimination

Gauss-Jordan Elimination is precisely what we said above; however, in this case, we often work from the bottom-up instead of the top-down.
We basically need to find the pivot of every row and set that value to 1 by dividing the entire row by the pivot value.
Afterwards, we subtract upwards until all values above the pivot are 0 before moving on to the next column from right to left (instead of left to right, like before).
Here it is in code:

#include <vector>

void gaussJordan(std::vector<std::vector<double>>& eqns) {
  // 'eqns' is the (row echelon) augmented matrix,
  // 'rows' is the number of variables
  int rows = static_cast<int>(eqns.size());

  for (int i = rows - 1; i >= 0; i--) {
    if (eqns[i][i] != 0) {
      eqns[i][rows] /= eqns[i][i];
      eqns[i][i] = 1;  // the pivot is the only coefficient left in this row

      // subtract this row from every row above it
      for (int j = i - 1; j >= 0; j--) {
        eqns[j][rows] -= eqns[j][i] * eqns[i][rows];
        eqns[j][i] = 0;  // clear this column's entry in row j directly
      }
    }
  }
}


As a note: Gauss-Jordan elimination can also be used to find the inverse of a matrix by following the same procedure to generate a reduced row echelon matrix, but with an identity matrix on the other side instead of the right-hand side of each equation.
This process is straightforward but will not be covered here, simply because there are much faster numerical methods to find an inverse matrix; however, if you would like to see this, let me know and I can add it in for completeness.

Back-substitution

The idea of back-substitution is straightforward: we create a vector of solutions and iteratively solve for each variable, from the last to the first, by plugging in all the variables already solved.
For example, if our matrix looks like this:

We can quickly solve for $z$, and then use that to solve for $y$ by plugging in for $z$.
After that, we simply need to solve for $x$ in a similar fashion.
In code, this involves keeping a rolling sum of all the values we substitute, subtracting that sum from the solution column, and then dividing by the pivot coefficient.
In code, it looks like this:
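A minimal C++ sketch of the rolling-sum approach is given here. The function name `backSubstitution` and the `Matrix` alias are hypothetical, and the sketch assumes an augmented matrix with n rows and n + 1 columns that is already in row echelon form with non-zero pivots on the diagonal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // hypothetical alias for this sketch

// Solves an n x (n+1) augmented matrix in row echelon form.
// Assumes every diagonal entry (pivot) is non-zero.
std::vector<double> backSubstitution(const Matrix& A) {
  const std::size_t n = A.size();
  std::vector<double> solution(n, 0.0);

  // Walk from the last row up to the first.
  for (std::size_t k = n; k-- > 0;) {
    assert(A[k].size() == n + 1 && A[k][k] != 0.0);

    // Rolling sum of the already-solved variables times their coefficients.
    double sum = 0.0;
    for (std::size_t j = k + 1; j < n; ++j) {
      sum += solution[j] * A[k][j];
    }

    // Subtract the sum from the solution column, then divide by the pivot.
    solution[k] = (A[k][n] - sum) / A[k][k];
  }
  return solution;
}
```

Because the sketch never modifies the matrix, the row echelon form remains available afterwards if it is still needed.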

Visual Representation

We have thus far used Gaussian elimination as a method to solve a system of equations; however, there is often a much easier way to find a similar solution simply by plotting each row in our matrix.
For the case of 2 equations and 2 unknowns, we would plot the two lines corresponding to each equation and the location of their point of intersection would be the solution for $x$ and $y$.
Similarly, for the case of 3 equations and 3 unknowns, we would plot 3 planes and the location of their point of intersection would be the solution for $x$, $y$, and $z$.

What, then, is the point of Gaussian elimination if we can simply plot our set of equations to find a solution?
Well, this analogy breaks down quickly when we start moving beyond 3D, so it is obvious we need some method to deal with higher-dimensional systems.
That said, it is particularly interesting to see what happens as we plot our matrix during Gaussian elimination for the 3D case.

[Video: the three planes of a 3D system plotted at each stage of Gaussian elimination]

As we can see in the above visualization, the planes wobble about in 3D until they reach row echelon form, where one plane is parallel to the $x$ and $y$ axes.
At this point, it's trivial to find the $z$-coordinate for the solution because it's simply the $z$-intercept of the parallel plane.
From there, the matrices become even easier to interpret as they move to the reduced row echelon form.
In this form, the solution is simply the $x$, $y$, and $z$ intercepts of the appropriate planes.

This visualization might have been obvious for some readers, but I found it particularly enlightening at first.
By performing Gaussian elimination, we are manipulating our planes such that they can be interpreted at a glance -- which is precisely the same thing we are doing with the matrix interpretation!

Conclusions

And with that, we have two possible ways to reduce our system of equations and find a solution.
If we are sure our matrix is not singular and that a solution exists, it's fastest to use back-substitution to find our solution.
If no solution exists or we are trying to find a reduced row echelon matrix, then Gauss-Jordan elimination is best.
As we said at the start, the notation for Gaussian Elimination is rather ambiguous in the literature, so we are hoping that the definitions provided here are clear and consistent enough to cover all the bases.

As for what's next... Well, we are in for a treat!
The above algorithm clearly has 3 for loops and has a complexity of $\mathcal{O}(n^3)$, which is abysmal!
If we can reduce the matrix to a specifically tridiagonal matrix, we can actually solve the system in $\mathcal{O}(n)$!
How? Well, we can use an algorithm known as the Tri-Diagonal Matrix Algorithm (TDMA), also known as the Thomas Algorithm.

There are also plenty of other solvers that do similar things that we will get to in due time.