Quickly Discover Low-Rank Structure with GMRES

March 03, 2019

This is a short but neat addition to my previous post. There I showed one quick way to circumvent structure requirements in linear algebra software by exploiting low-rank-update properties of a matrix. That approach had one weakness, however: we must know the low-rank-update structure ahead of time in order to use the Sherman-Morrison formula. In reality one has to be very lucky indeed to know this structure exactly, so here I show how we can still exploit the low-rank-update structure even when we don't know it! It does mean we can no longer use the Sherman-Morrison formula, but fortunately there is a slick algorithm that simultaneously solves the linear system and has a strong tendency to uncover low-rank information automatically: Generalized Minimal Residual (GMRES). As before, I will give a little mathematical background and then some Python code you can play with.

Summary of problem

GMRES is a sophisticated algorithm with deep mathematics and history behind it. I have talked about it before and it played a central role in my PhD thesis, but there is too much to say about it to do it justice in a few paragraphs, so here I will simply say that GMRES is an iterative algorithm for solving a system of linear equations.

The system of linear equations in this case will be to solve for \( x \) in

\[ Bx = b \]

where

\[ B = A + UV^T \]

where \(B, A\) are \( n \times n \) matrices and \(U, V\) are \(n \times k \) matrices. (If you're following along from the previous post, this is a rank-\(k\) update rather than only a rank-1 update.) The idea is that \( A \) has some very special structure we can exploit to perform very fast solves for systems \( Ay=c \), and that \( k \) is small.

In this post, however, I assume that we don't know what \( U, V \) are, only that they exist. Thus we can't use low-rank-update formulas, since those require knowledge of \(U, V\).

This turns out to be fine. Below I show how to still exploit this structure in Python, but I start by showing what happens if we do nothing except naively attempt a solve with GMRES.

Python Example: Not using low-rank information

If we try to solve with \(B\) directly, just naively calling GMRES in Python, the results will not be that great:
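A minimal sketch of that naive call (the random matrices and seed here are chosen purely for illustration):

```python
import numpy as np
import scipy.sparse.linalg as sp

np.random.seed(0)
n = 250  # problem size
k = 4    # rank of the update

# B = A + u v^T, with everything random
A = np.random.rand(n, n)
u = np.random.rand(n, k)
v = np.random.rand(n, k)
B = A + u @ v.transpose()
b = np.random.rand(n)

# Count the GMRES iterations via a callback
it = 0
def callback(rk):
    global it
    it = it + 1

x, info = sp.gmres(B, b, callback=callback, maxiter=n, restart=n)
print("iterations:", it)
```

On a typical run the residual decreases slowly and the iteration count ends up being a large fraction of \( n \).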

This is bad. GMRES is only useful as an algorithm if it takes far fewer iterations than the full problem size \( n \)! We need to exploit the low-rank structure.

Python Example: Using low-rank information

To exploit low-rank structure here I will give \( A \) as a preconditioner to GMRES. Thus instead of solving

\[ Bx= b \]

we solve

\[ A^{-1} B x = A^{-1} b. \]

In general it's too expensive to form \(A^{-1} B \) explicitly for large \( n \), but since our standing assumption is that it's fast to solve systems with \( A \), we need only supply that solver as a preconditioner callback to GMRES. I demonstrate this below:

```python
import numpy as np
import scipy.sparse.linalg as sp
import scipy.linalg as lin

# Size of our problem
n = 250

# Size of our low-rank update.
k = 4

# Make a random matrix A
A = np.random.rand(n, n)

# Make our random low-rank-update matrices
u = np.random.rand(n, k)
v = np.random.rand(n, k)

# Form low-rank update for B
B = A + u @ v.transpose()

# Form a random right-hand-side vector.
# We want to find "x" in the equation Bx=b.
b = np.random.rand(n)

# Define a simple callback that tells us
# how quickly GMRES is converging
it = 0
def callback(rk):
    global it
    print("it: {}, res: {}".format(it, np.linalg.norm(rk)))
    it = it + 1

# Define callback for solving linear systems with A.
# Here I just use a generic linear solver,
# but if A had useful structure we would use a corresponding
# fast solver instead.
def solveA(x):
    return lin.solve(A, x)

# Define preconditioner input to GMRES
M = sp.LinearOperator(matvec=solveA, shape=(n, n))
sp.gmres(B, b, M=M, callback=callback, maxiter=n, restart=n)
```

Running this, the residual drops to convergence after only \( k+1 \) iterations, which means GMRES calls the preconditioner just \( k+1 \) times. So if \( k \) is small and \( M \) is cheap to evaluate, this is potentially a big win.
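As a concrete (hypothetical) example of a base matrix with useful structure, suppose \( A \) were tridiagonal: then solveA can use a banded solver, and each preconditioner application costs \( O(n) \) rather than the \( O(n^3) \) of a generic dense solve. A sketch under that assumption:

```python
import numpy as np
import scipy.linalg as lin
import scipy.sparse.linalg as sp

np.random.seed(0)
n = 250
k = 4

# Hypothetical structured A: tridiagonal, diagonally dominant
main = 4.0 + np.random.rand(n)
off = np.random.rand(n - 1)
A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

# Same matrix in the banded storage that solve_banded expects
ab = np.zeros((3, n))
ab[0, 1:] = off   # superdiagonal
ab[1, :] = main   # main diagonal
ab[2, :-1] = off  # subdiagonal

u = np.random.rand(n, k)
v = np.random.rand(n, k)
B = A + u @ v.transpose()
b = np.random.rand(n)

def solveA(x):
    # O(n) banded solve instead of a generic O(n^3) dense solve
    return lin.solve_banded((1, 1), ab, x)

M = sp.LinearOperator(matvec=solveA, shape=(n, n))
x, info = sp.gmres(B, b, M=M, maxiter=n, restart=n)
print(info, np.linalg.norm(B @ x - b))
```

The preconditioned system still converges in a handful of iterations, but now each of those iterations is genuinely cheap.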

Why does this work? Eigenvalues!

The trick is that we need to know \( A \). If we do, then

\[ A^{-1} B = A^{-1}(A + UV^T) = I + A^{-1} UV^T. \]

This presents an ideal situation for GMRES, which works very well when the eigenvalues are clustered. In this case we will have at least \(n-k\) eigenvalues exactly equal to \( 1 \) (coming from the identity matrix \( I \)), and then \( k \) possibly different eigenvalues coming from \( A^{-1} UV^T \). The cluster of \(n-k\) eigenvalues is resolved by GMRES in a single iteration, and then the remaining \( k \) take an iteration each (unless by chance they also happen to be clustered very closely together).
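We can verify this clustering numerically (the sizes and shift here are purely illustrative): build a random rank-\(k\) update, form \( A^{-1}B \), and count how many eigenvalues are equal to 1.

```python
import numpy as np

np.random.seed(0)
n, k = 50, 3

# Shift A so it is comfortably invertible and well-conditioned
A = np.random.rand(n, n) + n * np.eye(n)
U = np.random.rand(n, k)
V = np.random.rand(n, k)
B = A + U @ V.T

# Eigenvalues of the preconditioned operator A^{-1} B = I + A^{-1} U V^T
eigs = np.linalg.eigvals(np.linalg.solve(A, B))

# All but (at most) k of them should equal 1
n_ones = np.sum(np.abs(eigs - 1.0) < 1e-6)
print(n_ones)  # expect at least n - k
```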

Concluding Remarks

Here I showed how to use low-rank-update structure even when it is unknown. I used the base matrix as a preconditioner for GMRES, and if the original system does indeed have low-rank-update structure then GMRES converges in about as many iterations as the rank of the update. Thus if the rank of the update is small, preconditioned GMRES converges rapidly, requiring only a few applications of the base matrix solver.