It is possible to obtain the LU factorization "in place", at least for a diagonally dominant matrix. You can then do the solve/backsolve calculations on the constant vector (right-hand side of the linear system) "in place" as well. Would this be of interest?
– hardmath, Jan 10 '11 at 4:25
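For what it's worth, a minimal numpy sketch of the factor-then-solve approach described in this comment (function names are mine; Doolittle elimination without pivoting, so it assumes no zero pivot arises, as for a diagonally dominant matrix):

```python
import numpy as np

def lu_inplace(A):
    """Overwrite square A with its LU factors (Doolittle, no pivoting):
    U on and above the diagonal, unit lower-triangular L below it."""
    n = A.shape[0]
    for k in range(n):
        A[k+1:, k] /= A[k, k]                              # multipliers (column of L)
        A[k+1:, k+1:] -= np.outer(A[k+1:, k], A[k, k+1:])  # update trailing block

def solve_inplace(A, b):
    """Given A already overwritten by lu_inplace, overwrite b with A^{-1} b."""
    n = A.shape[0]
    for i in range(1, n):               # forward solve L y = b (unit diagonal)
        b[i] -= A[i, :i] @ b[:i]
    for i in range(n - 1, -1, -1):      # back solve U x = y
        b[i] = (b[i] - A[i, i+1:] @ b[i+1:]) / A[i, i]

# usage:
# A = np.array([[4., 1.], [1., 3.]])   # diagonally dominant
# b = np.array([1., 2.])
# lu_inplace(A); solve_inplace(A, b)   # b now holds the solution of A x = b
```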

I think the bottleneck is finding a way to do matrix multiplication in place. I don't know how to do that, say for a square matrix times a vector, overwriting the vector. But if we could do that, then the referenced method seems to lend itself by recursion to an in-place matrix inversion.
– hardmath, Jan 10 '11 at 14:13
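Not an answer to the general question, but for the matrix-times-vector case mentioned here, every output entry reads every input entry, so a full-length temporary seems unavoidable; that already hints at the linear memory overhead argued in the answer below. A numpy sketch (function name mine):

```python
import numpy as np

def matvec_overwrite(M, x):
    """Overwrite the vector x with M @ x for a square matrix M."""
    # numpy materializes M @ x into a length-n temporary before the
    # copy-back, so this is "in place" only up to O(n) extra memory.
    x[:] = M @ x
```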

1 Answer

A sequence of steps can be outlined that recursively invokes an in-place matrix inversion together with several in-place matrix multiplications. Throughout, $\textbf{A}$ is partitioned into blocks

$$\textbf{A}=\begin{pmatrix} \textbf{E} & \textbf{F} \\ \textbf{G} & \textbf{H} \end{pmatrix}$$

with $\textbf{E}$ of size $m\times m$ and $\textbf{H}$ of size $n\times n$.

Some but not all of these matrix multiplications would seem to require additional memory to be "temporarily" allocated between the recursive calls to in-place matrix inversion. At bottom we will argue that the additional memory needed is linear, of size at most $m+n$.

1. The first step is recursively to invert in place the block $\textbf{E}$, after which the array holds:

$$\begin{pmatrix} \textbf{E}^{-1} & \textbf{F} \\ \textbf{G} & \textbf{H} \end{pmatrix}$$

2. Next overwrite $\textbf{F}$ with the product $\textbf{E}^{-1}\textbf{F}$.

3. Then overwrite $\textbf{H}$ with $\textbf{H}-\textbf{G}(\textbf{E}^{-1}\textbf{F})$, using the block computed in step 2.

Step 3. can be performed one column at a time, and because the results are accumulated in the existing block submatrix $\textbf{H}$, no additional memory is needed.

Note that what has now overwritten $\textbf{H}$ is the Schur complement $\textbf{S}=\textbf{H}-\textbf{G}\textbf{E}^{-1}\textbf{F}$ of $\textbf{E}$ in $\textbf{A}$.

4. & 5. The next two steps can be done in either order. We want to multiply block submatrix $\textbf{G}$ on the right by $\textbf{E}^{-1}$, and we also want to recursively invert in-place the previous result $\textbf{S}$. After doing both we have:

A temporary need for additional memory arises when we want to do matrix multiplication and store the result back into the location of one of the two factors.

Such a need arises in step 2. when forming $\textbf{E}^{-1}\textbf{F}$, in step 4. (or 5.) when forming $\textbf{G}\textbf{E}^{-1}$, in step 6. when forming $\textbf{S}^{-1}\textbf{G}\textbf{E}^{-1}$, and in step 8. when forming $-\textbf{E}^{-1}\textbf{F}\textbf{S}^{-1}$.
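As the next paragraphs explain, each of these overwriting products can get by with a temporary of just one column or one row. A numpy sketch of that trick (function names mine):

```python
import numpy as np

def left_mult_inplace(A, B):
    """Overwrite B with A @ B (A square).  Column j of the result
    depends only on column j of B, so one temporary column suffices."""
    for j in range(B.shape[1]):
        col = A @ B[:, j]   # temporary of one column
        B[:, j] = col       # overwrite, then reuse the buffer

def right_mult_inplace(B, A):
    """Overwrite B with B @ A (A square), one row at a time."""
    for i in range(B.shape[0]):
        row = B[i, :] @ A   # temporary of one row
        B[i, :] = row
```

Step 2., for example, is then a `left_mult_inplace` of $\textbf{F}$ by $\textbf{E}^{-1}$, and step 4. a `right_mult_inplace` of $\textbf{G}$ by $\textbf{E}^{-1}$.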

Of course other "temporary" allocations are hidden in the recursive calls to in-place inversion in step 1. and steps 4. & 5., when the matrices $\textbf{E}$ and $\textbf{S}$ are inverted.

In each case the allocated memory can be freed (or reused) after one column or one row of a required matrix multiplication is performed, because the overwriting can be done one column or one row at a time following its computation. The overhead for such allocations is limited to size m+n, or even max(m,n), i.e. a linear overhead in the size of $\textbf{A}$.
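Putting the steps together, here is a minimal numpy sketch of the whole recursion (the function name and the even split are mine; for brevity it uses whole-block temporaries rather than the row- and column-sized ones argued for above, and it assumes every $\textbf{E}$ and $\textbf{S}$ encountered is invertible):

```python
import numpy as np

def invert_inplace(A):
    """Overwrite the square array A with its inverse, recursively,
    following steps 1.-8. above.  No pivoting: fails if any leading
    block E or Schur complement S along the way is singular."""
    k = A.shape[0]
    if k == 1:
        A[0, 0] = 1.0 / A[0, 0]
        return
    m = k // 2                # split into E (m x m), F, G, H (views into A)
    E, F = A[:m, :m], A[:m, m:]
    G, H = A[m:, :m], A[m:, m:]

    invert_inplace(E)         # step 1:  E <- E^{-1}
    F[:] = E @ F              # step 2:  F <- E^{-1} F
    H -= G @ F                # step 3:  H <- S = H - G E^{-1} F
    G[:] = G @ E              # step 4:  G <- G E^{-1}
    invert_inplace(H)         # step 5:  H <- S^{-1}
    tmp = H @ G               # step 6:  S^{-1} G E^{-1}
    E += F @ tmp              # step 7:  E <- E^{-1} + E^{-1}F S^{-1}G E^{-1}
    F[:] = -(F @ H)           # step 8:  F <- -E^{-1} F S^{-1}
    G[:] = -tmp               #          G <- -S^{-1} G E^{-1}

# usage:
# M = np.array([[4., 1.], [2., 3.]])
# B = M.copy(); invert_inplace(B)      # B now holds inv(M)
```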

How would you handle the case where either $\textbf{E}$ or $\textbf{S}$ is a zero matrix?
– howardh, Aug 12 '12 at 18:19

@howardh: If $\mathbf{E}$ is the zero matrix at some step, the recursion outlined above fails (since that block would not be invertible), and similarly if $\mathbf{S}$ is a zero matrix (or, more generally, a singular one). The situation is similar to Gaussian elimination without pivoting: for example, $\mathbf{A}=\left(\begin{smallmatrix}0 & 1\\ 1 & 0\end{smallmatrix}\right)$ is invertible, yet its leading block $\mathbf{E}=(0)$ is singular, so the recursion cannot even begin. The algorithm is certainly less robust because of its "in-place" limitation. If it were known in advance that $\mathbf{E}$ is zero, then the block structure of $\mathbf{A}^{-1}$ would also contain a zero block (in the lower right corner). Please post a new question if the details would interest you.
– hardmath, Aug 12 '12 at 20:10