Optimal data distributions for LU decomposition

Abstract

This paper uses the well-known problem of LU decomposition to study a method for deriving data distributions for parallel computers with distributed memory. Its importance lies not so much in the particular application as in the principle that finding an optimal data distribution is formulated as an optimization problem. This is made possible by a parameterized data distribution and a rigorous performance prediction technique, which together yield runtime formulas that contain the parameters of the data distribution. The parameters are then chosen to minimize the total runtime, which also minimizes the communication overhead and the load imbalance penalty.
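
As an illustration of the principle only, the following minimal sketch treats the block size b of a column block-cyclic distribution as the single distribution parameter and picks the value that minimizes a predicted runtime. Everything in it is an assumption made for the example and is not taken from the paper: the function names (owner, predicted_runtime, best_block_size), the toy cost model (the slowest processor determines each elimination step, and one message startup is paid per block of b columns), and the machine constants t_op, t_startup and t_word.

```python
def owner(j, b, p):
    """Processor owning column j under a column block-cyclic
    distribution with block size b on p processors."""
    return (j // b) % p


def predicted_runtime(n, p, b, t_op=1.0, t_startup=500.0, t_word=2.0):
    """Toy runtime formula for LU decomposition of an n x n matrix.

    Hypothetical model, not the paper's formulas:
      - computation: in step k the processor with the most trailing
        columns determines the step time (load imbalance penalty);
      - communication: the pivot column costs t_word per element, and
        one startup t_startup is paid per block of b columns.
    """
    total = 0.0
    for k in range(n - 1):
        rows = n - k - 1
        # Count the trailing columns k+1 .. n-1 held by each processor.
        counts = [0] * p
        for j in range(k + 1, n):
            counts[owner(j, b, p)] += 1
        total += max(counts) * rows * t_op   # slowest processor
        total += rows * t_word               # data volume of the pivot column
        if k % b == 0:
            total += t_startup               # aggregated message startup
    return total


def best_block_size(n, p, candidates, **costs):
    """Return the candidate block size with the smallest predicted runtime."""
    return min(candidates, key=lambda b: predicted_runtime(n, p, b, **costs))
```

In this toy model a small block size balances the load better, while a large block size saves message startups, so the minimizing b depends on the assumed machine constants. A call such as best_block_size(64, 4, [1, 2, 4, 8, 16]) searches a handful of candidates exhaustively; a closed-form runtime formula would instead allow the minimum to be derived analytically.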