Optimizing for a multiprocessor: Balancing synchronization costs against parallelism in straight-line code

Abstract

This paper reports on the status of a research project to develop compiler techniques to optimize programs for execution on an asynchronous multiprocessor. We adopt a simplified model of a multiprocessor, consisting of several identical processors, all sharing access to a common memory. Synchronization must be done explicitly, using two special operations that take a period of time comparable to the cost of data operations. Our treatment differs from other attempts to generate code for such machines because we treat the necessary synchronization overhead as an integral part of the cost of a parallel code sequence. We are particularly interested in heuristics that can be used to generate good code sequences, and local optimizations that can then be applied to improve them. Our current efforts are concentrated on generating straight-line code for high-level, algebraic languages.

We compare the code generated by two heuristics, and observe how local optimization schemes can gradually improve its quality. We are implementing our techniques in an experimental compiler that will generate code for Cm*, a real multiprocessor, having several characteristics of our model computer.

Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

This research is sponsored by the Defense Advanced Research Projects Agency (DOD), ARPA Order No. 3597, monitored by the Air Force Avionics Laboratory Under Contract F33615-78-C-1551.

The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Defense Advanced Research Projects Agency or the US Government.