Classification and generation of schedules for VLIW processors

Christoph W. Keßler, Andrzej Bednarski, Mattias Eriksson

Summary:

We identify and analyze different classes of schedules for
instruction-level parallel processor architectures.
The classes are induced by various common techniques for generating
or enumerating them, such as integer linear programming or list scheduling
with backtracking.
In particular, we study the relationship between VLIW schedules and their
equivalent linearized forms (which may be used, e.g., with
superscalar processors), and we
identify classes of VLIW schedules that can be created from a linearized form
using an in-order VLIW compaction heuristic,
which is just the static equivalent of the dynamic instruction dispatch
algorithm of in-order issue superscalar processors.
We formulate and give a proof of the dominance of
greedy schedules for instruction-level parallel architectures where all
instructions have multiblock reservation tables, and we show how
scheduling anomalies can occur in the presence of
instructions with non-multiblock reservation tables.
We also show that, in certain situations,
some schedules cannot, in general, be constructed
by incremental scheduling algorithms that are based
on topological sorting of the data dependence graph.
We also discuss properties of strongly linearizable schedules,
out-of-order schedules and non-dawdling schedules,
and show their relationships to greedy schedules and to general schedules.
We summarize our findings as a hierarchy of classes of VLIW schedules.
Finally, we provide an experimental evaluation showing
the sizes of schedule classes in the above
hierarchy, for different benchmarks and example VLIW
architectures, including a single-cluster version of the
TI C62x DSP processor and variants thereof.
Our results can sharpen the interpretation of the term optimality
used with various methods for optimal VLIW scheduling,
and help to identify sets of schedules that can be safely ignored
when searching for a time-optimal schedule.
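
As an illustration of the in-order VLIW compaction mentioned above, the following
Python sketch maps a linearized instruction sequence to issue cycles. It is a
minimal sketch under simplifying assumptions that are not taken from the paper:
the only resource constraint is a fixed issue width, every instruction has a
fixed latency, and the input order is a topological order of the data dependence
graph. The function name compact_in_order and its parameters are hypothetical.

```python
def compact_in_order(order, deps, latency, issue_width):
    """Assign an issue cycle to each instruction of a linearized schedule.

    order       -- instructions in linear order (assumed to be a topological
                   order of the data dependence graph)
    deps        -- dict: instruction -> iterable of its predecessors
    latency     -- dict: instruction -> cycles until its result is available
    issue_width -- max. instructions per cycle (assumed sole resource limit)
    """
    cycle_of = {}   # instruction -> issue cycle
    used = {}       # cycle -> number of instructions already issued there
    current = 0     # issue cycle of the previous instruction (in-order rule)
    for instr in order:
        # Earliest cycle at which all operands are available.
        ready = max((cycle_of[p] + latency[p] for p in deps.get(instr, ())),
                    default=0)
        # In-order issue: never go back before the previous instruction.
        t = max(current, ready)
        # Skip cycles whose issue slots are exhausted.
        while used.get(t, 0) >= issue_width:
            t += 1
        cycle_of[instr] = t
        used[t] = used.get(t, 0) + 1
        current = t
    return cycle_of


# Hypothetical example: c depends on a (latency 2), d depends on b (latency 1).
deps = {"c": ["a"], "d": ["b"]}
latency = {"a": 2, "b": 1, "c": 1, "d": 1}
print(compact_in_order(["a", "b", "c", "d"], deps, latency, issue_width=2))
print(compact_in_order(["a", "b", "d", "c"], deps, latency, issue_width=2))
```

In this toy model the two linearizations yield different VLIW schedules: in the
first, d is delayed to the cycle of c because it follows c in the linear order,
whereas in the second it can issue one cycle earlier.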

Key words:

Comments:

A very early version of this paper was presented at
CPC'06 Int. Workshop on Compilers for Parallel Computers, A Coruna, Spain,
Jan. 2006.
The paper was then complemented by a new theorem and proof
stating the dominance of
greedy schedules for instruction-level parallel architectures where all
instructions have multiblock reservation tables. We extended the
hierarchy of schedule classes with out-of-order schedules
and provided an experimental
evaluation showing the sizes of schedule classes in the above
hierarchy, for different benchmarks and example VLIW
architectures, including a single-cluster version of the
TI C62x DSP processor and variants thereof.