Optimization Algorithms for Exploiting the Parallelism-Communication Tradeoff in Pipelined Parallelism

We address the problem of finding parallel plans for SQL queries using the two-phase approach of join ordering followed by parallelization. We focus on the parallelization phase and develop algorithms for exploiting pipelined parallelism. We formulate parallelization as scheduling a weighted operator tree to minimize response time. Our model of response time captures the fundamental tradeoff between parallel execution and its communication overhead. We assessthe quality of an optimization algorithm by its performance ratio which is the ratio of the response time of the generated schedule to that of the optimal. We develop fast algorithms that produce near-optimal schedules - the performance ratio is extremely close to 1 on the average and has a worst case bound of about 2 for many cases.