Abstract

Conducting a thorough performance evaluation of a software transactional memory (STM) is very time consuming. Worse, even after all this effort, and even for the same application, it can still be hard to predict performance when the number of threads on which the application is deployed differs from the number used in the experiments. In practice, one might have to conduct an entirely new set of experiments to get some understanding of how the STM performs with the new number of threads.
We propose a pragmatic approach that contributes to changing this state of affairs. Using classical engineering approximation techniques, we derive, from a set of STM performance measurements, analytical performance functions that model the scalability of the STM. More specifically, we show that polynomial and rational functions provide good interpolations of STM performance: even with only a handful of measurements, the average error in most cases is around 1-2%. Further, we show that rational functions allow reasonably precise extrapolation: using measurements with up to m threads, we can predict the performance up to roughly 2m threads with a relatively low error (around 10% in the best cases).
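As a rough illustration of the extrapolation idea, the sketch below fits a rational function to performance measured at up to m threads and then evaluates it at 2m threads. It is only a minimal example under stated assumptions: the degree-(1,1) form (a0 + a1*n) / (1 + b1*n), the linearized least-squares fit, and the names fit_rational and solve_linear are illustrative choices, not necessarily the functions or fitting procedure used in this work, and the input data are synthetic.

```python
def solve_linear(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_rational(threads, perf):
    """Fit perf ~ (a0 + a1*n) / (1 + b1*n) by linearized least squares.

    Assumed model form (hypothetical, for illustration only). Multiplying
    through by the denominator gives the linear relation
        a0 + a1*n - b1*n*perf = perf,
    which is solved for (a0, a1, b1) via the normal equations.
    """
    rows = [[1.0, float(n), -float(n) * y] for n, y in zip(threads, perf)]
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Aty = [sum(r[i] * y for r, y in zip(rows, perf)) for i in range(3)]
    a0, a1, b1 = solve_linear(AtA, Aty)
    return lambda n: (a0 + a1 * n) / (1.0 + b1 * n)


# Synthetic "measurements" for 1..8 threads, generated from a known curve
# purely for illustration; these are not experimental results.
def synthetic_perf(n):
    return 10.0 * n / (1.0 + 0.1 * n)

measured_threads = list(range(1, 9))
model = fit_rational(measured_threads, [synthetic_perf(n) for n in measured_threads])
predicted_at_16 = model(16)  # extrapolation from m = 8 to 2m = 16 threads
```

With real measurements the fit will not be exact, and a higher-degree numerator or denominator may be needed to capture workloads whose performance peaks and then degrades.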
We discuss two possible applications of our approach: (1) statically deciding whether to use an STM for a given workload and a given number of threads, and (2) dynamically adjusting the number of threads that execute in parallel to match the optimal concurrency level of a given workload.
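Both applications reduce to evaluating a fitted performance function at candidate thread counts. The following sketch shows one way this could look; the helper names should_use_stm and best_thread_count, the baseline comparison, and the example model are hypothetical and serve only to illustrate the idea.

```python
def should_use_stm(model, num_threads, baseline_perf):
    """Static decision: use the STM only if its predicted performance at
    num_threads exceeds that of a baseline (e.g. a lock-based version).
    baseline_perf is assumed to be in the same unit as the model's output."""
    return model(num_threads) > baseline_perf


def best_thread_count(model, max_threads):
    """Dynamic adjustment: return the thread count in 1..max_threads with the
    highest predicted performance according to the fitted model."""
    return max(range(1, max_threads + 1), key=model)


# Hypothetical fitted model whose predicted performance peaks and then
# degrades as contention grows (illustrative only, not measured data).
def example_model(n):
    return n / (1.0 + 0.01 * n * n)

optimal_threads = best_thread_count(example_model, 32)
use_stm = should_use_stm(example_model, optimal_threads, baseline_perf=1.0)
```

In a dynamic setting, the model would be refitted as new measurements arrive, and the number of running threads would be adjusted toward the predicted optimum.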