Seeds and parallel MIP solvers, extended edition

In a previous post, I showed that changing the random seed on a given problem yielded varies solution times by between 20% to 40% on average, depending on the solver and the setting. I decided to push that experiment a little further and sample the distribution of run times based on the variation in random seeds. I took three lot-sizing models and run them for random seeds 1 to 2000 (I had this server with nothing planned on it so I put it to some use during my vacation).

In no particuliar order, here is what the distributions look like for CPLEX 12.6, for default settings (except the seed number), on a Intel Xeon processor with 4 cores active. The emphasis is here on variation and not on actual run times, so run times are reported as (run time for seed X – average) / average.

While the results are pretty much what I expected for instances I-2 and I-12, I-6 is a bit of an anomaly. The instance takes about 42 seconds to solve on average, and it seem dependent on a few key decisions that are taken very differently at the top of the search tree. Some of these decisions result in more nodes being evaluated and thus, longer run times.

And now what it looks like for Gurobi 5.6.2, default settings, same machine, and 4 cores:

These are somewhat what I expected. You can get the instances here. Each has 527 binary variables and twice that many continuous variables. If those variations seem unacceptable to you, then remember how things were in 2008, when the first parallel codes were available.

Comments

In your last sentence “If those variations seem unacceptable to you, then remember how things were in 2008, when the first parallel codes were available” it seems that you relate performance variability to multithreading. What about repeating the same experiment with one thread only (and several seeds of course)?

Thank you for your comment! I think your suggestion is a very good idea. I will post the results when it is completed.
As for the sentence you mentioned, I think the way I wrote it implies an association I didn’t mean to make. The first time I experienced these variations in performance was in 2008, and at that time we also had the first experiments in our lab with parallel MIPs. I noticed these because when comparing two versions of CPLEX, the parallel would be sometimes faster and sometimes slower for the same model. In this particular case, the run times for the sequential version were consistent, but I can’t say whether or not multithreading was the cause behind this (as there were two different releases of CPLEX).