Experimenting with LocalSolver 3.0 [primal]

This post presents and discusses the results of some experiments performed with LocalSolver 3.0, a solver based on the local search paradigm. I investigate whether LS is able to find good solutions quickly and reliably. This post is a follow-up to my first experiments with LocalSolver; I encourage you to read that post if you haven't already. The dual post describes the important questions I wanted to investigate during my testing. Two types of mono-objective binary optimization problems are used: the multidimensional knapsack problem [1] and the two-echelon hierarchical single-source capacitated facility location problem [2].

Is LocalSolver able to find good solutions quickly?

From my limited experience, it turns out that it depends on the problem.

LocalSolver does really well on multidimensional knapsack problems, finding solutions close to the optimum. Solution quality is often within 0.5% to 1% of upper bounds or optimal solutions found using CPLEX 12.5, even for relatively large models (5000+ decisions) and with short run times (60 seconds).

For the smaller instances, solutions are 0.57% away from the upper bound on average;

For the larger instances, solutions are 0.89% away from the upper bound on average.

Since these are gaps to bounds rather than to known optima, actual solution quality is probably slightly better than what is reported for the large instances.
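For reference, the gap figures above are measured relative to the bound. A minimal sketch of the computation (the function name and sample numbers are mine, for illustration only):

```python
def gap_to_bound(obj_value: float, upper_bound: float) -> float:
    """Relative gap (%) between a feasible solution and an upper bound,
    for a maximization problem such as the multidimensional knapsack."""
    return 100.0 * (upper_bound - obj_value) / upper_bound

# Hypothetical example: a solution of 99,430 against a CPLEX bound of 100,000
print(round(gap_to_bound(99_430, 100_000), 2))  # 0.57
```

Note that when the bound is not tight, the true gap to the optimum is smaller than this figure, which is why the reported quality on large instances is conservative.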

On the selected facility location models, the story is quite different. Although LS does well on some instances (2 optima found), solution quality is generally poor.

The gap to optimality is often in the 40-50% range, meaning that LocalSolver is not even in the ballpark of good solutions.

I suspect that this poor performance is at least partly explained by the fact that a large proportion of generic swap-type moves are unlikely to improve solution quality. I will investigate this in a later post.

Does LocalSolver provide a reliable performance?

LocalSolver is very stable and reliable. Experiments on the same instance and the same machine, whether using different random seeds or repeating runs with the same seed, yield very consistent solution quality, with a coefficient of variation of about 0.1%. Results stay in this order of magnitude with single or multiple threads and with short or long run times (1 minute up to 1 hour). This holds when LS does well (knapsack) but also when it does poorly (the facility location problem).
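The coefficient of variation mentioned above is the standard deviation divided by the mean, in percent. A quick sketch, using hypothetical run results (the numbers are mine, not actual solver output):

```python
import statistics

def coefficient_of_variation(values):
    """CV = population standard deviation / mean, expressed in %."""
    return 100.0 * statistics.pstdev(values) / statistics.fmean(values)

# Hypothetical objective values from 5 runs with different random seeds
runs = [10_050, 10_043, 10_061, 10_048, 10_055]
print(coefficient_of_variation(runs))  # well under 0.1%
```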

How does LocalSolver scale with instance size?

LocalSolver has low memory requirements when compared to a MIP solver. For instance, a 5000 x 5 knapsack instance requires about 18 MB of RAM to solve, while a 5000 x 10 instance requires about 22 MB. A MIP solver may require 1+ GB of RAM to solve these models.

Parallelism in LS has a relatively low overhead cost. The number of moves evaluated per second per thread remains fairly constant. Overhead is estimated at about 6% for two threads and about 10% for four threads.
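The overhead estimate comes from comparing per-thread throughput against the single-thread rate. A small sketch of that arithmetic (function name and throughput figures are hypothetical):

```python
def parallel_overhead(single_thread_rate: float,
                      total_rate: float,
                      threads: int) -> float:
    """Overhead (%): how far per-thread throughput falls short of the
    single-thread moves-per-second rate."""
    per_thread = total_rate / threads
    return 100.0 * (1.0 - per_thread / single_thread_rate)

# Hypothetical: 1.0M moves/s on 1 thread vs. 1.88M moves/s total on 2 threads
print(round(parallel_overhead(1_000_000, 1_880_000, 2), 1))  # 6.0
```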

The solver handles large models quite well. While it requires more effort to find good solutions on larger models, its execution scales quite well with instance size. For instance, for knapsack instances:

Doubling the number of constraints (with the number of decision variables held constant) results in about 28.8% fewer moves evaluated per thread per second.

Doubling the number of variables (with the number of constraints held constant) results in about 18.3% fewer moves evaluated per thread per second.

What about the models you used?

To perform these experiments, I used two types of models. The first is the multidimensional knapsack problem [1], a simple and straightforward problem that is inherently binary. Two sets of instances were generated: the first (A) contains 25 smaller instances with 50 to 1000 items and 5 to 10 constraints, while the second (B) contains 70 instances with 1000 to 10000 items and 5 to 20 constraints. The second model is the two-echelon hierarchical single-source capacitated facility location problem [2], a supply chain design problem that I know quite well. For this model, I used the path-based formulation detailed in [3]. Two sets of 25 instances were generated, each having from 150 to 812500 decision variables. You can find the instance files here.
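As a reminder, the multidimensional knapsack problem maximizes total profit over 0/1 item selections subject to several capacity constraints. A minimal, self-contained sketch of checking and scoring a candidate solution (data and names are illustrative, not taken from the actual instance files):

```python
from typing import Sequence

def mkp_objective(x: Sequence[int],
                  profit: Sequence[float],
                  weight: Sequence[Sequence[float]],
                  capacity: Sequence[float]) -> float:
    """Objective value of a 0/1 solution x to a multidimensional knapsack,
    raising an error if any of the capacity constraints is violated."""
    for k, cap in enumerate(capacity):
        used = sum(weight[k][i] * x[i] for i in range(len(x)))
        if used > cap:
            raise ValueError(f"constraint {k} violated: {used} > {cap}")
    return sum(profit[i] * x[i] for i in range(len(x)))

# Tiny 4-item, 2-constraint example
profit = [10, 13, 7, 8]
weight = [[2, 4, 3, 1],   # weights for constraint 0
          [3, 1, 2, 2]]   # weights for constraint 1
capacity = [6, 5]
print(mkp_objective([1, 1, 0, 0], profit, weight, capacity))  # 23
```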

That’s too short! Please provide details!

It is difficult to summarize findings from 1000+ runs on about 150 models in 600 words. If readers are interested, I can provide more details and information in upcoming posts, each dedicated to a single dimension of the experiments. Please tell me what you would like to hear about!