Closing the (MIP) Gap – Part II

Day #2 of the MIP 2013 Workshop reminded me that despite the research effort invested in mixed-integer programming over the last 60 years or so, much remains to be done. Many MIP models are still very difficult to solve. In MIP extensions such as nonlinear, multi-objective or multi-level programming, most large problems cannot be solved efficiently.

In a previous post published in May, I showed that for several models, despite the long times needed to reach the optimality threshold (0.01% MIP gap), MIP solvers were often able to find very good solutions in a few seconds. This time, I show an example of the opposite: a model for which every fraction of a percent really matters, and for which solvers have trouble finding an optimal solution quickly.
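For readers less familiar with the term, here is a minimal sketch of how a relative MIP gap can be computed and compared against the 0.01% threshold. It follows the formula given in the CPLEX documentation, |best bound - best integer| / (1e-10 + |best integer|). The function names and the objective values are hypothetical, chosen only for illustration; they are not taken from the model discussed in this post.

```python
def relative_mip_gap(best_integer: float, best_bound: float) -> float:
    """Relative gap between the incumbent and the best bound.

    Uses the formula from the CPLEX documentation; the tiny constant
    avoids division by zero when the incumbent objective is 0.
    """
    return abs(best_bound - best_integer) / (1e-10 + abs(best_integer))


def is_within_tolerance(best_integer: float, best_bound: float,
                        tolerance: float = 1e-4) -> bool:
    """True when the gap is at or below the tolerance (1e-4 = 0.01%)."""
    return relative_mip_gap(best_integer, best_bound) <= tolerance


# Hypothetical objective values, for illustration only.
early_gap = relative_mip_gap(best_integer=100.0, best_bound=307.0)
late_gap = relative_mip_gap(best_integer=100.0, best_bound=100.005)

print(f"early gap: {early_gap:.2%}")
print(f"late gap:  {late_gap:.4%}")
print("early solution accepted:", is_within_tolerance(100.0, 307.0))
print("late solution accepted: ", is_within_tolerance(100.0, 100.005))
```

Because the gap is measured relative to the incumbent, it can exceed 100% whenever the bound is more than twice the incumbent value, as in the first hypothetical pair above.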

A supply chain design model

The example I take today is a supply chain design model from the pulp and paper (forest products) industry. A supply chain design model has three types of decisions:

Location decisions, specifying which of a company’s sites and facilities will be used among a set of possibilities. These decisions may include facility reconfiguration or expansion decisions;

Mission assignment or allocation decisions, specifying which activities (parts manufacturing, sub-assembly, final assembly, distribution, packaging, etc.) should be performed at each facility;

Supply chain design decisions have a critical impact on a company’s ability to compete. Thus, a 1% gap to optimality usually results in millions of dollars of lost competitiveness. I took a sample instance from M’Barek, Martel & D’Amours [1], which corresponds to scenario #3 of their paper. The model is complex, with 1,888 binary variables and about 143,000 continuous variables, although it is not enormous for a supply chain network (SCN) design model.

How CPLEX tackles this model

A few comments about how this model is solved by CPLEX 12.5:

Solving the root relaxation of this model takes about 30 seconds;

CPLEX finds a heuristic solution in about 5 minutes. However, this solution is poor: CPLEX reports a gap of 207%, and the solution is indeed far from the true optimum (about 200% gap). Integer infeasibility is also high, at 187.

A large number of cuts (13,000 or so), mostly implied bound and flow cuts, are generated, but they only slightly tighten the relaxation.

On a decent machine with 4 threads and otherwise default settings, the model takes about 55,500 seconds (more than 15 hours) to solve. Only 7,133 nodes need to be explored, though.

Other versions of this model take even longer to solve. Those interested can view the CPLEX log here.
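To put the figures above in perspective, the short calculation below divides the total solve time by the number of nodes explored. It uses only the two numbers quoted in this post. The result is a rough wall-clock average that also includes the root relaxation, cut generation and heuristics, so it should not be read as a measured per-node time.

```python
# Figures quoted in this post for the CPLEX 12.5 run.
solve_time_seconds = 55_500
nodes_explored = 7_133

# Rough wall-clock average; it lumps together root solve, cuts and heuristics.
seconds_per_node = solve_time_seconds / nodes_explored
solve_time_hours = solve_time_seconds / 3600

print(f"total solve time: {solve_time_hours:.1f} hours")
print(f"average wall-clock time per node: {seconds_per_node:.1f} seconds")
```

An average of several seconds per node suggests that the difficulty lies less in the size of the search tree than in the work done at each node.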

Why is this problem difficult?

There are many reasons, among them:

Lots of inter-related decisions: facility location and configuration are linked, and it is difficult to decompose the problem into independent parts;

Problem size: lots of products, facilities & customers;

Multi-period irregular structure: decision variables and coefficients are not the same in every period.