I am a full-time consultant and provide services related to the design, implementation and deployment of mathematical programming, optimization and data-science applications. I also teach courses and workshops. Usually I cannot blog about projects I am doing, but there are many technical notes I'd like to share. Not in the least so I have an easy way to search and find them again myself. You can reach me at erwin@amsterdamoptimization.com.

Monday, May 30, 2016

There are quite a few models that use the ‘all-different’ constraint, i.e. a set of integer variables \(x_i\), \(i=\{1,...,n\}\) is feasible only if they are all different, i.e. \(x_i\ne x_j\) for \(i\ne j\). We can probably say that most of these models are of the “educational” type and are not practical production models.

Constraint programming solvers have typically a built-in “all-different” global constraint. This makes it easy to write down such a constraint and the solver will have knowledge about the constraint which it can exploit. That means we can expect better performance than for instance a bunch of pairwise not-equal constraints. So the first observation is: if you have a model that is largely built around an all-different constraint, consider to implement the model using a constraint programming solver.

Approach 1

One way to implement this construct for use in a MIP model is to use pairwise comparison: \(x_i\ne x_j\) for \(i\ne j\). In a MIP we need binary variables for this:

Note that we only need to compare \(x_i\) and \(x_j\) if \(i<j\). This means we need \(\frac{n(n-1)}{2}\) binary variables. It is important to choose good values for \(M\) so let’s work on that for a minute (see here for an example where things go wrong if we don’t pay attention to this). Note that instead of using a single \(M\) we use different values \(M^{(1)}_{i,j}\) and \(M^{(2)}_{i,j}\).

The value of \(M^{(1)}_{i,j}\) should be chosen as small as possible subject to \(x_j-1+M^{(1)}_{i,j}\ge x^{up}_i\). This means \(M^{(1)}_{i,j}\ge x^{up}_i +1 – x_j\). This yields the following optimal value: \(M^{(1)}_{i,j} = x^{up}_i +1 – x^{lo}_j\). Similarly we want to choose the smallest value \(M^{(2)}_{i,j}\) such that \(x_j+1-M^{(2)}_{i,j} \le x^{lo}_i\). This gives us: \(M^{(2)}_{i,j} = x^{up}_j +1 - x^{lo}_i\).

Approach 2

Here we consider the special case where each \(x_i \in \{1,…,n\}\) where \(i=\{1,..,n\}\). Now we can write:

The matrix \(\delta\) can be looked at as a permutation matrix. This permutation matrix \(\delta\) is a row- and column-permuted identity matrix, i.e. it has a single entry equal to one in each row and in each column. Another way of looking at the first two equations is as an assignment problem. Note that we have \(n^2\) binary variables here, but no big-M’s.

We can make one extra simplification: we can drop the first assignment constraint. The resulting equations are:

This simplification in only possible in this special case. We see below that some minor extensions make this simplification fail.

Extension 1

The second approach was tailored to a very special case. Instead of \(x_i \in \{1,…,n\}\) where \(i=\{1,..,n\}\) now consider a slightly more general problem, where \(x_i \in \{a_1,…,a_n\}\) with \(a_{i+1} \gt a_i\) (that is: no duplicates in the set). This problem is easily handled by a simple extension to model II:

would allow solutions like \(x=(4,5,0,0)\) where we actually see both duplicates and inadmissible values.

Extension 2

We can allow explicit duplicates by saying: \(x_i \in \{a_1,…,a_n\}\) with \(a_{i+1} \ge a_i\). E.g. the data array \(a=\{1,2,2,3,3,3\}\) would allow 1 one, 2 twos, and 3 threes. This can be handled directly by model IIa.

Extension 3

Allow a subset. I.e. consider \(x_j \in \{a_1,…,a_n\}\) where \(j=\{1,..,m\}\) and \(m<n\). For example choose two different values \(x_j\) from the set \( \{1,2,3\}\). To model this let’s use \(i\) as the index of \(a_i\). So we have \(j\) being a subset of \(i=\{1,…,n\}\). We need to revise model II as follows:

Saturday, May 28, 2016

Imagine a village with people trading goods. Each person has his own offer in this format: Giving amount a of good b for amount c of good d.

I have made a simple table to point out what I am trying to explain:

In this case there are three different goods: Wood, Sand and Gras.

I also live in the village and noticed that the prices of the traders vary greatly. I have 1 wood and want to increase it by simply trading between the five dealers. But: I must not visit a dealer more than once.

There would be different routes, for example I could do Dave - Adam, which would result in +1 wood for me. A better route would be Earl - Berta - Adam, because it would mean +2 wood.

It is possible to model this with a linear Mixed Integer Programming (MIP) model. I tried this quickly. The optimal solution is:

---- 65 VARIABLE x.L trade

t1 t2 t3 t4

Adam 1Berta 1Carl 1Dave 1

---- 65 VARIABLE y.L number of trades (only integer multiples)

t1 t2 t3 t4

Adam 4Berta 4Carl 4Dave 1

---- 65 VARIABLE inv.L current inventory

t1 t2 t3 t4 t5

wood 4 4sand 4 8gras 4

I.e. we visit Dave,Carl,Berta and Adam (in that sequence). The final inventory is 4 pieces of wood. In the fifth period we don't do anything.

The complete MIP model is below. It is built around an assignment problem structure, which makes sure we visit at most one dealer each period, and that we do not revisit dealers.

Extensions:

The data for a trades can be fractional (1.3 grass for 2.4 wood).

The multiples of a trade are in the model restricted to integers. We can easily allow any trade size > 0 by making y a continuous variable. We may need to change equation xy2 to \(y_{p,t} \ge 0.01 x_{p,t}\).

We can easily apply limits on each trade, in the form of a capacity constraint. We already applied a fixed upper bound of 100 on each trade. This can be made part of the input instead (and can be made different for each trade).

I want to import the data for a three-dimensional parameter p(i,j,k) that is stored in in k excel sheets but GAMS does not let me use dollar control statements in loops. Is there any way to do that using loops or other flow control statements like 'for' or 'while'?

Friday, May 27, 2016

a large number of meta-heuristics based loosely on Nature are described. A list is: Simulated Annealing, Genetic Algorithms, Differential Evolution, Particle Swam Optimization, Firefly Algorithms, Cuckoo Search, Bat Algorithms, Flower Pollination Algorithms, But there are actually more to be found:

Monday, May 23, 2016

A major issue in MIP modeling is choosing good values for big-M constants. The poorly chosen name 'big-M' indicates we should use a really big value which is exactly the opposite of what we should do, Here we see an extreme example:

It is my conjecture that just because of this name 'big-M' we have a lot of models ill-behaving as a result of bad numerics. If textbooks just would call this 'small-m' instead, new modelers would not have the urge to use these ridiculously large numbers.

As indicated in the comments some solvers support indicator constraints (Cplex, as well as Scip and Xpress I believe) which allow you to formulate implications without big-M constants.

Thursday, May 19, 2016

A somewhat strange scheduling model was presented to me. We operate a machine in one of several operating modes \(i\). We have time periods \(t\) and the operating cost \(c^{op}_{i,t}\) changes over time. Obviously, the best schedule would be to pick in each period \(t\) the cheapest operating mode. Now we add a changeover cost: when we change from mode \(i\) to mode \(j\) we pay some cost \(c^{ch}_{i,j}\). This would require a real optimization model to find the optimal operating sequence.

We see one changeover between periods 4 and 5. Notice that \(y\) is defined in the model above as:\[y_{i,j,t} = \begin{cases} 1 \> \text{if $x_{i,t}=1$ and $x_{i,t+1}=1$} \\ 0 \> \text{otherwise}\end{cases}\]

This can also be interpreted as a nonlinear constraint \(y_{i,j,t} = x_{i,t}x_{i,t+1}\). In the model we have linearized this. A refinement of the model would have us look at the operating mode just before we start scheduling (i.e. the operating mode in period 0). To handle that easily it is helpful to change the definition of \(y\) to:\[y_{i,j,t} = \begin{cases} 1 \> \text{if $x_{i,t-1}=1$ and $x_{i,t}=1$} \\ 0 \> \text{otherwise}\end{cases}\]

Only equation order needs to change:

Note that \(x0\) is an extremely sparse matrix: it has only one element corresponding to the operating mode for the machine just before period 1. (We used here a GAMS feature: addressing lags outside the domain cause the symbol to be dropped; so \(x_{i,t-1}\) for \(t=period1\) will disappear; in that special case the parameter \(x0\) will kick in). The final results with this initial changeover cost is:

---- 37 VARIABLE x.L operating schedule

period1 period2 period3 period4 period5 period6 period7

mode1 1 1 1mode3 1mode4 1 1 1

---- 37 VARIABLE y.L successor

period4 period5

mode3.mode1 1mode4.mode3 1

---- 37 VARIABLE cost.L = 2.438

Note that the initial mode in period 0 was 4, and now we keep operating using that mode for a little while.

Wednesday, May 18, 2016

An easier method to find all corner points for a system of linear inequalities is to encode the basis using binary variables and then use the Cplex solution pool. Here we used a loop where we added binary cuts to prevent an earlier found basis to be revisited again. A complete description of the problem can be found in that post also. With the Cplex solution pool we just use one solve and we don’t have to bother with specifying the cuts. In addition we expect to see much of a performance improvement. So how does performance stack up with the loop?

As we can see the performance of the solution pool is superior compared to the loop implementation. Note: it will give the same solution (although in a different order).

set b 'basis status'/NBLO'nonbasic at lower bound'NBUP'nonbasic at upper bound'B'basic'/;binaryvariablesbs(k,b) 'basis status of each row and column';

equationsonebs(k)'only basis status is current'countb'number of basics = m'vlower(j)'variable is nb at lower'rlower(i)'row is nb at lower'vupper(j)'variable is nb at upper'rupper(i)'row is nb at upper';

In this post (a followup to this one), I want to find all corner points related to these inequalities:\[\boxed{\begin{align} &3 \le 5x_1+3x_2+3x_3+5x_4 \le 5 \\ &0 \le x_2+x_3 \le 1 \\ &0 \le x_1+x_4 \le 1 \\ &1 \le 3x_3+5x_4 \le 3 \\ &0 \le 5x_1+3x_2 \le 2 \\ &0 \le x_1,x_2,x_3,x_4 \le 1 \end{align}}\]The approach I follow is based on this post. That is, we use binary variables to explicitly encode the LP basis, and then solve a series of MIP models where we add cuts to the model to exclude the currently found basis. There are two main differences with the model shown in the original post:

The current problem has no objective. This is not a major hurdle: we can add a dummy objective. The algorithm becomes actually slightly easier as we don't have to check if the objective starts to deteriorate.

The current problem has bounded variables. In the original problem we assumed we have positive variables without an upper bound. That meant we could use a single binary variable to encode the basis status: a variable is basic or non-basic. In the current case we could do the same thing and implement the bounds as explicit constraints. However, instead I choose for a slightly more sophisticated approach: we allow the basis status to assume three different values: basic, non-basic at lower bound and non-basic at upper bound. This is exactly how Simplex based linear programming solvers deal with bounded variables.

Positive variables/slacks

Let’s recapitulate what we did in the case of positive variables. When we have positive variables (or slacks) \(x_k\ge 0\), we can use the following basis encoding:\[\beta_k = \begin{cases} 1 \> \text{if \(x_k\) is basic}\\ 0 \> \text{if \(x_k\) is non-basic} \end{cases}\]If a variable is non-basic it should be equal to zero. This is implemented as\[0 \le x_k \le M \beta_k\]In addition we know that we have \(m\) (number of LP rows) basics:\[\sum_k \beta_k = m\]

Solution

After running the model we have a large number of feasible bases, 767 to be precise. Some partial results are listed below:

The top part are the values of the columns and rows. The bottom part shows the basis status. We see we get some duplicate values (they have a different basis, but the same values; this is called degeneracy in linear programming).

We can easily add some code to the model to find the unique values and filter out the duplicates:

There are only 20 unique corner point solutions.

We had to run 768 models for that (the last model was infeasible so we stopped), and every model became larger by one constraint. That also means models are becoming more expensive to solve. This is illustrated in this graph: both the number of Simplex Iterations needed to solve each model (orange line) and the solution times (blue line) show an increase over time.

set b 'basis status'/NBLO'nonbasic at lower bound'NBUP'nonbasic at upper bound'B'basic'/;binaryvariablesbs(k,b) 'basis status of each row and column';

equationsonebs(k)'only basis status is current'countb'number of basics = m'vlower(j)'variable is nb at lower'rlower(i)'row is nb at lower'vupper(j)'variable is nb at upper'rupper(i)'row is nb at upper';