I am a full-time consultant and provide services related to the design, implementation and deployment of mathematical programming, optimization and data-science applications. I also teach courses and workshops. Usually I cannot blog about projects I am doing, but there are many technical notes I'd like to share. Not in the least so I have an easy way to search and find them again myself. You can reach me at erwin@amsterdamoptimization.com.

The variable \(open_t\in \{0,1\}\) indicates we have an open facility, ready to do business. A facility is considered open after it is built (we don’t close facilities during the planning horizon). To be more precise: it becomes available one period after it has been built.

alternative 3

With this, we now explicitly forbid \(build_{t-1}+open_{t-1}=2\) which was not allowed (implicitly) anyway.

The last formulation seems preferable in terms of simplicity and the number of non-zero elements needed.

As usual with lags we need to be careful what happens near the borders, in this case when \(t=t1\). We just can fix \(open_{t1}=0\). When using GAMS you can even skip that, as indexing with lags outside the domain returns zero. Here that means \(open_{t1} = build_{t1-1} + open_{t1-1} = 0 + 0\). We can also verify this in the equation listing:

Wednesday, December 28, 2016

In (1) a MIP model is proposed so solve the KenKen puzzle. During a discussion, the question came up if I could prove the uniqueness of a solution. In the Mixed Integer Programming model I used a standard formulation for a solution:

A general approach could be to use the technique described in (2): add a cut to forbid the current solution and solve again. If this second solve is infeasible we have established that the original solution was unique.

In this case we can use a more specialized cut that is simpler:

\[\sum_{i,j,k} x^*_{i,j,k} x_{i,j,k} \le n^2-1\]

where \(x^*\) is the previous solution and \(n \times n\) is the size of the puzzle.

To test this with the model and problem data shown in (1) I used:

Note that \(\displaystyle\sum_{i,j,k|x^*_{i,j,k}=1} x_{i,j,k}\) is identical to \(\displaystyle\sum_{i,j,k} x^*_{i,j,k} x_{i,j,k}\). To make sure things work correctly with solution values like 0.9999, I actually used a somewhat generous tolerance: \(\displaystyle\sum_{i,j,k|x^*_{i,j,k}>0.5} x_{i,j,k}\).

Indeed the solution from the first solve was unique. The second solve yielded:

Tuesday, December 20, 2016

In (1) the performance of the SAS/OR MIP solver is discussed. It uses the MIPLIB problem set. Timings from different studies are always difficult to compare, but if we take the 214 or 218 feasible problems, and measure how many of those problems could be solved within in 2 hours we see:

Solver

Number solved within 2 hours (12 threads)

Source

CBC

119 of 218

(3)

SAS/OR

170 of 214

(1)

Cplex

204 of 218

(3)

Gurobi

210 of 218

(3)

I think it is fair to guess that the SAS solver is faster than CBC but a little bit slower than the market leaders Cplex and Gurobi. Again: I don’t know the difference in speed between the machines these benchmarks were executed on, so take this with a grain of salt.

I have quite a few clients with large SAS licenses, but they typically do not use SAS/OR. As a result I have never used the SAS/OR solvers. I usually use SAS as a database. My typical workflow is: export the SAS data, combine with other data, develop and solve the models with other tools and report solutions in Excel.

Sunday, December 18, 2016

MonetDB (1) is a so-called column-oriented database or column-store (2). It is server based, SQL compliant and open-source. The column orientation can lead to better performance for certain types of tasks, especially OLAP and ETL (i.e. analytical work). Traditional row-wise databases are said to be more appropriate for OLTP workloads.

MonetDBLite

There exists a CRAN package to let R talk to a MonetDB server (MonetDB.R). There is also a different package called MonetDBLite. This contains an in-process version of MonetDB. This means MonetDBLite is more or less an alternative for RSQLite (4). A picture comparing the performance of MonetDBLite is from (4):

Basically the more towards the left-lower corner the better.

In (3) there are some timings comparing MonetDBLite to SQLite. E.g. converting a (large) table to a data frame:

There is lots of data traffic from the database to R and the server based MonetDB.R does not finish the larger tables (traffic has to go through a TCP/IP socket). But the embedded database is very fast compared to SQLite.

Saturday, December 17, 2016

This is a well known meta-heuristic (with a somewhat less flashy name than some of the new heuristics that get invented every week). Apparently this is the first book fully dedicated to GRASP. GRASP was introduced in 1989 by Feo and Resende (2).

Chapters:

Introduction

A short tour of combinatorial optimization and computational complexity

Tuesday, December 13, 2016

For a project I was looking at a simple two stage stochastic integer programming problem (1).

I find it always very rewarding to reproduce the results in the paper. Re-implementing the model forces me to not skip the details. Compared to superficial reading it requires to pay attention to all aspects. Of course in many cases journal papers do not allow to reproduce results because crucial details or data are left out. Presenting a stylized model illustrating and illuminating the concepts is only done sporadically, may be because it is actually hard to create such a “simple” model, and it may look less “scientific”.

This EPS means this variable is non-basic with a zero reduced cost. This can indicate the presence of multiple optima. Indeed when using a different MIP solver I see:

---- 145 VARIABLE xe.L stage 2 decisions: emergency options

peak5-6 peak6-8 peak8-10 peak10+

evacuate 1.000 1.000 1.000sandbag 2.000 2.000monitor 20000.000

which is the same as the results in the paper.

Digging a little bit further, we can see that the damage reduction by implementing emergency measure “monitor” is the same as its cost. This means we are indifferent whether to apply this option in this scenario.

Saturday, December 10, 2016

This does not happen a lot, but in some cases we want to sort decision variables. This turns out not such an easy exercise.

When not to use sorting

If we deal with just the \(\min()\) or \(\max()\) function we can do things better than sorting. An important trick is to exploit any so-called convexity so we don’t need to add extra binary variables. Even if the problem is not convex, there is no need to sort things.

Interestingly, minimizing the sum of the K largest values also does not need sorting. See (4).

Sorting a parameter

At first sight this is not very useful exercise: we can sort a parameter – constants in the model – outside of the model. However the problem demonstrates some of the concepts I will use later on. Assume \(a_j\) is our given parameter. A permutation \(y\) of \(a\) can be written as \(y=Pa\) where \(P\) is a permutation matrix (1). A permutation matrix is an identity matrix \(I\) with rows and columns interchanged. Such a matrix has exactly a 1 in each row and in each column. A different way to look at this is as an assignment problem with binary variables \(p_{i,j} \in \{0,1\}\).

Here indices \(i\) and \(j\) are from sets with the same cardinality (\(n\)), i.e. \(p_{i,j}\) is a square matrix. Note that we use \(n^2\) binary variables here, so don’t expect this to work for very large vectors \(a_j\) (2). The same structure can also be used in the context of all-different constraints (3).

Example results

---- 32 PARAMETER a

j1 2, j2 4, j3 1, j4 3

---- 32 VARIABLE p.L

j1 j2 j3 j4

i1 1i2 1i3 1i4 1

---- 32 VARIABLE y.L

i1 4, i2 3, i3 2, i4 1

Sorting a variable

Instead of sorting a parameter \(a_j\), consider the problem of sorting a variable \(x_j\). We cannot just replace \(a_j\) by \(x_j\) as

\[y_i = \sum_j p_{i,j} x_j \]

is non-linear. There is a way to linearize the product \(q_{i,j} = p_{i,j} x_j\) as this is a multiplication of a binary variable with a continuous variable. Assuming \(x_j \in [0,U_j]\), we have:

Example

I'm currently stuck with a MIP program where the interest rate, i, is based on the number of units produced for Housing Plan A. If the number of plan A houses sold is the highest among all four types then i=1. If the number of plan A houses sold is the second highest, then i=2 and so on up to i=4. The interest rate is basically 2i%. Not really sure how to add constraints that will represent the position of plan A houses and implement the correct interest rate in the objective function. The objective function maximizes the total profit (e.g 50,000A + 40,000B + 70,000C + 80,000D). Any ideas on how to use binary variables to represent position?

Friday, December 2, 2016

The new version of Cplex (12.7) has some interesting new facilities to make it easier to use a Benders Decomposition approach (1,3). Paul Rubin shares some experiences with this in (2).

The original paper by Jacques Benders can be found in (4). These old papers are always an interesting read. The style and notation is often very different from how papers look like these days, although in this case the notation is quite modern.

Thursday, December 1, 2016

My wife teaches AM and PM kindergarten classes. AM has 14 students and PM 11. At the beginning of each month, she puts out a new seating chart where she rotates students in such a way that they (ideally) sit at a different table and with different students for that month.

There are 3 students per table, but if numbers force the issue, the last may have more or less. We realize that, by the end of the year, there will be some unavoidable situations where students end up sitting with each other or at the same tables again, and this is okay.

The first set of equations are somewhat straightforward. Each student sits at exactly one table:

\[\sum_t x_{s,t,m} = 1 \>\>\forall s,m\]

We cannot exceed the capacity of a table:

\[\sum_s x_{s,t,m} \le cap_t \>\> \forall t,m\]

In the complete model below I used an equality constraint for this: there is no slack capacity. Secondly we want to keep track of how many times students sit at the same table. We introduce binary variables \(meet_{s1,s2,t,m}\in\{0,1\}\) indicating if two students \(s1\) and \(s2\) sit at the same table. The obvious nonlinear equation would be:

As we push variable \(meet\) down, we can drop the first two inequalities and just keep the last. Note that variable \(meet\) is now a bound instead of the product \(x_{s1,t,m} \cdot x_{s2,t,m}\). When reporting we should use the underlying variables \(x\).

A count of the number of times students \(s1,s2\) sit at the same table is a simple aggregation:

\[meetcount_{s1,s2} = \sum_{t,m} meet_{s1,s2,t,m}\]

An obvious objective is to minimize the expression \(\max meetcount_{s1,s2}\). This is easily linearized:

Model

To test this we use 14 students, 11 months (I assume one month vacation period). The capacities of the 5 tables is 3 except for one table with 2 students (this is to make the capacities sum up to 14).

The equations are exactly the same as discussed above:

To help the solver we fix the table assignment for the first month. In addition I assume student 1 is always sitting at table 1. This makes the model easier to solve. Note that I only fix some variables to 1. As a result a lot of other variables should be zero. E.g. if \(x_{s1,t1,m}=1\) we know that \(x_{s1,t2,m}=0\) and similar for other tables than \(t1\). Instead of fixing all these these myself to zero, I leave it to the MIP presolver to do that for me. This fixing step helps a lot.

We achieve a quick solution with an objective of 2. I.e. two students will sit at most twice at the same table.

Objective

In the above model we minimized the maximum number of times students sit at the same table, The optimal value is 2. However this means that we allow the number of times two students meet (the meet count, “meets” in the tables below) to float between 0 and 2 without a further incentive to decrease the average meet count within this band width. This leads to a distribution as follows:

On average two students sit at the same table with another student 1.57 times. Detail: note that we need to base the calculations on the optimal values of \(x\) instead of \(meetcount\) as \(meetcount\) is using variable \(meet\) which is just a bound on the product \(x_{s1,t,m} \cdot x_{s2,t,m}\).

We can try to bring the average down by minimizing the sum of the meet counts. The results show, this will not actually bring the average down. But the distribution is certainly different. This objective will produce a few student pairs that have a high meet count:

We can try to minimize the number of student pairs meeting twice or more while still pushing down the maximum by using a composite objective:

\[\min maxmeetcount + \frac{count2}{100}\]

where \(count2\) is the number of pairs meeting twice or more. This gives a slightly different distribution:

A depiction of this distribution:

Another interesting objectives:

Use a quadratic cost \(\sum meetcount_{s1,s2}^2\). This will penalize larger meet counts. This objective is somewhere between minimizing the sum and minimizing the max. MIQPs (Mixed Integer Quadratic Programming problems) are often more difficult to solve however compared to linear models.

Try to have students meet each other for the second time only after some months, i.e. separate pairs sitting at the same table twice over time. This is not so simple to model. The constraint to enforce say: at least 2 months between before sitting at the same table again is not so difficult (see below), but really maximizing the space in between is not so easy.

When do we meet again?

When we make a picture of when student pairs sit at the same table we get something like this:

We color coded the cells whether we have 0, 1, 2 or 3 or more months in between (the cells with a ‘3’ indicate that meets are separated by 3 or more months). Note we did not display the pairs that only sit at the same table once.

To forbid the cases 0 and 1 in the above solution we can add the constraints:

I never heard of this before, but here is a Matlab application. Looks like “Brain Storm Optimization” is yet another heuristic. (Not so easy to keep track, every academic seems to have the urge to invent his/her own heuristic these days).