LINQ to Z3 – Theorem Solving on Steroids – Part 1

Introduction

It’s way too long ago I wrote about this side-project of mine, as I got side-tracked by other stuff both inside and outside the realm of LINQ (more about that some other time around). Last time, I showed how to put “the query pattern” to our hand by providing an implementation for the Where operator on top of some Theorem<T> type. Let’s recap that to set the scene:

We explained how the non-generic base type keeps track of a set of constraints that have been declared by the user through use of the where query operator, turned into expression trees by the Where signature’s first parameter. This allows us to define constraints in a manner that’s easy on the eye:

The Symbols generic type is one way to create a set of symbols used within the declared theorem’s constraint (and to print the result if the constraints can be satisfied).

From the above you should be able to get an idea about the LambdaExpression instances that are kept in the Theorem<T>’s base type’s instance field, as collected through the successive calls to Where. Remaining question is how to turn the defined theorem into action, something that’s triggered through the call to Solve. This mystery is what we’ll reveal today.

Three steps to solve a theorem

LINQ to Z3 is in fact nothing more but a runtime translator of LINQ expression trees into Z3 calls. The Solve method lies at the heart of this translation and acts as the entry-point for the user. It’s similar to the nature of GetEnumerator in a typical LINQ provider, causing a translation of the query’s LINQ expression tree into the underlying query language (e.g. SQL, CAML, etc).

Three steps take place to make this work:

First of all, we need to establish an environment with the theorem’s symbols and inform Z3 about those. In other words, we map things like X1 to a Z3 counterpart, capturing the name and the type of the symbol (e.g. “T1”, int). We keep track of the Z3 targets (in fact those will be IntPtr pointers, see further) so that subsequent phases of the translation can refer to those when asserting constraints.

Secondly, constraints expressed through LINQ need to be translated into Z3 constraints, based on the established environment above. We call this phase the assert constraints phase which is implemented as an expression tree visitor pattern. Here the bulk of the implementation of LINQ to Z3 lives.

Finally, Z3 is queried for a model that satisfied the asserted constraints. If such a model exists, it’s still not ready for consumption by the end-user. To expose the solution in a user-friendly way, the symbols from the environment are mapped onto a .NET object that contains the values satisfying the expressed constraints. This phase is referred to as the get solution phase.

All of the above translates into the following implementation of Solve, on the Theorem base class:

Recall from the previous post that _context is a Z3Context object that wraps the Z3 Config type. All we do there is creating a Z3 Context object which is the Z3 primitive to start defining a theorem on. Let’s have a brief look at what Z3’s Context type provides (simplified):

As you can see, the way to communicate with the Z3 library is through IntPtr objects. Recall the managed Z3 library is a simple wrapper on top of the native library and the “handle” mechanism to represent Z3 objects is surfaced through IntPtr objects (corresponding to pointers ultimately). Various Mk* functions exist to build up expressions that are used to represent constraints. For example, one could write:

This asserts the constraint “a && b” (in C# lingo), where a and b are Boolean values, to the Z3 context. Ultimately we can call CheckAndGetModel to see whether the constraints can be satisfied, and if so a Model is passed back to us:

Either we print the whole model or we evaluate expressions against the Model (amongst other options). Below is a sample of how a and b can be retrieved. Notice how cA and cB are used again to query the model:

To understand the above, start from the method’s contract: given a Z3 context, we’re going to return a dictionary that maps PropertyInfo objects onto IntPtr objects representing the symbols introduced on the Z3 context. The code is fairly straightforward to understand and only cares about properties (though one could extend it to allow fields as well). Ignore the TheoremVariableTypeMapping part, which can be dropped from the code for the time being, which will be explained in subsequent posts. Notice we only support Z3’s Bool and Int types.

This was easy, wasn’t it? Next, the meat of the translator: translating and asserting constraints.

Asserting constraints

Asserting the constraints from the LINQ expression trees onto Z3 constraints is the next, and most complicated, step. The entry-point to the method responsible for this looks as follows:

I’m not holding any information back, so some of the above can be ignored for the time being: mask away all of the code with the word rewriter in it, as it will be explained next time around. (To satisfy your curiosity, rewriters allow to create domain-specific constraint languages, e.g. to solve a Kakuro puzzle.)

The essence of the code lives in the foreach-loop that translates the lambda expression-based constraints by a call to the Visit method, returning a Z3 expression for the constraint which is subsequently asserted on the Z3 context. Obviously Visit is where all the magic happens. To do so, it gets four parameters:

The Z3 context, to call Mk* functions on during the translation.

Our environment, to be able to map occurrences of symbols (expressed as properties) onto the introduced Z3 corresponding objects.

And finally, the lambda expression’s parameter expression object (so that calls on it can be recognized, e.g. symbols => … symbols.X1 …).

On to the visitor now. Before we do so, let’s think a bit about the operations we want to support on constraints: arithmetic (+, -, *, /, %), logic (&&, ||, !) and relational (<, <=, >, >=, ==, !=). There’ll be a few more as we shall see in a minute. As usual with expression trees, lots of switch-statements will be attending the party:

Notice the above is made as declarative as possible: most expression tree node types are mapped on either unary or binary specialized visitor methods, passing in a lambda expression that acts as the “Z3 constructor”. For example, And (for & in C#) uses a “(ctx, a, b) => ctx.MkAnd(a, b)” argument. The VisitBinary method will be responsible to recursively translate the left-hand side and right-hand side of the And operation (e.g. x & y, where x and y can be arbitrarily complex BLOCKED EXPRESSION, ultimately feeding the resulting Z3-returned IntPtr objects into the supplied lambda expression.

Next to the VisitBinary and VisitUnary calls, there are specialized calls for the MemberAccess, Constant and Call expression tree nodes. We’ll defer the discussion of those for a moment and focus on binary and unary calls first:

More documentation than anything else :-). Those methods simply visit the expression’s operands (e.g. Left, Right for a binary expression, Operand for a unary one) recursively. The obtained results, of type IntPtr, are fed into the “ctor” delegate parameter to obtain the final result. Trivial stuff.

Now the more complicated visitor methods for members, constants and method calls. First, constants:

We’ve captured the lambda expression parameter “t” in the AssertConstraints method and carry it on throughout the visitor so we can check it’s used as the left-hand side inside constraint expressions.

Note: This restriction limits expressiveness quite a bit. For example, capturing an outer variable gives rise to a closure to be created, which is accessed through field accesses (of type MemberExpression). Such potentially valid uses get rejected by the code above. To restore such functionality, one can employ partial compilation using LambdaExpression.Compile, after checking the expression doesn’t contain the lambda parameter. The essence is shown below:

if (member.Expression != param)
{
if (!IsLambdaParameterFree(member.Expression, param))
throw new NotSupportedException(“Unsupported member expression using constraint parameter.”);

We leave the implementation of such techniques to the reader, together with the task to hook this logic up in al the appropriate places (default case of Visit, in VisitCall further on, etc).

VisitMember finally consults the environment to map the property access back to the Z3 object generated in the GetEnvironment call.

Finally, we have a VisitCall method as well. Once more, I’ve kept some advanced rewriter stuff in here (to be explained in a subsequent post), so ignore that for the time being. The reason we implement VisitCall is to provide a means to expose advanced Z3 constructs efficiently, e.g. the Distinct operator. To do so, a Z3Methods class is provided:

This is very similar to the SqlMethods class of LINQ to SQL, and available for future extensions. VisitCall has the simple task to recognize calls to Z3Methods methods, turning them into the corresponding Z3 primitives.

In particular, a call to Distinct is processed by inspecting its argument corresponding to the params-array, recursively translating any of its elements (e.g. Z3Methods.Distinct(1, 2, s.X1, s.X2) will give rise to two constant expressions and two member accesses, but more complex expressions can occur as the arguments as well…). Ultimately, MkDistinct is called and all is happy.

Now that we’re done investigating the visitor, we can have a look at the final stage of the LINQ to Z3-driven theorem solving, the get solution phase.

Getting the solution

To expose the solution to the user, we new up a Plain Old CLR Object (POCO) of the theorem’s generic parameter type T. All there’s left to do here is to use some reflection voodoo to instantiate T, populate its properties somehow, and return the object. Different situations can occur, based on the way the type is defined. In particular, use of an anonymous type gives some problems as there’s no formal correlation between the generated constructor and the type’s properties (which are get-only, so we can’t assign values directly). This isn’t super-exciting stuff, but nonetheless:

Here we’ve created a domain-specific theorem type, Sudoku, that’s responsible to provide parts of the constraints. In fact, a Sudoku has much more constraints than the ones specified by the user: rows, columns and blocks need to contain unique values from 1-9, and every cell has to be within that range. For the innocent-looking Sudoku above, we get the following set of constraints:

Only the last nine lines were specified by the user through the LINQ query expression. In fact, the reader should be able to create a Sudoku class already based on today’s literature (tip: work your way down starting from the signature of Sudoku.Create you can infer from the sample above; reuse Theorem<T> of course).

More complex puzzles exist though, for example a KenKen puzzle:

Besides constraints on rows and columns and cells, similar to a Sudoku’s (with regards to the use of digits 1 through 4), there are more domain-specific constraints: all mathematical operators are commutative. For example, 1- in the two-cell region means that Cell12 – Cell22 or Cell22 – Cell12 equals to 1. For bigger regions all permutations of cells are considered. In other words, we need different semantics on operators. For example:

Again the last set of constraints are a result of the LINQ query expression (as a side-note, different Z3 techniques are available to tackle problems, something outside the scope of this discussion just yet). In order to generate those constraints, the domain-specific Add, Sub, Mul and Div methods need to be translated into Z3 primitives, something that will be done by implementing a rewriter type (cf. the pieces of code I asked the reader to skim over for the time being). We could go one step further and allow the user to specify the puzzle by means of a visual textual DSL:

We don’t want the user to express things like Cell22 + Cell32 + Cell42 == 23, but instead want to use all the information captured in the puzzle to allow constraints like Cell12.VerticalSum == 23. As we’ll see, this requires other rewriting mechanisms to allow for a good balance of responsibilities between:

Again the red parts reflect what the user expressed, while the remainder is all generated by the domain-specific Kakuro implementation.

Conclusion

Creating a simple LINQ to Z3 implementation isn’t too hard and involves just a little bit of plumbing in the bathroom of reflection and some use of expression tree visitor patterns. In future posts, we’ll have a look at domain-specific theorem solving techniques based on declarative expression tree rewriters. Enjoy!

Comments

YES ive been waiting so long for this :O z3 is like lighning in a bottle but the bottle cap is screwed on way tight :) i tried making a linq-to-z2 wrapper but my expression tree skills where far to weak

i'd like to know how long the big ones took (Sudoku, KenKen, Kakuro). I know that sudokus can be solved by specialized backtracking solvers in milliseconds. Would be nice to see by what factor Z3 is slower.