Tuesday, July 15, 2014

Fun and Games with CRV: The Zebra Puzzle

The Zebra Puzzle is a classic logic puzzle, first published by Life International in 1962. Older versions of it exist and it is also sometimes called Einstein's puzzle. According to "Of Camels and Committees" by Tom Fitzpatrick and Dave Rich, it was used in the early days of constrained random verification (CRV) to show off EDA tools at marketing events.

The puzzle consists of a set of fifteen statements:

There are five houses.

The Englishman lives in the red house.

The Spaniard owns the dog.

Coffee is drunk in the green house.

The Ukrainian drinks tea.

The green house is immediately to the right of the ivory house.

The Old Gold smoker owns snails.

Kools are smoked in the yellow house.

Milk is drunk in the middle house.

The Norwegian lives in the first house.

The man who smokes Chesterfields lives in the house next to the man with the fox.

Kools are smoked in the house next to the house where the horse is kept.

The Lucky Strike smoker drinks orange juice.

The Japanese smokes Parliaments.

The Norwegian lives next to the blue house.

(Life International, 1962)

Based on these statements, the reader is asked to deduce who drinks water and who owns the zebra.

Deductive reasoning makes my head hurt, so I want to have the computer solve it for me. As with Sudoku, we don't want to build an algorithm that finds the solution. This is exactly the kind of problem constraint programming is perfect for: expressing logical relations between variables. Let's dive right in!

The first step is to formalize the problem. We know that the houses are of different colors, contain people of specific nationalities that drink some things, smoke some cigarette brands and own some pets, again all different. We will model all of these as enumerations.

The second step is to write the starting statements as constraints. We can't know on which instance of the house to apply a specific statement; this is what we want to find out. The statements must apply to all houses (with the exception of a few that apply to a specific house), which means that for the most part we will write foreach constraints.

Statement 1 just says that there are five houses. This isn't a constraint per se, but more of a modeling topic. We model this by declaring our neighborhood as a vector of houses that contains five elements:

Statement 6 is a bit tricky. It tells us that the green house is immediately to the right of the ivory house. This means that we have to take care in our foreach constraint that we don't end up outside the neighborhood (i.e. access a non-existing location in our vector):

constraintstatement6_c{foreach(house[i])if(i<4)// make sure we don't go out of boundshouse[i].color==IVORY->house[i+1].color==GREEN;}

Moving on, statement 7 informs us that the Old Gold smoker owns snails:

Statement 10 also only applies to only one house, the first one, and tells us that the Norwegian lives there:

constraintstatement10_c{house[0].nationality==NORWEGIAN;// no. 0 is the first house}

Statement 11 becomes tricky again. From it we know that the Chesterfield smoker lives next to the fox owner. "Next to" means either to the left or to the right. As with statement 6, we have to make sure we don't fall off of the neighborhood map. We have three cases to consider. If he lives in the first house (the leftmost) then this means his neighbor to the right owns the fox. Likewise, if he lives in the last house (the rightmost), this means that his neighbor to the left owns the fox. However, if he lives in one of the middle houses, then either one of his neighbors could potentially own the fox. Expressed as constraints, this would look like this:

Statement 15 is again more involved, finally telling us that the Norwegian lives next to the blue house. We could type a bit less here (because we know the Norwegian lives in the first house) and directly say that the second house is blue, but I want the constraint solver to work hard for its money. Let's just write the constraint as we did for statements 11 and 12:

Ideally, the third step should be to just run our code and find the solution, but as always nothing works right on the first try. I kept getting some very weird results at first, that were completely off. The reason for this is that we used implication constraints. Saying that being English implies you live in the red house is not enough. We don't know the order in which fields are assigned by the solver, so we also need to say that if you live in the red house, then you are English. Luckily, SystemVerilog provides us the equivalence operator <->. Here's how the constraint for statement 2 should really look like:

Even with the equivalence constraints in place, I got three Japanese and two Norwegians in my neighborhood. I looked at each of the constraints and they all applied (though some vacuously). The last one, however (the one I chose to write long), didn't, but only because the compiler had no problem with me saying that the Norwegian lives next to the house with the color fox (you won't see it in the code I posted as I already fixed that). Surprisingly, there was no error, no warning, no nothing, even though such a conversion is illegal. Let this be a lesson to us for the future: some simulators are a bit more relaxed with their interpretation of the SystemVerilog standard and we need to make sure to enforce strict LRM compliance, otherwise we're going to have a bad time debugging implicit type conversions.

Even after fixing the foxy house, I still got a neighborhood full of Norwegians with all blue houses. Three of them smoked Chesterfields, while the other two liked Old Gold (and had snails as well, of course). Most drank milk and only one drank water. Also, the middle guy had the zebra. You get the idea, something was still missing. Well, the wording of the puzzle is important. It always said "the Norwegian..." or "the Lucky Strike smoker..." which means there is only one Norwegian and only one Lucky Strike smoker. The original article in Life International adds this fact as a clarification: "[It] must be added that each of the five houses is painted a different color, and their inhabitants are of different national extractions, own different pets, drink different beverages and smoke different brands of American cigarets [sic]."

To reflect this, we just need to add a uniqueness constraint. Unfortunately, because constraint operands have to be of integral types, we can't just say that the houses are unique. We have to declare auxiliary lists for nationalities, pets, etc. and constrain these to be unique. In e we could have accessed these directly, without the need for extra fields, but here is how it looks like in SystemVerilog:

Even with the uniqueness constraint in place, we still get a contradiction error. This one took me a bit to figure out. The problem is with our "complicated" constraints (the ones for statements 11, 12 and 15). Here is how the constraint for statement 11 looks after swapping implication for equivalence, in particular the constraint for the middle houses:

The mistake is with respect to the relation between the equivalence and logical operators. Let's take a step back and let me explain what I mean by that. For the case of implication, the following is true:

a -> (b || c) == (a -> b) || (a -> c)

The same is however not true for equivalence:

a <-> (b || c) != (a <-> b) || (a <-> c)

For all of you math enthusiasts, this means that equivalence is not distributive over disjunction. If you write the truth tables you will see that this is the case. What we're interested in is the right hand side of the expression (the neighbor on the left smokes Chesterfields or the neighbor on the right smokes Chesterfields), but what we have implemented is the expression on the left hand side. Now that we know this, we can fix the constraints (shown for statement 11):

Technically, more correct would have been to say A <-> B xor A <-> C, because the neighbor can only be either on the left or on the right. This might speed things up a bit and might also make it so that uniqueness constraints are not needed. Unfortunately, we can't try this out as there is no logical exclusive "or" operator in SystemVerilog.

After fixing this last thing, everything now works! We find out that the Norwegian drinks water and that the Japanese owns the zebra. Much easier than deductive reasoning...

I hope you enjoyed this post. It was a lot of fun to make and I learned a bit more about the constraint capabilities of SystemVerilog. If you want to play with the solver, you can download it from SourceForge.

8 comments:

"^" is not officially a logical operator, it's a binary bitwise operator. I've tried it out just now and it works in constraints, but what we have to be careful with here is the operator precedence. "^" has a higher precedence than either "&&" or "||".

As an example, let's assume A, B and C are expressions and that we have the following constraint: A && B || A && C. This is equivalent to (A && B) || (A && C).

If however we want to swap out the "||" for "^" and we rewrite the constraint as A && B ^ A && C, then this will be equivalent to A && (B ^ A) && C, due to the higher precedence of the "^" operator. This would have to be rewritten as: (A && B) ^ (A && C) to force the order of operations that we want.

Nice example. I did the similar exercise, using SystemVerilog, but in order to have it solved by formal verification (model checking).The code is more concise since we don't care about the pitfalls you mentioned.A commercial model checker is able to find the solution immediately, AND TO PROVE IT IS UNIQUE.I think this shows the superiority of formal methods.

What's the running time for the constrained random approach? How would it extend if the number of houses gets bigger? Probably very badly.

Each constraint should ideally just reduce the state space every time. From what I know, constraint solvers are also at their core built from SAT solvers, the same as formal tools. I wouldn't think that a constraint solver gets less mileage than a formal tool if the number of houses gets bigger. The reason why I could guess a formal tool would be faster in this respect is because it has a better optimized prover (as this is the main thing it does), whereas for a logic simulator there are for sure other things that also need to be optimized (the simulation engine for example) and they won't invest so much time in developing their constraint solver (as long as it's not incredibly slow).

Take what I' said with a grain of salt, as I don't really know the internal implementations of EDA tools.

HI VERY NICE EXPLONATION. BUT THE SAME CODE, I CODED IN VCS. I RANDOMIZE 20 TIMES AND PRINTING EVERY HOUSE RESULT. IDEALLY, WE SHOULD GET THE SAME RESULT. BUT IT IS GIVING DIFFERENT RESULTS, IN WHICH CORRECT RESULT IS ONE OR TWO INSTANCES. WHAT WOULD BE THE PROBLEM. Can you help me DAVE?

About

I am a Verification Engineer at Infineon Technologies, where I get the chance to work with both e and SystemVerilog.
I started the Verification Gentleman blog to store solutions to small (and big) problems I've faced in my day to day work. I want to share them with the community in the hope that they may be useful to someone else.