Lately my kids have been interested in puzzles of this type: You are
given a sequence of four digits, say 1,2,3,4, and your job is to
combine them with ordinary arithmetic operations (+, -, ×, and ÷) in any order to
make a target number, typically 24. For example, with 1,2,3,4, you
can go with $$((1+2)+3)×4 = 24$$ or with $$4×((2×3)×1) = 24.$$

I said I had found an unusually difficult puzzle of this type, which
is to make 2,5,6,6 total to 17. This is rather difficult. (I will
reveal the solution later in this article.) Several people
independently wrote to advise me that it is even more difficult to
make 3,3,8,8 total to 24. They were right; it is amazingly difficult.
After a couple of weeks I finally gave up and asked the computer, and
when I saw the answer I didn't feel bad that I hadn't gotten it
myself. (The solution is here if you want to give up without writing
a program.)

From now on I will abbreviate the two puzzles of the previous
paragraph as «2 5 6 6 ⇒ 17» and «3 3 8 8 ⇒ 24», and others similarly.

The article also inspired a number of people to write their own
solvers and send them to me, and comparing them was interesting. My
solver followed the tree search technique that I described in
chapter 5 of Higher-Order Perl,
and which has become so familiar to me that by now I can implement it
without thinking about it very hard:

Invent a data structure that represents the state of a
possibly-incomplete search. This is just a list of the stuff one
needs to keep track of while searching. (Let's call this a
node.)

Build a function which recognizes when a node represents a
successful search.

Build a function which takes a node, computes all the ways the
search could proceed from that point, and returns a list of nodes
for those slightly-more-advanced searches.

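Initialize a queue with a node representing a search that
has just begun.

Repeatedly remove a node from the queue. If it represents a
successful search, report it; otherwise compute the nodes that can
follow it and add them to the queue. Stop when the queue is empty.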

This is precisely a breadth-first search. To make it into depth-first
search, replace the queue with a stack. To make a heuristically
directed search, replace get_next, the function that takes the next
node off the queue, with a function that looks at the
queue and chooses the best-looking node from which to proceed. Many
other variations are possible, which is the advantage of this
synthetic approach over letting the search arise organically from a
recursive searcher. (Higher-Order Perl says “Recursive functions
naturally perform depth-first searches.” (page 203)) In Python or Ruby
one would be able to use yield and would
not have to manage the queue explicitly, but in this case the queue
management is trivial.
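The recipe and its variations can be sketched in a few lines of Python. This is only an illustration, not the actual Perl: the names search, is_winner, and successors are invented for the example, get_next is passed in as a parameter so that swapping it changes the search order, and the toy problem (three-bit strings containing exactly two 1 bits) has nothing to do with the puzzle.

```python
from collections import deque


def search(start, is_winner, successors, get_next=deque.popleft):
    """Yield every winning node reachable from `start`."""
    queue = deque([start])
    while queue:
        node = get_next(queue)  # popleft: breadth-first; pop: depth-first
        if is_winner(node):
            yield node
        else:
            queue.extend(successors(node))


# Toy problem: binary strings of length 3 that contain exactly two 1s.
def is_winner(node):
    return len(node) == 3 and node.count("1") == 2


def successors(node):
    return [node + bit for bit in "01"] if len(node) < 3 else []


breadth_first = list(search("", is_winner, successors))
depth_first = list(search("", is_winner, successors, get_next=deque.pop))
print(breadth_first)  # ['011', '101', '110']
print(depth_first)    # ['110', '101', '011']
```

Both runs find the same winners; only the order of discovery changes, which is the point of keeping the queue discipline separate from the rest of the search.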

In my solver, each node contains a list of available
expressions, each annotated with its numerical value. Initially, the
expressions are single numbers and the values are the same, say

[ [ "2" => 2 ], [ "3" => 3 ], [ "4" => 4 ], [ "6" => 6 ] ]

Whether you represent expressions as strings or as something more
structured depends on what you need to do with them at the end. If
you just need to print them out, strings are good enough and are easy
to handle.

A node represents a successful search if it contains only a single
expression and if the expression's value is the target sum, say 24:

[ [ "(((6÷2)+3)×4)" => 24 ] ]

From a node, the search should proceed by selecting two of
the expressions, removing them from the node, selecting a legal
operation, combining the two expressions into a single expression, and
inserting the result back into the node. For example, from the
initial node shown above, the search might continue by subtracting the
fourth expression from the second:

[ [ "2" => 2 ], [ "4" => 4 ], [ "(3-6)" => -3 ] ]

or by multiplying the second and the third:

[ [ "2" => 2 ], [ "(3×4)" => 12 ], [ "6" => 6 ] ]

When the program encounters that first node it will construct both of
these, and many others, and put them all into the queue to be
investigated later.
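Here is a minimal Python sketch of that step, assuming a node is a list of (expression string, value) pairs and that values are kept as exact fractions. The function names combine, successors, and is_winner are invented for the illustration; this is not my Perl code.

```python
from fractions import Fraction
from itertools import combinations


def combine(a, b):
    """Yield every (expression, value) pair obtainable from a and b in one step."""
    (ea, va), (eb, vb) = a, b
    yield (f"({ea}+{eb})", va + vb)
    yield (f"({ea}×{eb})", va * vb)
    yield (f"({ea}-{eb})", va - vb)
    yield (f"({eb}-{ea})", vb - va)
    if vb != 0:  # never divide by zero
        yield (f"({ea}÷{eb})", va / vb)
    if va != 0:
        yield (f"({eb}÷{ea})", vb / va)


def successors(node):
    """Return all nodes reachable from `node` by merging two of its expressions."""
    result = []
    for i, j in combinations(range(len(node)), 2):
        rest = [pair for k, pair in enumerate(node) if k not in (i, j)]
        for merged in combine(node[i], node[j]):
            result.append(rest + [merged])
    return result


def is_winner(node, target):
    """A node wins if one expression remains and it has the target value."""
    return len(node) == 1 and node[0][1] == target


start = [(str(n), Fraction(n)) for n in (2, 3, 4, 6)]
children = successors(start)
print(len(children))  # 36: six pairs of expressions times six operations
```

The sketch always appends the new expression at the end of the node, so the order of the entries may differ from the examples above; the search does not care.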

From

[ [ "2" => 2 ], [ "(3×4)" => 12 ], [ "6" => 6 ] ]

the search might proceed by dividing the first expression by the third:

[ [ "(3×4)" => 12 ], [ "(2÷6)" => 1/3 ] ]

Then perhaps by subtracting the first from the second:

[ [ "((2÷6)-(3×4))" => -35/3 ] ]

From here there is no way to proceed, so when this node is removed
from the queue, nothing is added to replace it. Had it been a winner,
it would have been printed out, but since !!-\frac{35}3!! is not the target
value of 24, it is silently discarded.

Solving a puzzle of the «a b c d ⇒ t» sort requires examining a few
thousand nodes. On modern hardware this takes approximately zero
seconds.

The actual code for my solver is a lot of Perl gobbledygook that may
not be of general interest so I will provide a link for people who are
interested in deciphering it. It also represents my second attempt: I
lost the code that I described in the earlier article and had to
rewrite it. It is rather bigger than I would have liked.

Stuff goes wrong

People showed me a lot of programs to solve this, and many didn't
work. There are a few hard cases that several of them get wrong.

Fractions

Some puzzles require that some subexpressions have fractional values.
Many of the programs people showed me used integer arithmetic
(sometimes implicitly and unintentionally) and failed to solve those
puzzles. We can detect this by asking for a solution to «2 5 6 6 ⇒
17», which requires a fraction. The solution is !!6×(2+(5÷6))!!. A
program using integer arithmetic will calculate !!5÷6 = 0!! and fail
to recognize the solution.

Several people on Twitter made this mistake and then mistakenly
claimed that there was no solution at all. Usually it was possible to
correct their programs by changing

inputs = [ 2, 5, 6, 6 ]

to

inputs = [ 2.0, 5.0, 6.0, 6.0 ]

or something like that.
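The failure is easy to reproduce. In this small Python sketch the // operator stands in for the truncating integer division that many languages apply to integer operands, and the standard fractions module supplies exact arithmetic; the variable names are only for illustration.

```python
from fractions import Fraction

# Truncating integer division: 5 // 6 is 0, so the solution is missed.
truncated = 6 * (2 + 5 // 6)

# Exact rational arithmetic keeps 5/6 as a fraction.
exact = 6 * (2 + Fraction(5, 6))

print(truncated)  # 12
print(exact)      # 17
```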

Some people also surprised me by claiming that I had lied when I
stated that the puzzle could be solved without any “underhanded
tricks”, and that the use of intermediate fractions was itself an
underhanded trick. Your Honor, I plead not guilty. I originally
described the puzzle this way:

You are given a sequence of four digits, say 1,2,3,4, and your job is
to combine them with ordinary arithmetic operations (+, -, ×, and ÷)
in any order to make a target number, typically 24.

The objectors are implicitly claiming that when you combine 5 and 6
with the “ordinary arithmetic operation” of division, you get
something other than !!\frac56!!. This is an indefensible claim.

I wasn't even trying to be tricky! It never occurred to me that
fractions were something that some people would consider underhanded,
and now that it has been suggested, I reject the suggestion. Folks,
the result of division can be a fraction. Fractions are not some sort
of obscure mathematical pettifoggery. They have been with us for at
least 3,500 years now, so it is time everyone got used to them.

Floating-point error

Some programs used floating-point arithmetic to deal with the fractions and
then fell foul of floating-point error. I will defer discussion of
this to a future article.

[ Addendum 20170825: Looking back on our old discussion from July 2016, I see that Lindsey Kuper said to me:

One nice thing about using Racket or Scheme is that it handles the
numeric stuff so nicely. If you weren't careful, I could imagine in
Python a solution failing because it evaluated to
16.99999999999999997 or something.

Good call, Dr. Kuper! ]
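A hedged illustration of the problem, without giving away any solutions: the sketch below runs the same tiny search twice on «3 3 8 8 ⇒ 24», once with Python floats and once with exact fractions, and reports only whether a solution was found. The function reachable is invented for this example and is not taken from anyone's program.

```python
from fractions import Fraction


def reachable(values, target):
    """True if the values can be combined with +, -, ×, ÷ to hit target exactly."""
    if len(values) == 1:
        return values[0] == target
    for i in range(len(values)):
        for j in range(len(values)):
            if i == j:
                continue
            a, b = values[i], values[j]
            rest = [v for k, v in enumerate(values) if k not in (i, j)]
            results = [a + b, a - b, a * b]
            if b != 0:
                results.append(a / b)
            if any(reachable(rest + [r], target) for r in results):
                return True
    return False


puzzle = [3, 3, 8, 8]
with_floats = reachable([float(n) for n in puzzle], 24)
with_fractions = reachable([Fraction(n) for n in puzzle], 24)
print(with_floats, with_fractions)  # False True
```

The float version misses because the one winning expression evaluates to a value slightly below 24, and the comparison with the target is exact.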

Expression construction

A more subtle error that several programs made was to assume that all
expressions can be constructed by combining a previous expression with
a single input number. For example, to solve «2 3 5 7 ⇒ 24», you
multiply 3 by 7 to get 21, then add 5 to get 26, then subtract 2 to
get 24.

But not every puzzle can be solved this way. Consider «2 3 5 7 ⇒ 41».
You start by multiplying 2 by 3 to get 6, but if you try to combine
the 6 with either 5 or 7 at this point you will lose. The only
solution is to put the 6 aside and multiply 5 by 7 to
get 35. Then add the 6 and the 35 to get 41.
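To make the difference concrete, here is a Python sketch that compares the two strategies. It is my own illustration, not a reconstruction of any program I was sent; chain_only, any_shape, and combos are invented names. chain_only always combines the running result with one more input number, while any_shape may combine any two available values.

```python
from fractions import Fraction
from itertools import permutations


def combos(a, b):
    """Every value obtainable from a and b with one operation, in either order."""
    out = {a + b, a * b, a - b, b - a}
    if b != 0:
        out.add(a / b)
    if a != 0:
        out.add(b / a)
    return out


def chain_only(numbers, target):
    """Fold the inputs in one at a time: each step uses one fresh input."""
    for first, *others in permutations(numbers):
        partial = {first}
        for n in others:
            partial = {r for p in partial for r in combos(p, n)}
        if target in partial:
            return True
    return False


def any_shape(values, target):
    """Combine any two available values, so two subresults can be joined."""
    if len(values) == 1:
        return values[0] == target
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            rest = [v for k, v in enumerate(values) if k not in (i, j)]
            if any(any_shape(rest + [r], target)
                   for r in combos(values[i], values[j])):
                return True
    return False


inputs = [Fraction(n) for n in (2, 3, 5, 7)]
print(chain_only(inputs, 24), any_shape(inputs, 24))  # True True
print(chain_only(inputs, 41), any_shape(inputs, 41))  # False True
```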

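Another way to put this is that an unordered binary tree with 4 leaves
can take two different shapes. In the left-hand shape, every operator
has at least one bare number as an operand, as in ((a∘b)∘c)∘d; in the
right-hand shape, the top operator combines two subexpressions, as in
(a∘b)∘(c∘d). (Imagine filling the green circles
with numbers and the pink squares with operators.)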

The right-hand type of structure is sometimes necessary, as with «2 3
5 7 ⇒ 41». But several of the proposed solutions produced only
expressions with structures like that on the left.

You can see the problem in the last line. a, b, c, and d are
numbers, and u, v, and w are operators. The program evaluates
an expression to see if it has the value 17, but the expression always
has the left-hand shape. (The program has another limitation: it
never uses the same operator twice in the expression. That second
permutations should be (sequence . take 3 . repeat) or
something. It can still solve «2 5 6 6 ⇒ 17», however.)

Often the way these programs worked was to generate every possible
permutation of the inputs and then apply the operators to the input
lists stackwise: pop the first two values, combine them, push the
result, and repeat. Here's a relevant excerpt from
a program by Tim Dierks,
this time in Python:

Division by zero

A less common error exhibited by some programs was a failure to
properly deal with division by zero. «2 5 6 6 ⇒ 17» has a solution,
and if a program dies while checking !!2+(5÷(6-6))!! and doesn't find
the solution, that's a bug.
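One simple way to avoid the bug, sketched in Python under the assumption that each candidate is evaluated separately: treat an undefined expression as a non-solution and keep going. The dictionary of candidates here is hand-written just for the demonstration.

```python
from fractions import Fraction

F = Fraction

# Two candidate expressions, each wrapped in a function so that
# evaluation is delayed until we are ready to catch errors.
candidates = {
    "2+(5÷(6-6))": lambda: F(2) + F(5) / (F(6) - F(6)),
    "6×(2+(5÷6))": lambda: F(6) * (F(2) + F(5) / F(6)),
}

solutions = []
for text, thunk in candidates.items():
    try:
        value = thunk()
    except ZeroDivisionError:
        continue  # an undefined expression is simply not a solution
    if value == 17:
        solutions.append(text)

print(solutions)  # ['6×(2+(5÷6))']
```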

Programs that worked

Ingo Blechschmidt (Haskell)

Ingo Blechschmidt showed me a solution in
Haskell. The code is quite short.
M. Blechschmidt's program defines a synthetic expression type and an
evaluator for it. It defines a function arb which transforms an
ordered list of numbers into a list of all possible expressions over
those numbers. Reordering the list is taken care of earlier, by
Data.List.permutations.

Having made up our own synonyms for the arithmetic operators (Sum for
!!+!!, etc.) we now have to explain to Haskell what they mean. (“Not
expressions, but an incredible simulation!”)

I spent a while trying to shorten the code by using a less artificial
expression type:

data Exp a
  = Lit a
  | Op ((a -> a -> a), String) (Exp a) (Exp a)

but I was disappointed; I was only able to cut it down by 18%, from 34
lines to 28. I hope to discuss this in a future article. By the way,
“Blechschmidt” is German for “tinsmith”.

Shreevatsa R. (Python)

Shreevatsa R. showed me a solution in Python.
It generates every possible expression and prints it out with its
value. If you want to filter the voluminous output for a particular
target value, you do that later. Shreevatsa wrote up
an extensive blog article about this
which also includes a discussion about eliminating duplicate
expressions from the output. This is a very interesting topic, and I
have a lot to say about it, so I will discuss it in a future article.

Jeff Fowler (Ruby)

Jeff Fowler of the Recurse Center wrote a compact solution in
Ruby that he described as “hot
garbage”. Did I say something earlier about Perl gobbledygook? It's
nice that Ruby is able to match Perl's level of gobbledygookitude.
This one seems to get everything right, but it fails mysteriously if I
replace the floating-point constants with integer constants. He did
provide a version that was not “egregiously minified” but I don't have
it handy.

Lindsey Kuper (Scheme)

Lindsey Kuper wrote a series of solutions in the Racket dialect of
Scheme, and discussed them on her
blog
along with some other people’s work.

M. Kuper's first draft was 92 lines long (counting whitespace) and
when I saw it I said “Gosh, that is way too much code” and tried
writing my own in Scheme. It was about the same size. (My Perl
solution is also not significantly smaller.)

Martin Janecke (PHP)

I saved the best for last. Martin Janecke showed me an almost flawless
solution in PHP that uses a completely different approach than
anyone else's program. Instead of writing a lot of code for generating
permutations of the input, M. Janecke just hardcoded them:

(I don't think those templates are all necessary, but hey, whatever.)
Finally, another set of nested loops matches each ordering of the
input numbers with each selection of operators, uses sprintf to plug
the numbers and operators into each possible expression template, and
uses @eval to evaluate the resulting expression to see if it has the
right value:

If loving this is wrong, I don't want to be right. It certainly
satisfies Larry Wall's criterion of solving the problem before your
boss fires you. The same approach is possible in most reasonable
languages, and some unreasonable ones, but not in Haskell, which was
specifically constructed to make this approach as difficult as possible.
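For comparison, here is roughly what the template-and-eval idea might look like in Python. This is a sketch of the general approach, not a translation of M. Janecke's PHP: the five templates are my own (one per way of parenthesizing four operands), the inputs are assumed to be single digits, and each digit is wrapped as an exact Fraction before evaluation, which is a substitution I made to sidestep roundoff.

```python
from fractions import Fraction
from itertools import permutations, product

# Hypothetical templates: the five ways to parenthesize four operands.
TEMPLATES = [
    "(({a}{x}{b}){y}{c}){z}{d}",
    "({a}{x}({b}{y}{c})){z}{d}",
    "({a}{x}{b}){y}({c}{z}{d})",
    "{a}{x}(({b}{y}{c}){z}{d})",
    "{a}{x}({b}{y}({c}{z}{d}))",
]


def solve(numbers, target):
    """Return the set of template expansions whose value equals target."""
    found = set()
    for a, b, c, d in permutations(numbers):
        for x, y, z in product("+-*/", repeat=3):
            for template in TEMPLATES:
                text = template.format(a=a, b=b, c=c, d=d, x=x, y=y, z=z)
                # Wrap each digit as F(n) so eval computes with exact rationals.
                exact = "".join(f"F({ch})" if ch.isdigit() else ch for ch in text)
                try:
                    value = eval(exact, {"F": Fraction})
                except ZeroDivisionError:
                    continue  # undefined expressions are skipped
                if value == target:
                    found.add(text)
    return found


answers = solve([2, 5, 6, 6], 17)
print(sorted(answers))
```

As with any use of eval, this is only reasonable because the strings are built entirely by the program itself.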

M. Janecke wrote up a blog article about this, in
German. He says “It's not an elegant
program and PHP is probably not an obvious choice for arithmetic
puzzles, but I think it works.” Indeed it does. Note that the use of
@eval traps the division-by-zero exceptions, but unfortunately falls
foul of floating-point roundoff errors.

Thanks

Thanks to everyone who discussed this with me. In addition to the
people above, thanks to Stephen Tu, Smylers, Michael Malis, Kyle
Littler, Jesse Chen, Darius Bacon, Michael Robert Arntzenius, and
anyone else I forgot. (If I forgot you and you want me to add you to
this list, please drop me a note.)

Coming up

I have enough material for at least three or four more articles about
this that I hope to publish here in the coming weeks.