Category Archives: Algorithms

Post navigation

In the 2016 Year Game, the task was to generate the numbers 0 to 100 using APL primitives and the digits 2 0 1 6 in that order. For example,

20=16
×2016
2⌊016
2+×016
…

This “puzzle of the year” brings to mind Krypto, a game I played many years ago while in grade school.

Krypto

Krypto is a mathematical card game. The Krypto deck has 56 cards: 3 each of numbers 1-6, 4 each of the numbers 7-10, 2 each of 11-17, 1 each of 18-25.

⎕io←0
DECK ← (3/1+⍳6),(4/7+⍳4),(2/11+⍳7),18+⍳8

Six cards are dealt: an objective card and five other cards. A player must use all five of the latter cards’ numbers exactly once, using any combination of arithmetic operations (+, -, ×, and ÷) to form the objective card’s number. The first player to come up with a correct formula is the winner. The stricter “International Rules” specify the use of non-negative integers only; fractional or negative intermediate results are not permitted.

For example, if the objective card were 17 and the other cards were 2, 8, 14, 19, and 21, then a Krypto solution can be as follows. Without loss of generality, we use APL notation and syntax.

8 - 19 + 14 - 2 × 21

In this article we present functions to find all solutions to a Krypto hand.

A Solution

There are a maximum of !5 permutations of the 5 cards and 4 possibilities in each of the 4 places where an operation can be put, for (!5)×4*4 or 30720 total possibilities. This number is small enough for an exhaustive approach.

Intermediate Steps

The function perm is from the Dyalog blog post Permutations, 2015-07-20. perm n generates the sorted table of all permutations of ⍳n.

The local variable a in Krypto is a 30720-row character matrix computed from the 5 non-objective numbers. It consists of all !5 permutations of the 5 numbers interspersed with all 4*4 arrangements of the four operations + - × ÷.

Executing the rows of a produces a 30720-element numeric vector. Comparison of this vector with the objective yields a boolean vector that selects the rows of a which are correct formulas.

We note that use of the @ operator, new to Dyalog version 16.0, obviates the need to name intermediate results for reasons for syntax. Instead, names are only used to break the code down into more comprehensible components.

Finally, the last accessory I ordered for my Raspberry Pi Zero (that’s the little red thing behind my keyboard) has arrived – an Acer 43″ ET430K monitor. The Zero won’t quite drive this monitor at its maximum resolution of 3840×2160 pixels, but as you can see, you get enough real estate to do real graphics with “ASCII Art” (well, Unicode Art, anyway)!

Some readers will recognise the image as the famous Mandelbrot Set, named after the mathematician Benoit Mandelbrot, who studied fractals. According to Wikipedia a fractal is a mathematical set that exhibits a repeating pattern displayed at every scale – they are interesting because they produce patterns that are very similar to things we can see around us – both in living organisms and landscapes – they seem to be part of the fundamental fabric of the universe.

Fractals are are also interesting because they produce images of staggering beauty, as you can see on the pages linked to above and one of our rotating banner pages, where a one-line form of the APL expression which produced the image is embedded:

The colourful images of the Mandelbrot set are produced by looking at a selection of points in the complex plane. For each point c, start with a value of 0 for z, repeat the computation z = c + z², and colour the point according to the number of iterations required before the function becomes “unbounded in value”.

In APL, the function for each iteration can be expressed as {c+⍵*2}, where c is the point and ⍵ (the right argument of the function) is z (initially 0, and subsequently the result of the previous iteration):

If you are not familiar with complex numbers, those results may look a bit odd. While complex addition just requires adding the real and the imaginary parts independently, the result of multiplication of two complex numbers (a + bi) and (c + di) is defined as (ac-db) + (bc+ad)i. Geometrically, complex multiplication is a combined stretching and rotation operation.

Typing all those f’s gets a bit tedious, fortunately APL has a power operator ⍣, a second-order function that can be used to repeatedly apply a function. We can also compute the magnitude of the result number using |:

(|(f⍣6)0)(|(f⍣7)0)
8.853E7 7.837E15

Let’s take a look at what happens if we choose a point inside the Mandelbrot set:

Above, I used an anonymous function so I could pass the number of iterations in as a parameter, and use the each operator ¨ to generate all the results up to ten iterations. In this case, we can see that the magnitude of the result stabilises, which is why the point 0.1 + 0.1i is considered to be inside the Mandelbrot set.

Points which are just outside the set will require varying numbers of applications of f before the magnitude “explodes”, and if you colour points according to how many iterations are needed and pick interesting areas along the edge of the set, great beauty is revealed.

Our next task is to create array of points in the complex plane. A helper function unitstep generates values between zero and 1 with a desired number of steps, so I can vary the resolution and size of the image. Using it, I can create two ranges, multiply one of them by i (0J1 in APL) and use an addition table (∘.+) to generate the array:

The result is subtracted from 0.7J0.5 (0.7 + 0.5i) to get the origin of the complex plane slightly off centre in the middle, and multiplied the whole thing by 3 to get a set of values that brackets the Mandelbrot set.

Note that APL expressions are typically rank and shape invariant, so our function f can be applied to the entire array without changes. Since our goal is only to produce ASCII art, we don’t need to count iterations, we can just compare the magnitude of the result with 9 to decide whether the point has escaped:

We can see that points around the edge start escaping after 3 iterations. Using an anonymous function again, we can observe more and more points escape, until we recognise the (very low resolution) outline of the Mandelbrot set:

The last example allowed me to sneak in a preview of the new “at” operator coming in Dyalog v16.0. It is a functional merge operator that I am using to insert the formatted right argument (the number of iterations) into position (0 0) of each matrix.

If I use our remote IDE (RIDE) on my Windows machine and connect to the Pi, I can have an APL session on the Pi with 3840×2160 resolution. In this example, I experimented with grey scale “colouring” the result by rounding down and capping the result at 10, and using the character ⍟ (darkest) for 0, ○ for 1-5, × for 6-9, and blank for points that “escaped”, by indexing into an array of ten characters:

Retro is back in style – but of course I should point out that we can produce “proper” graphics using SharpPlot, a cross-platform graphics package that is included with Dyalog APL on all platforms. I’ve enhanced the ReactionTimer namespace with a function called SPHistogram. If you are using RIDE as the front end to APL on the Pi, this will render output from SharpPlot in a window:

ReactionTimer.SPHistogram times

The original ASCII chart simply plotted the observations in the order that they occurred. Above, the shaded areas show how many observations there were in each bucket (2 between 250ms and 275ms, 3 between 275ms and 300ms, and so on). At the same time, the individual observations can be seen along the X axis as vertical red lines. It is a little unfortunate that, presumably due to some artifact of the timing process, the values 251 and 294 occur twice – and these lines are drawn on top of each other.

The example highlights one of the things that makes SharpPlot a bit special: We have overlaid a histogram showing the frequency of reactions in each bucket, and used a “scatterplot” to where the Y value is always 0, to mark the individual observations along the X axis.

Hopefully the code is more or less self-explanatory, but if you’d like to learn more about SharpPlot there is excellent documentation at http://sharpplot.com. The documentation is actually written for C# users, but there is an illustration of how to translate the documentation to APL (and VB) at http://www.sharpplot.com/Languages.htm.

Sections 2 and 10 are the most interesting ones and are as follows. The full set is too lengthy (and too boring?) to be included here. Some students may be able to make up their own exercises from the list of topics. As well, it may be useful to know what the “experts” think are important aspects of APL.

If the function written above is multiply recursive, write a version which is singly recursive (only one occurrence of ∇). Use cmpx to compare the performance of the two functions on the argument 35.

2.2 Peano Addition

Let ⍺ and ⍵ be natural numbers (non-negative integers). Write a function padd to compute ⍺+⍵ without using + itself (or - or × or ÷ …). The only functions allowed are comparison to 0 and the functions pre←{⍵-1} and suc←{⍵+1}.

2.3 Peano Multiplication

Let ⍺ and ⍵ be natural numbers (non-negative integers). Write a function ptimes to compute ⍺×⍵ without using × itself (or + or - or ÷ …). The only functions allowed are comparison to 0 and the functions pre←{⍵-1} and suc←{⍵+1} and padd.

Write a function for the extend H algorithm for ⍵, ⍵≥0, which embeds the complete binary tree of depth ⍵ on the plane. In ×H ⍵ there are 2*⍵ leaves, instances of the letter o with exactly one neighbor (or no neighbors for 0=⍵). The root is at the center of the matrix.

The Tower of Hanoi problem is to move a set of ⍵ different-sized disks from one peg to another, moving one disk at a time, using an intermediate peg if necessary. At all times no larger disk may sit on top of a smaller disk.

Write a function Hanoi ⍵ to solve the problem of moving ⍵ disks from peg 0 to peg 2. Since it’s always the disk which is at the top of a peg which is moved, the solution can be stated as a 2-column matrix with column 0 indicating the source peg and column 1 the destination peg.

I was re-reading A Mathematician’s Apology before recommending it to Suki Tekverk, our summer intern, and came across a statement that the number of primes less than 1e9 is 50847478 (§14, page 23). The function pco from the dfns workspace does computations on primes; ¯1 pco n is the number of primes less than n:

)copy dfns pco
¯1 pco 1e9
50847534

The two numbers 50847478 and 50847534 cannot both be correct. A search of 50847478 on the internet reveals that it is incorrect and that 50847534 is correct. 50847478 is erroneously cited in various textbooks and even has a name, Bertelsen’s number, memorably described by MathWorld as “an erroneous name erroneously given the erroneous value of π(1e9) = 50847478.”

Although several internet sources give 50847534 as the number of primes less than 1e9, they don’t provide a proof. Relying on them would be doing the same thing — arguing from authority — that led other authors astray, even if it is the authority of the mighty internet. Besides, I’d already given Suki, an aspiring mathematician, the standard spiel about the importance of proof.

How do you prove the 50847534 number? One way is to prove correct a program that produces it. Proving pco correct seems daunting — it has 103 lines and two large tables. Therefore, I wrote a function from scratch to compute the number of primes less than ⍵, a function written in a way that facilitates proof.

sieve ⍵ produces a boolean vector b with length ⍵ such that b/⍳⍴b are all the primes less than ⍵. It implements the sieve of Eratosthenes: mark as not-prime multiples of each prime less than ⍵*0.5; this latter list of primes obtains by applying the function recursively to ⌈⍵*0.5. Some obvious optimizations are implemented:

Multiples of 2 3 5 are marked by initializing b with ⍵⍴{∧⌿↑(×/⍵)⍴¨~⍵↑¨1}2 3 5 rather than with ⍵⍴1.

There are other functions which are much easier to prove correct, for example, sieve1←{2=+⌿0=(⍳3⌈⍵)∘.|⍳⍵}. However, sieve1 requires O(⍵*2) space and sieve1 1e9 cannot produce a result with current technology. (Hmm, a provably correct program that cannot produce the desired result …)

We can test that sieve is consistent with pco and that pco is self-consistent. pco is a model of the p: primitive function in J. Its cases are: pco n the n-th prime¯1 pco n the number of primes less than n 1 pco n 1 iff n is prime 2 pco n the prime factors and exponents of n 3 pco n the prime factorization of n¯4 pco n the next prime < n 4 pco n the next prime > n10 pco r,s a boolean vector b such that r+b/⍳⍴b are the primes in the half-open interval [r,s).

Background: Shuffle QA is an intensification of the normal QA. The suite of over 1800 QA functions is run under shuffle, whereby every getspace (memory allocation) is followed by every APL array in the workspace being shifted (“shuffled”) and by a simple workspace integrity check. The purpose is to uncover memory reads/writes which are rendered unsafe by events which in normal operations are rare.

m9249

m9249 tests +\[a], ×\[a], ⌈\[a], and ⌊\[a]. It ran in under 3 seconds in normal operations; for it to take more than 4½ days under shuffle, getspace is clearly the dominant resource and hereafter we focus on that.

m9249 used expressions like:

assert (+⍀x) ≡ {⍺+⍵}⍀x

Since {⍺+⍵} is not recognized by the scan operator ⍀ and therefore not supported by special code, {⍺+⍵}⍀x is done by the general O(n*2) algorithm for scans, AND on each scalar the operand {⍺+⍵} is invoked and a getspace is done. The improved, faster QA does this:

F ← {1≥≢⍵:⍵ ⋄ x ⍪ (¯1↑⍵) ⍺⍺⍨ ¯1↑x←⍺⍺ ∇∇ ¯1↓⍵}
assert (+⍀x) ≡ +F x

With such a change, the faster m9249 runs in 22 minutes under shuffle instead of over 4½ days, without any reduction in coverage.

The number of getspace are as follows: +⍀x takes O(1). The second expression {⍺+⍵}⍀x does ×\1↓⍴x separate scans, and each one takes O((≢x)*2) getspace; the total is therefore O(((≢x)*2)××/1↓⍴x) ←→ O((≢x)××/⍴x). The third expression +F is linear in ≢x and is therefore O(≢x) in the number of getspace. In m9249 the largest x has size 33 5 7. The following table shows the orders of complexity and what they imply for this largest x:

+⍀x

O(1)

1

{⍺+⍵}⍀x

O((≢x)××/⍴x)

38115 = 33 × ×/ 33 5 7

+F x

O(≢x)

33

The ratio between 38115 and 33 goes a long way towards explaining why the time to run m9249 under shuffle was over 4½ days and is now 22 minutes. (Why does m9249 still take 22 minutes? It tests for the four functions + × ⌈ ⌊, for various axes, for different datatypes, and for different axis lengths.)

rankop

Another shuffle sluggard rankop took 47 hours. rankop tests expressions of the form f⍤r⊢yand x f⍤r⊢y for various functions f. The techniques used for reducing the number of getspace differ from function to function. We look at x↑⍤r⊢y as an illustration.

assert (4↑⍤1⊢y) ≡ 4{⍺↑⍵}⍤1⊢y

⍴y was 2 7 1 8 2 8. The expression took 7.6e¯5 seconds normally and 7.5 seconds under shuffle. The left hand side requires O(1) getspace, the right hand side O(×/¯1↓⍴x). The expression was changed to:

assert (4↑⍤1⊢y) ≡ 2 7 1 8 2 4↑y

Feels like cheating, doesn’t it? The new expression takes 0.115 seconds under shuffle.

m12466

m12466 tests n⌈/[a]x (and n⌊/[a]x). Previously, it did this by comparing n⌈/[a]x against n{⍺⌈⍵}/[a]x for x with shape 7 17 13 11, for each of the four axes, for two different values of n, and for the 1-, 2-, 4-, and 8-byte numeric datatypes. It required 5238426 getspace and 26.3 hours to run under shuffle.

F←{p⍉⍺⍺⌿(⊂(⎕io-⍨⍳|⍺)∘.+⍳(1-|⍺)+(⍴⍵)[⍵⍵])⌷(⍋p←⍒⍵⍵=⍳⍴⍴⍵)⍉⍵}

F is a dyadic operator. The left operand ⍺⍺ is ⌈ or ⌊; the right operand ⍵⍵ is the axis; and n(⌈ F a)x is the same as n⌈/[a]x. n(⌈ F a)x mainly does n⌈⌿x; other axes are handled by transposing the axis of interest to be the first, then doing n⌈⌿x, then transposing the axes back into the required order. n(⌈ F a)x is much more complicated than n{⍺⌈⍵}/[a]x but has the advantage of using only O(1) getspace.

The revamped m12466 function uses 13717 getspace and 2 minutes to run under shuffle.

points

points is a function to test monadic format (⍕) for different values of ⎕pp.

a “round-trip” num≡⍕⍎num is evaluated. The expression requires 1.3 million getspace and 7.2 hours to run under shuffle.

A much more efficient version with respect to getspace is possible. Let x be a vector. ⍕¨x requires O(≢x) getspace; ⍕x requires O(1) getspace and, like ⍕¨x, formats each individual element of x independently.

14 15 16 17 points1 100 requires 11050 getspace and less than 2 minutes to run under shuffle.

What to do?

Andy says that the shuffle QA takes over 2000 hours (!). The shuffle sluggards will be tackled as done here by using more parsimonious APL expressions to compare against the aspect being QA-ed. Going forward, a new QA function will be run under shuffle as it is being developed. The developers will be unlikely to tolerate a running time of 4½ days.