2008-08-31

Nostalgia for times past sometimes leads to frustration and anger. Yes, times are harder for would-be pixel artists, but nothing a little work can't solve. Let's do fractals in OCaml!

Suppose you've already installed the wonderful findlib (vielen Dank, Herr Stolpmann!). Suppose, furthermore, you've succeeded in compiling OcamlSDL. This might look like a heavy prerequisite to fulfill before showing pretty graphics, but it's a job you have to do only once. Then, something like plotting Newton fractals requires less than ninety lines of code, and the result is this:

I find it natural to abstract iteration patterns into higher-order functions. In this case, centering and scaling the pixel plane over the complex plane (the so-called screen-space transformation) is tedious, repetitive code that can be reused. Furthermore, the details of striding, color conversion and such are the same from program to program; the only thing that varies is the actual calculation performed on each point:

The magnification mag specifies how many real units the shortest dimension spans. The outer loop iterates with i over every scanline, whose base address into the pixels array is b. The inner loop iterates with j over successive pixels, and from both i and j the coordinates x, y are computed. With those the function f is called, and the result c (an RGB triple of type Sdlvideo.color) is mapped and assigned to the pixel. At the end of every scanline the function g is called, so that the main program can display progress.
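A sketch of what this iter_surface might look like, reconstructed from the description above; as a simplification of my own, the SDL surface is abstracted behind a set function instead of writing into the pixel array directly:

```ocaml
(* Reconstruction of the iteration pattern described above. [mag] is
   the number of real units spanned by the shortest dimension of the
   [w]-by-[h] pixel plane; [set] stands in for the pixel write, [f]
   computes the color of a point, and [g] reports progress after each
   scanline. *)
let iter_surface ?(mag = 1.0) w h set f g =
  let scale = mag /. float (min w h) in       (* real units per pixel *)
  let cx = 0.5 *. float w
  and cy = 0.5 *. float h in                  (* center of the screen *)
  for i = 0 to h - 1 do
    let y = scale *. (float i -. cy) in
    for j = 0 to w - 1 do
      let x = scale *. (float j -. cx) in
      set j i (f x y)                         (* plot one pixel *)
    done;
    g i                                       (* end-of-scanline hook *)
  done
```

In the real program, set would poke the mapped color into the surface's pixel array at base address b plus j.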

The simplest SDL event loop waits for the user to quit the application:
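In outline, the loop looks like this; the event type and next_event are stand-ins of mine for OcamlSDL's event machinery, so take this as the shape of the loop rather than its actual API:

```ocaml
(* The simplest possible event loop: block on the next event and
   ignore everything until the user asks to quit. [event] and
   [next_event] are stand-ins for OcamlSDL's Sdlevent machinery. *)
type event = Quit | Other

let rec event_loop next_event =
  match next_event () with
  | Quit  -> ()                    (* fall through and exit *)
  | Other -> event_loop next_event
```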

Anything more complicated is easy to accommodate. Now, there are myriad schemes for color rendering of mathematical data, but using the color wheel is very common, especially since the complex argument maps naturally to a hue value. Converting HSV colors to RGB is somewhat involved, but the following code is entirely reusable:

quomod scales the fractional part of x by the integer d, and splits the result into an integer part e and a fractional part f. This classifies the hue into a sextant on which the fractional part linearly interpolates. To map a real number in the interval [0… 1] into [0… 256), byte_of_frac scales the fraction by 2⁸ and clips the result to a byte by constructing the appropriate bit-mask.
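The helpers might look like this (a reconstruction of mine; in particular, explicit clamping stands in for the bit-mask trick the text mentions):

```ocaml
(* [quomod d x] scales the fractional part of [x] by [d] and splits
   the result into an integer part (the sextant) and a fractional
   part (the interpolant). *)
let quomod d x =
  let t = float d *. (x -. floor x) in
  (truncate t, t -. floor t)

(* Map a fraction in [0, 1] to a byte in [0, 256); explicit clamping
   replaces the original's bit-mask clipping. *)
let byte_of_frac f =
  let n = truncate (f *. 256.) in
  if n < 0 then 0 else if n > 255 then 255 else n

(* Standard HSV-to-RGB conversion: classify the hue into a sextant
   and linearly interpolate within it. *)
let rgb_of_hsv h s v =
  let e, f = quomod 6 h in
  let p = v *. (1. -. s)
  and q = v *. (1. -. s *. f)
  and t = v *. (1. -. s *. (1. -. f)) in
  let r, g, b = match e with
  | 0 -> v, t, p | 1 -> q, v, p | 2 -> p, v, t
  | 3 -> p, q, v | 4 -> t, p, v | _ -> v, p, q
  in
  (byte_of_frac r, byte_of_frac g, byte_of_frac b)
```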

The code so far doesn't incorporate any specific functionality, and could be shipped off to a standard module to use again and again. The particular application that I had in mind, Newton fractals, are the plot on the complex plane of the basins of attraction to the complex roots of a polynomial, as found by applying the Newton method.

Polynomials and their derivatives are especially easy to evaluate at a point in one pass using Horner's method, when represented as the list of their coefficients with the independent term at the head, a so-called dense representation of polynomials:
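With complex coefficients listed independent-term-first, Horner's rule gives both the value and the derivative in a single right fold. A sketch, with a name of my own choosing:

```ocaml
(* Evaluate a dense polynomial and its derivative at [x] in one pass:
   fold Horner's rule from the highest coefficient down, carrying the
   pair (value, derivative). *)
let eval_poly' p x =
  List.fold_right
    (fun a (v, d) -> Complex.(add a (mul x v), add v (mul x d)))
    p Complex.(zero, zero)
```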

The root finder returns the number of iterations required, up to a maximum and/or specified precision, together with the closest estimate to the root. This allows me to "color" the fractal according to the time needed to converge to a root. With the chosen representation, the polynomial zⁿ − 1 (which represents, by definition, the n-th roots of unity) is straightforward to construct:
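Putting the pieces together, this might read as follows; the names, defaults for max_iter and eps, and the one-pass Horner evaluator (included so the sketch stands alone) are mine:

```ocaml
(* One-pass Horner evaluation of a dense polynomial and its
   derivative, as described in the text. *)
let eval_poly' p x =
  List.fold_right
    (fun a (v, d) -> Complex.(add a (mul x v), add v (mul x d)))
    p Complex.(zero, zero)

(* Newton's method: iterate z <- z - p(z)/p'(z) up to [max_iter]
   times or until within [eps], returning the iteration count and
   the last estimate. The defaults are my own. *)
let newton ?(max_iter = 32) ?(eps = 1e-12) p z =
  let rec go i z =
    let v, d = eval_poly' p z in
    if i >= max_iter || Complex.norm v < eps || d = Complex.zero
    then (i, z)
    else go (succ i) Complex.(sub z (div v d))
  in go 0 z

(* z^n - 1 in the dense representation: -1 at the head, n - 1 zero
   coefficients, and a leading 1. *)
let roots_of_unity n =
  ({ Complex.re = -1.; im = 0. }
   :: List.init (n - 1) (fun _ -> Complex.zero))
  @ [Complex.one]
```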

The iteration limit max_iter controls the fractal detail. I first initialize the SDL library and create a big enough window, with a descriptive title. Then, I build a polynomial p representing the third roots of unity, and invoke iter_surface with a function that finds p's root starting from the complex value corresponding to the coordinates of the point to be plotted. I convert the complex angle of the returned approximation z to a hue value (by dividing by 2π and normalizing to the range [0… 1]); analogously, the number of iterations determines the vividness, or value of the color. This is converted to an RGB triple and returned. Every eight scanlines the screen is refreshed with the contents of the backing buffer. Once every point is plotted, the event loop is entered until the user decides to quit.

So, not counting the generic, reusable code, less than 40 lines of code suffice to just draw something on the screen. Have fun!

2008-08-17

When you thought that your obsession had run its course and left you in peace, finally, along comes something new and shiny to once more lead you astray. I've written half a dozen posts about sorting networks and genetic algorithms; I thought I had put that topic to rest. What am I to do when I find the Tilings Encyclopedia later the same week? How could I resist this aperiodic fest of self-similar tilings?

The nice thing about the Encyclopedia is not that the tilings are extensively cross-referenced and annotated with the relevant bibliographies; that's all very useful, I don't doubt, for its intended audience; but what I immediately liked was that the tilings are presented as rewriting rules. That should make it easy to replicate them, I thought; I wasn't very wrong, but I find that a functional program is not finished until its structure is strong and regular enough to let it stand on its own legs. After refactoring and restructuring and rewriting the code for a week or so, I had a robust, general program that let me add new tilings, write the tilings in different formats and so on; my last post deals with the details of the back-end mechanism.

Look at the page for the Ammann-Beenker rhomb triangle tiling, for instance (go ahead, I'll wait for you). The tiling uses three primitives: two oriented triangles and a parallelogram. Each is "split" or rewritten into a configuration of the same tiles. The process by which the plane is tiled by rewriting is called inflation; the rate of expansion of a primitive tile into a configuration of tiles at each step is called the inflation coefficient.

In typical OCaml fashion, the tyranny of early bound top-level definitions means that I have to sweat the small details first, so that the intended objective is reached in a grand finale by a simple function that synthesizes all the bottom-up development. That's at odds with writing an engaging literate program, I'm afraid, but it has to be done. The first ingredient, then, in my recipe for pretty pictures is something that lets me put the pieces where they should go. A little bit of linear algebra, in short.

I'll use the most direct representation possible of affine transformations, an affine matrix T such that a vector v gets mapped to v⋅T. Only, I'll be sloppy and identify points on the plane (x, y) ∈ R² with homogeneous vectors (x, y, 1) ∈ A². I always use row vectors and multiply them on the left, which means that the last column of the matrix T is fixed to (0, 0, 1)ᵀ, so I don't need to represent it. To simplify things, I flatten the matrix in column-major order:

Note that I use homogeneous scaling. The composition of two transforms is simply the product of the corresponding matrices. I denote the composition with the % operator, but perhaps a better choice would have been an infix symbol that highlighted the non-commutative nature of transformations:
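Filling in the sketch (a reconstruction: apply and identity are my names, and I read a % b as "a, then b"):

```ocaml
(* An affine map as the six free entries of a 3x3 matrix whose last
   column is (0, 0, 1)^T, flattened in column-major order:
   [| t11; t21; t31; t12; t22; t32 |]. A point (x, y) is treated as
   the row vector (x, y, 1) and multiplied on the left. *)
type xform = float array

let identity = [| 1.; 0.; 0.; 0.; 1.; 0. |]

let apply (t : xform) (x, y) =
  (x *. t.(0) +. y *. t.(1) +. t.(2),
   x *. t.(3) +. y *. t.(4) +. t.(5))

(* Composition: (a % b) maps v to (v . A) . B, i.e. the matrix
   product A.B with the implicit last column folded back in. *)
let ( % ) (a : xform) (b : xform) = [|
  a.(0) *. b.(0) +. a.(3) *. b.(1);
  a.(1) *. b.(0) +. a.(4) *. b.(1);
  a.(2) *. b.(0) +. a.(5) *. b.(1) +. b.(2);
  a.(0) *. b.(3) +. a.(3) *. b.(4);
  a.(1) *. b.(3) +. a.(4) *. b.(4);
  a.(2) *. b.(3) +. a.(5) *. b.(4) +. b.(5);
|]
```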

The type t abstracts away the tiles used. name is the full name of the substitution, for purposes of meaningfully labeling the output. shape t returns the polygon corresponding to the tile t as a list of points, and color t returns an RGB triple for that tile. seed is a list of placed tiles with which the inflation process will start. Finally, the meat of the tiling is encoded in rules, that for each tile t returns the expansion as a list of placed tiles. By convention, I'll scale the resulting expansion so that the transformations are local to each tile; in other words, I'll avoid the need to keep track of the overall scale. With that, the core algorithm is captured in the following function:
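Reconstructed from this description, the signature and the iteration might read as follows. The trivial Square instance is mine, added only so the sketch stands alone; in the real program, expanding a placed tile would also compose each child's transform with its parent's:

```ocaml
type xform = float array  (* a placement: a flattened affine matrix *)

module type TILING = sig
  type t                                  (* the tiles themselves *)
  val name  : string                      (* full substitution name *)
  val shape : t -> (float * float) list   (* tile outline, as points *)
  val color : t -> int * int * int        (* RGB triple for the tile *)
  val seed  : (t * xform) list            (* starting placed tiles *)
  val rules : t -> (t * xform) list       (* expansion of each tile *)
end

(* n-fold expansion of a list [l] under the rules [f]: functional
   iteration of a monadic function under the list monad. *)
let rec iterate f l n =
  if n <= 0 then l
  else iterate f (List.concat_map f l) (n - 1)

(* A degenerate instance, only to exercise the machinery: one unit
   square that rewrites to itself. *)
module Square : TILING = struct
  type t = Sq
  let name  = "trivial square"
  let shape Sq = [0., 0.; 1., 0.; 1., 1.; 0., 1.]
  let color Sq = (255, 255, 255)
  let seed  = [Sq, [| 1.; 0.; 0.; 0.; 1.; 0. |]]
  let rules Sq = seed
end
```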

This rather generic function repeatedly expands a tiling l by applying the rules f n times. Interpreted under the list monad, iterate is nothing more than functional iteration of the monadic function f; but clearly bringing in the entire machinery of the non-deterministic monad just for this is overkill.

With this, I have all the necessary pieces to actually implement the drawing algorithm. In keeping with the theme of the previous post, I'll output EPS code to a temporary file which I'll then open with Preview. The Generator is parameterized by a TILING:

module Generator (S : TILING) = struct

But, there's a snag to solve: as I've set up things, the final size of the tiling will be that of the seed, and I use, in general, shapes with sides of length 1. This will make the final tiling the size of a pixel! Precisely this is the drive behind using "abstract" back-ends: I can draw the tiling once to get its "real" size, rescale it, and then draw it for real:

Remember that a GRAP back-end contains the monad M that implements the back-end. Drawing a tiling involves taking each placed tile (t, x), transforming the polygon S.shape t by the placement transform x, and drawing it. Again, this is completely generic on the "drawing" method. I take advantage of this genericity first to find the bounding box of the tiling:

This function has a rather arbitrary design decision: whether to scale to the bounding box of the seed, or to that of the final tiling. If the expansion were confined to the boundary of the parent tile, it would make no difference (except for the work involved!); not all tilings, however, are so tidy. One of the most famous aperiodic tilings, Penrose's, oversteps the boundaries of the parent tile.

There's not much to say about this code, except that it encodes precisely the data presented in its Encyclopedia page. Let me just add that I found encoding the transformations quite error-prone; but then again I'm not a visual thinker. The result is:

This tiling could make an excellent background; note that the weaving of triangles and parallelograms creates a somewhat disturbing optical illusion. Another beautiful tiling is the Robinson triangle tiling:

As you can see, the tiles overflow their parents. I used a "sun" seed to generate a pentagonally-symmetrical result:

let () =
let module G = Generator(PenroseRhomb) in
G.generate 256. 4

I've uploaded the complete program, with the code in the last post and in this one, to the OCamlCore Forge. Noted in comments at the end of the file are a number of really nice tilings I may some day implement. Or, find the ones you like and add a snippet to the forge. Have fun!

2008-08-15

As Haskell gains popularity, a dozen or more articles a month about it make the front page of the Programming reddit. Most deal with the necessity of using and understanding monads to control side-effects in Haskell. This is only natural and to be expected since, in Haskell, they're a language design decision, and as such as inescapable as laziness and as structural as the predefined operator precedence.

My background is in OCaml (gratuitous biographical aside: I found a notebook with OCaml code I wrote in '96), and so I view the IO and attendant monads with some disdain. Let me explain myself: my view is that using them to control side-effects is a rather pedestrian application of monads.

There's another, much higher-level view of them, and that is monads as categorical constructions. Category theory, by its very nature as a mathematical framework, is a language in which to express relations and operations between things. It's the mathematical equivalent of the GoF's patterns. So, reading an article like this one left me with the slightly uneasy feeling of having simultaneously not understood much of it (because I don't know Haskell, and because the article has no context to speak of) and knowing that an isomorphism between the IO monad and anything else cannot allow you to "weave in and out of using IO": freeness demands that the "other" side of the isomorphism contain, encode or fake enough of IO so as to be indistinguishable from it (the type of IO' as presented in the article lent me evidence of this). This is where category theory comes in handy: understanding by pattern-matching.

The practical upshot of this is that it enables a third view of monads, Moggi's, as capsules of meaning that are, if not composable, certainly pluggable. First in Notions of computation and monads, then in Monads and Effects (whence IO and friends), Moggi shows that monads are the right tool for the (hard) job of making your code independent of, or at least parameterizable by, semantics. Monads are plug-ins.

In what way are monads plug-ins? Well, a plug-in must have two things: specific behavior that you can invoke to augment your program in unforeseen ways even after you've written it, and a standard interface by which the program can call it. Monads have a rather minimalist interface:

First comes a type, α m: this is the "protocol" by which your program and the plug-in communicate without one having to "know too much" (or more than the standardized interface allows) about the other. It lets the monad say, in effect: "for me to be arbitrary but compatible, you'll have to accept that I do things my way". The monad gets to have its "private space" in which to do its job, without fear that the program will mess up its stuff.

Then comes the method return, that the program uses to pass values down to the monad. It lets the code say, "here, have this value for your own use". Finally, there's the operation bind that lets the program and the monad maintain a conversation. With it, the program can say to the monad "oh, when you are done with anything, here's another thing I need you to do", to which the monad responds "OK, I'll take that and give you back the result". This "wiring" aspect to the bind is emphasized in it being usually called >>=, an infix operator.
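In OCaml, the minimalist interface just described reads, modulo naming, as follows; the identity instance is mine, included only to have something concrete to run:

```ocaml
module type MONAD = sig
  type 'a m                                  (* the "protocol" *)
  val return : 'a -> 'a m                    (* hand a value over *)
  val bind   : 'a m -> ('a -> 'b m) -> 'b m  (* keep the conversation going *)
end

(* The least interesting inhabitant: the identity monad, which adds
   no semantics at all. *)
module Id : MONAD with type 'a m = 'a = struct
  type 'a m = 'a
  let return a = a
  let bind m f = f m
end
```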

This is all nice, operational and anthropomorphizing, but hardly illuminating. The fact that the monad functions as a protocol is formalized in the monad laws which every concrete, "plug-in" monad must abide to:

bind (return a) k ≡ k a

bind m return ≡ m

bind (bind m k) h ≡ bind m (fun x -> bind (k x) h)

The first law states that return on the left of bind does nothing. The second law states that return on the right of bind does nothing. The third law means that a bind of binds associates freely. But that's exactly what a monoid is: a "thing" with an associative operation that has a neutral, "do-nothing" element on the left and right of it.

Consider by way of example the concatenation of strings, denoted by ^. There is a special string, "" or the empty string, that "doesn't change" a string when catenated to it: for all strings s, "" ^ s is the same as s, and so is s ^ "". Also, it doesn't matter which string you concatenate before and which after; it's the order in which you concatenate that matters. In symbols, s ^ (t ^ u) is the same as (s ^ t) ^ u.

You might think, what have monoids to do with monads? The answer is that monads "behave" algebraically as monoids. You might think next, aren't monads monoids, then? Well, no. The crucial difference between a monoid and a monad is that, packed inside the return and the bind operations of a monad, there is meaning tucked in. It could send output to a console. It could update a memory location. It could try simultaneously different possible computations and return any, or all of them. It could record transactional data. The key is that the monad laws are a contract that allows you, no matter how complicated the "stuff" the monad does underneath, to operate with it in a structured, predictable way. Arbitrary behavior under a fixed interface. A plug-in, in short.

If you followed me so far you might have a nagging doubt by now. How do you "escape" the monad, once you use it? That is, return serves to put plain old values in a "wrapper" the monad can use but, how can you get back what the monad did with it? Where's the function with type α m → α? The bad news is that there isn't any… in general. That is, there is no guaranteed way to get back a value from an arbitrary monad. However, many useful monads do provide an escape to "retrieve" the useful work the monad did.

At this point I should show a concrete monad. A classical example (one that I'll eventually need to use) is the state monad. It allows one to "plug in" statefulness to an otherwise "pure" or "stateless" function:

This monad is multi-parametric, in the sense that not only the values your function computes are arbitrary (the α in the monad's type), but the states you might want to track are too. Suppose you want to tally the arithmetic operations a function uses to calculate a value. The state can be as simple as a single integer, or as complex as a tuple recording separate counts for additions and subtractions, multiplications, divisions, square roots and so on. Whichever the case, the key to this monad is that an impure function can be split into a pure part and a state transformer part that takes an input state and returns an output state together with the value of the function. With that, it's not difficult to see that if return is to be a neutral element, it must pass its input state unchanged. In other words, the effect of a value is nil. In yet other words, a value is always pure. Also, bind must keep track of intermediate states between effects, and thread them appropriately.

Immutable state is no state at all but just another value, so this monad comes with a pair of operations get and put that allow your program to retrieve the state at some point of your function, and to modify it. Also, after your computation terminates, you should be able to retrieve not only its final value with eval but also the final state with run. This variant assumes you'll want one or the other; it is possible to get both at once without having to re-compute the function.
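A sketch of such a state monad, with two type parameters so both the value and the state types stay arbitrary (the concrete layout is my reconstruction of what the text describes):

```ocaml
module State = struct
  (* A stateful computation: a state transformer that takes an input
     state and returns a value together with the output state. *)
  type ('a, 's) m = 's -> 'a * 's

  let return a s = (a, s)            (* a value is pure: state unchanged *)
  let bind m f s =                   (* thread the state through *)
    let a, s' = m s in f a s'

  let get s = (s, s)                 (* the current state, as a value *)
  let put s' _ = ((), s')            (* replace the state *)

  let eval m s = fst (m s)           (* final value *)
  let run  m s = snd (m s)           (* final state *)
end
```

For instance, a computation that returns the current counter and increments it as an effect is bind get (fun n -> bind (put (n + 1)) (fun () -> return n)).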

Note that get is a monadic value, not a function. This might not make sense unless you look into the monad and see how it is constructed. For a less operational view, think of get as representing, by itself, whatever the state is at that point in the program. Since it's a monadic value, the only useful thing to do with it is to bind it to a function:

This is all quite tedious to repeat each time I need a monad. Fortunately, most of the operations you can build out of monadic bind and monadic return are structural, and so generic on the particular monad. Enter the Monad functor:

seq allows you to invoke the monad on the left not for its value but for its effect. join lets you "squash down" a monadic monad to the underlying monad; the fact that it is generic and independent of the underlying semantics means that the monad itself, viewed as a categorical functor, is idempotent; that is, it's useless to monadize a monad. fmap "maps" a function inside a monad, which effectively makes a functor out of a monad. liftm1, liftm2, … form an indexed family of mappers for functions of one, two… arguments; I've stopped at three, but you could go on as you need.

A list of monads can be composed together to turn them "inside out" into a monad of lists. That is what mapm and sequence do, commuting the monad and the list, with or without transforming the list's elements first. Also, a list of monads can be evaluated in sequence purely by their side effects, that is where the mapm_ and sequence_ come in handy (the names of these functions are exactly the same as that of their Haskell counterparts).
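The functor might be sketched like so, each derived operation generic in the underlying monad (liftm3 elided for brevity; names follow the text):

```ocaml
module Monad (M : sig
  type 'a m
  val return : 'a -> 'a m
  val bind   : 'a m -> ('a -> 'b m) -> 'b m
end) = struct
  include M
  let seq m n  = bind m (fun _ -> n)        (* left side for effect only *)
  let join mm  = bind mm (fun m -> m)       (* squash a monadic monad *)
  let fmap f m = bind m (fun x -> return (f x))  (* a monad is a functor *)
  let liftm    = fmap
  let liftm2 f m n = bind m (fun x -> bind n (fun y -> return (f x y)))
  (* commute a list and the monad, transforming the elements... *)
  let rec mapm f = function
    | []      -> return []
    | x :: xs -> liftm2 (fun y ys -> y :: ys) (f x) (mapm f xs)
  let sequence ms = mapm (fun m -> m) ms    (* ...or not *)
  (* evaluate in sequence, purely for the effects *)
  let rec mapm_ f = function
    | []      -> return ()
    | x :: xs -> seq (f x) (mapm_ f xs)
  let sequence_ ms = mapm_ (fun m -> m) ms
end
```

Instantiated with, say, the option monad, mapm (fun x -> Some (2 * x)) [1; 2; 3] commutes the list and the monad into Some [2; 4; 6].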

Lastly, the two very common operations return and seq have very convenient operators associated with them. With all this, I can write the following:

This monad interleaves formatted writing operations, fmt, as a side effect of whatever the code using it does. What is special about this monad is that the "escape" function, write, ships out all the accumulated shows to a named output file, safely closing it even in the event of an exception.
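One way such an Out monad could be realized is as a writer over a list of formatted lines; this is a guess at the structure from the description alone, with Fun.protect doing the safe-close duty:

```ocaml
(* A writer-ish monad accumulating formatted output; [write] is the
   escape that ships everything to a file, closing it even on error.
   A speculative reconstruction of the [Out] monad described. *)
module Out = struct
  type 'a m = string list -> 'a * string list  (* lines, most recent first *)

  let return a acc = (a, acc)
  let bind m f acc = let a, acc' = m acc in f a acc'

  (* [fmt] formats its arguments and appends the line as an effect. *)
  let fmt f = Printf.ksprintf (fun s acc -> ((), s :: acc)) f

  (* The escape: run the computation, dump the lines to [file], and
     close the channel even if an exception escapes. *)
  let write file m =
    let oc = open_out file in
    Fun.protect ~finally:(fun () -> close_out oc) (fun () ->
      let _, lines = m [] in
      List.iter
        (fun s -> output_string oc s; output_char oc '\n')
        (List.rev lines))
end
```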

With this, I'm ready to show you how to actually consume plug-ins from your program. Suppose you want a simple graphics DSL. I have a program with exactly such a need, and as a MacOS X user, the easiest way I found to satisfy it was to generate EPS and to use /usr/bin/open to handle the data for me. But I didn't want to encapsulate (!) my drawing code inside a PostScript interface; I needed it to be as "abstract" as possible (my original back-end was a Mathematica one). I settled on a simple hybrid:

The module is not parameterized by a monad, but it does include a monad that will provide the "glue" to whatever back-end I want to write. As you can see, it is rather primitive but includes operations for drawing in color. It is worth mentioning that this is not a so-called retained interface, but an immediate one, and so save runs the drawing operations on a "sub-context" that is restored when all the commands are shipped out. Also, the type that carries the drawing operations is an alias for the unit monadic value, as drawing doesn't really have a value. My first implementation was the EPS back-end:

In this case, all drawing is to be output to a file, so the monad that implements the plug-in driver is Out. The code uses fmt extensively to format the PostScript commands. For instance, an EPS file must specify a bounding box, against which all drawing will be clipped:

The monad is used purely for its side-effects, namely writing to a file, so I don't bind, I always seq. After the drawing is complete, I actually want to see it on screen; for that, I write a temporary EPS file that I pass to /usr/bin/open, which in turn opens Preview, which converts the file to PDF and shows it:

Polygons are built out of a sequence of points. The first one is the starting point and must be passed to moveto, the rest are to be arguments of linetos. A mapping would do, except for the fact that the code is in monadic style, and must use mapm_:
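In spirit, the path generation looks like this; I use plain strings here rather than the post's monadic writer, to keep the sketch self-contained:

```ocaml
(* A polygon path: the first point is a moveto, the rest are linetos.
   In the real code each line would be shipped through the monad with
   mapm_ instead of being collected with List.map. *)
let polygon_path = function
  | []             -> []
  | (x, y) :: rest ->
    Printf.sprintf "%g %g moveto" x y
    :: List.map (fun (x, y) -> Printf.sprintf "%g %g lineto" x y) rest
```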

Success! A decagon with fat vertices. The only problem with an EPS back-end is that it is difficult to get the optimal width and height for an image. In other words, it would be ideal to have some auto-scaling. For that, I have to know the size of a drawing before turning it into PostScript code. Enter the Bounds back-end:

I keep track of the bounding box of the commands seen by the interface, as a rectangle given by two pairs of coordinates. The initial, empty bounds are such that I can accumulate a point simply by a series of minimum and maximum operations:
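The accumulation itself is just min/max against an "impossible" starting rectangle (names assumed; the record layout is mine):

```ocaml
(* A bounding box as two corners; the empty bounds are infinite in
   the wrong direction, so any real point replaces them. *)
type bounds = { x0 : float; y0 : float; x1 : float; y1 : float }

let empty_bounds =
  { x0 = infinity; y0 = infinity; x1 = neg_infinity; y1 = neg_infinity }

let accum b (x, y) =
  { x0 = min b.x0 x; y0 = min b.y0 y;
    x1 = max b.x1 x; y1 = max b.y1 y }
```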

This allows me to scale up the drawing to fit the boundaries of the EPS file, or to adjust the size of the EPS file so that no space is wasted. The important thing is that the code that contains the drawing commands is left unchanged. Monads are plug-ins.

2008-08-05

I succeeded in finding a minimal 8-element sorting network with 19 comparators grouped in 6 stages. For that, I needed a change in my fitness evaluation function to reflect an insight I had while preparing the last post. My original fitness evaluator was monotone first on size and then on depth, so that fitness(s, d) < fitness(s′, d′) ⇔ s < s′ ∨ s = s′ ∧ d < d′. It is better, I found experimentally, to reward shallower rather than shorter networks, so that fitness(s, d) < fitness(s′, d′) ⇔ d < d′ ∨ d = d′ ∧ s < s′. For that, note that 0 < d ≤ s, so that 0 < d/s ≤ 1, or 0 ≤ 1 - d/s < 1, and so d + 1 - d/s is monotone as required. To convert a decreasing "penalty" into an increasing fitness, I use the mapping I showed the last time, x → x/(x - 1):

(the entire program, minus comments, is available as a paste in OCaml Forge). This means that, sometimes, in order to decrease depth a larger network would be chosen as the winner of the tournament, but that is, apparently, the key to a successful search. With that, here are some minimal networks of order up to 8:

There are a number of things that can be changed, beyond the command-line parameters. Mind you, these are important, especially the pool size: too large a pool, and evaluation will exhaust your patience; too small, and the population gets stuck on a local maximum. Beyond that, it's code intervention time.

The first weak spot is the fitness evaluation function. My original intent was to give absolute priority to network size, and then to network depth; in other words, fitness(s, d) < fitness(s′, d′) ⇔ s < s′ ∨ s = s′ ∧ d < d′. This is restrictive, because a reduction in depth may trigger a further reduction in size. A more graded fitness measure rewards reductions both in size and depth, even if that means that your GA spends a little more optimizing the latter.

The idea is to compute a penalty score comprised of a whole point per unit of size, and a fraction of a point per level, so that a worst-case O(n²) network is not as bad as a great (n + 1)-level sorter, preserving monotonicity. Since in this program a fitter individual has a greater score, I use inversion to change this penalty into a fitness value. As you can see, this value is the nearer to one the larger the network is:

But not enough to reach the target fitness. Some other idea is needed, and it is to actually respect the definition of Microbial Tournament laid out by Inman Harvey. His method matches up individuals a pair at a time, not the entire population at once. This requires more iterations, but each one involves a single mating, so the efficiency of doing pair-at-a-time tournaments is mostly the same. First, I need a way to draw a pair of individuals at random, uniformly from the population. The twist is that I must not pit an individual against itself:

To draw a random disjoint pair 0 ≤ i ≠ j < n with uniform probabilities, it's best to treat the problem as an urn drawing without replacement. In contrast to the generation of ordered pairs, this makes it easy to select both members of such a non-diagonal pair with exactly uniform probabilities: select 0 ≤ i < n with uniform probability 1/n. This leaves n - 1 possibilities for j: draw it from the set 0 ≤ j < n - 1 with uniform probability 1/(n - 1). If j ≥ i, bump j up by one to simulate the "hole" left behind by the drawing of i. With that, the tournament proceeds not in bulk, but pair by pair:
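In code, the urn drawing reads as follows (a sketch; the name is mine):

```ocaml
(* Draw 0 <= i <> j < n uniformly: pick i among n slots, then j among
   the n - 1 remaining ones, bumping j past the hole left by i. *)
let random_disjoint_pair n =
  let i = Random.int n in
  let j = Random.int (n - 1) in
  (i, if j >= i then j + 1 else j)
```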

Success! It takes a while to shave off the last level, but it does. Note how the percentage of active individuals (that is, those that actually sort) climbs as the fitness of the best network approaches the target. The downside of this is that the fact that the GA found a network at all is somewhat of a fluke; smaller or larger pools still make finding a minimal network a hit-and-miss proposition. The last level is the hardest, which is an indication that there is a local minimum that is hard to break out of. For instance, for a smaller pool I had to also change the seed in order to find a solution:

The time needed to hone in on a 16-step solution was shorter, but reducing the depth took another order of magnitude, as before. A yet smaller pool shows an interesting fact: if the entire population is made of proper sorters, it will be difficult to find in the gene pool a variation that would lead to a smaller sorter. Note how the fraction of active networks remains high throughout:

This indicates that the smaller the pool, the larger the mutation probability should be, but this entails the danger of the algorithm devolving into a pure random search. Notwithstanding that, the genetic probabilities crossover_rate and mutation_rate should be run-time-settable parameters. In fact, even the selection of a random seed is harder than it should be: most of the time I want to explore different pool sizes, but I'd rather not think up random seeds to give the program.

Now, with all those changes, selecting values for the parameters is a random search in itself; however, the value act is a good indicator of population health, ironically enough:

(this is with a modified version that makes all parameters settable through command line options). It is evident that, with the high churn rate implied by a mutation rate of 3% even with a high rate of copying of 80%, the search proceeds rather quickly by minimizing depth before length. In a sense, it is necessary to intervene heavily in the design of a GA experiment if it is to have even a modest chance of success.

As a belated end to my exploration of sorting networks (1, 2, 3, 4, 5), I experimented with genetic algorithms to find minimal sorting networks for n elements. And when I write experimented, I mean a couple of weeks of tweaking parameters and code, exploring ways for my program to run faster and to find better and larger networks. Before you continue reading, let me warn you that I wasn't entirely successful: this program works fine for small networks (n ≤ 8), but I couldn't extend this approach to larger, more interesting ones. As it is, though, I had plenty of fun.

The problem of finding minimal sorting networks of a given width is both old and hard; Knuth's examples show uncertainty about minimality for sorters as small as 16 elements. Hugues Juillé shows that Genetic Algorithms can find minimal networks, but he regrettably keeps the how of it to himself. That said, the table he presents is a valuable starting point, as it constitutes an objective target for my program to aim towards:

The network size is the über-parameter of the simulation that determines all other tweaks and knobs; since I want to run the program to look for networks of various widths without having to recompile, I make it a reference:

let network_width = ref 7

Part of the difficulty of effectively applying GAs to solving hard combinatorial problems is deciding on a good genomic encoding of the problem's parameters. Such an encoding must be readily amenable to the two basic operations in a GA search, namely crossover and mutation. I settled on representing directly a sorting network as an array of pairs (i, j) with 0 ≤ i ≤ j < network_width, each representing an exchange box between wires i and j. If i = j, the "exchange" box is the null operation; this will allow for encoding sorting networks of actual varying size on a fixed-length genome.

type genome = (int * int) array

I use an array not only for easy picking of the individual "genes" (the exchange boxes), but so that each genome is actually mutable; this simplifies the bookkeeping needed to complete a tournament (of which more, later). Even so, many operations on a genome as a whole are mostly traversals, with some operation applied to each gene; I found it convenient to abstract the traversal as a generic fold:

I had presented this algorithm before in "Picturing Networks"; the gist of it is that bounds keeps tabs on the "stage" or depth a given sorter must be scheduled at. For instance, the basic measure of a genome is the number of productive genes it has; in other words, how many exchange boxes are actually attached to a different pair of wires. The other basic metric of a network is its depth. This fold handily lets me find both quantities at once:
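A sketch of the fold and the measure, reconstructed from the description; bounds here is a per-wire array of the stage at which each wire next becomes free, and the names are my own:

```ocaml
type genome = (int * int) array

(* The generic traversal: just a fold over the genes. *)
let fold_genome f init (g : genome) = Array.fold_left f init g

(* Count productive genes (exchange boxes on distinct wires) and
   compute the network depth in one pass: a box on wires (i, j) is
   scheduled one stage after the later of its two wires. *)
let size_and_depth width g =
  let bounds = Array.make width 0 in
  fold_genome
    (fun (size, depth) (i, j) ->
      if i = j then (size, depth)            (* null box: no effect *)
      else begin
        let stage = 1 + max bounds.(i) bounds.(j) in
        bounds.(i) <- stage;
        bounds.(j) <- stage;
        (size + 1, max depth stage)
      end)
    (0, 0) g
```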

Genomes must randomly sample the space of possibilities; a truly (that is, uniformly) random genome must draw from a space of random genes. To construct a random pair 0 ≤ i < j < n with uniform probabilities, the most expedient way is to use rejection:

There is an exact method using the inverse of the cumulative distribution function, but it involves taking square roots, and this method is actually faster (this was an investigation of its own). With that, a random gene is an appropriately parameterized random exchange box:
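The rejection version can be as short as this (my sketch: draw two indices, retry only on the diagonal, and order the rest — every off-diagonal pair is equally likely, so ordering preserves uniformity over the strictly ordered pairs):

```ocaml
(* A uniformly random pair 0 <= i < j < n by rejection: reject ties,
   sort the two survivors. *)
let rec random_ordered_pair n =
  let i = Random.int n and j = Random.int n in
  if i = j then random_ordered_pair n
  else (min i j, max i j)
```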

let new_gene () = random_ordered_pair !network_width

Once a random exchange box is in place in a network, the mutation operator must controllably change it into another box. One alternative is to replace it with a completely different one; a more parsimonious approach is to detach a wire and reattach it at random. Two things must be ensured, though: first, the change must preserve the basic property that the exchange box be a nondecreasing pair of numbers; second, it must be uniform in the space of possible changes so defined. Each wire can be chosen with equal probabilities, but then which of the two terminals gets reattached depends on the placement of the exchange box relative to the selected wire:
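One possible realization of this operator, under my reading of the description (choose a terminal, reattach it to a uniformly drawn wire, and normalize so the pair stays nondecreasing):

```ocaml
(* Mutate an exchange box (i, j) by detaching one terminal and
   reattaching it to a uniformly drawn wire; [min]/[max] keep the
   result a nondecreasing pair. This is a speculative sketch, not
   the post's exact operator. *)
let mutate_gene width (i, j) =
  let k = Random.int width in
  if Random.bool ()
  then (min i k, max i k)     (* move the second terminal *)
  else (min k j, max k j)     (* move the first terminal *)
```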

My preoccupation with ensuring that all choices are uniform is not just theoretical; any skew in the search has readily observable consequences. I had an error in the choice of random genes which prevented the program from exploring networks wider than 6; once I corrected it, I could test wider networks. With this, there's the problem of selecting a random genome, which is trivial except for the choice of genome length. How many exchange boxes should a genome have? Too few, and the network will never sort; too many, and it will take forever to minimize and, what's worse, the probability that a crossover improves the offspring will plummet (in other words, the search space would be just too large). I found a number both reasonable and completely arbitrary:

let genome_length = !network_width * !network_width
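With that length fixed, a random genome is just that many random genes; a self-contained sketch (the default network_width of 8 here is arbitrary):

```ocaml
let network_width = ref 8

(* Rejection sampling of a uniformly random ordered pair, as above. *)
let rec random_ordered_pair n =
  let i = Random.int n and j = Random.int n in
  if i < j then (i, j) else random_ordered_pair n

let new_gene () = random_ordered_pair !network_width

let genome_length = !network_width * !network_width

(* A random genome: genome_length independent random genes. *)
let new_genome () = Array.init genome_length (fun _ -> new_gene ())
```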

Reasonable because it's easy to sort in quadratic time (witness Bubble Sort); arbitrary since it only depends on the network width. The next completely arbitrary but vital choice is deciding how to compare networks. The fitness of a genome is the sole measure that determines in which direction the search proceeds; even more, the choice of a fitness function is what makes a Genetic Algorithm go boom or bust. In this case, the complication of selecting a good fitness evaluation lies in the fact that I'm trying to minimize two quantities at once: the size and the depth of the sorting network. There's much literature on multiobjective optimization that I should have read; what's really important is that the combination of individual objectives must be "monotone", or Pareto-dominant, if it is to succeed; this is a complicated way of saying that the fitness must be a total order on objective tuples, in this case pairs of (size, depth):
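The text doesn't pin down the exact function, but one monotone combination satisfying the required inequality is to let size dominate depth with a weight larger than any attainable depth (the constant big is my assumption; higher means fitter):

```ocaml
(* One monotone (Pareto-dominant) combination of the two objectives:
   size dominates depth. big must exceed any attainable depth;
   higher fitness is better. A sketch, not the original formula. *)
let big = 1000

let target_fitness (size, depth) = - (size * big + depth)
```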

I aim for known_best_bounds for the given network_width. The nearer the metrics of a genome are to those numbers, the fitter it will be deemed. Note that, for n ≥ 2, target_fitness (n, 1) - target_fitness (n, n) < target_fitness (n-1, n-1) - target_fitness (n, n); hence the fitness function is monotone in size and in depth, in that order, as required. Now, how to determine if a network is fit or not? The least I can expect of a sorting network is that it actually sorts its input. Trying all network_width! (that's a factorial) possible inputs is infeasible; fortunately, Knuth showed that it suffices to test all 2^network_width binary sequences of length network_width. The Zero-One Principle says that a network sorting the latter is a sufficient condition for it sorting the former. What's more, the 0-1 principle implies that a sorting network can be modeled by a binary circuit. Each exchange box is then a binary function with table:

in0  in1  |  out0  out1
 0    0   |   0     0
 0    1   |   0     1
 1    0   |   0     1
 1    1   |   1     1

That is, f(in0, in1) = (in0 ∧ in1, in0 ∨ in1).

Aside: I find it endlessly confusing that the maximum ↑ and minimum ↓ operators are respectively the join ∨ and meet ∧ of the integer lattice, but the arrows point backwards. Whom to blame? Who came up with the logical connectives?

An exchange box, hence, needs to select bits and apply the corresponding functions according to a mask. To sort a binary word, it suffices to build the masks for each exchange box, and with that apply the sorting function f defined above:
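A sketch of that machinery: each exchange box (i, j) reads its two bits, sends the OR to the lower wire and the AND to the higher one, and a whole genome is applied with a fold:

```ocaml
(* Apply one exchange box to a binary word: the OR of the two bits
   sinks to the lower wire i, the AND rises to the higher wire j,
   so ones accumulate at the least significant end. *)
let apply_exchange word (i, j) =
  let bi = (word lsr i) land 1
  and bj = (word lsr j) land 1 in
  let cleared = word land lnot ((1 lsl i) lor (1 lsl j)) in
  cleared lor ((bi lor bj) lsl i) lor ((bi land bj) lsl j)

(* Run a whole network over a binary word. *)
let sort_word genome word = Array.fold_left apply_exchange word genome
```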

Note that I actually sort in reverse order, that is, according to f′(in0, in1) = (in0 ∨ in1, in0 ∧ in1). This way, testing for sortedness is as simple as checking that the result is a block of contiguous ones, from the least significant bit upwards, using the old trick that relies on the fact that, in two's complement, a block of k contiguous ones represents exactly 2^k - 1:

let is_sorted word = word land (word + 1) == 0

But obviously the fitness of a sorting network is determined not just by its ability to sort but also by the number of exchange boxes it employs to do so. To evaluate the first criterion, generate all 2^network_width binary words and test that all are correctly sorted. If they aren't, the computed fitness is, rather arbitrarily, a fraction based on the first sequence the network failed to sort. This is so that the fitness is a total order on the space of genomes:
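The exhaustive test can be sketched as follows (repeating is_sorted and the word-sorting helpers so the fragment stands alone); it reports the first word the network fails on, from which a fractional fitness can then be derived:

```ocaml
let is_sorted word = word land (word + 1) = 0

let apply_exchange word (i, j) =
  let bi = (word lsr i) land 1
  and bj = (word lsr j) land 1 in
  let cleared = word land lnot ((1 lsl i) lor (1 lsl j)) in
  cleared lor ((bi lor bj) lsl i) lor ((bi land bj) lsl j)

let sort_word genome word = Array.fold_left apply_exchange word genome

(* Try all 2^width binary words: None if the network sorts them all,
   Some w for the first word it fails on. *)
let first_unsorted width genome =
  let rec go w =
    if w = 1 lsl width then None
    else if is_sorted (sort_word genome w) then go (succ w)
    else Some w
  in
  go 0
```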

And now for another arbitrary parameter, namely the size of the population, or rather of the gene pool. As with the genome length, it is a function of the network_width being minimized:

let pool_size = ref (100 * !network_width * !network_width)

I've experimented with a pool size cubic in the network_width, but I settled on a quadratic dependence, appropriately scaled. This is too big for small widths, but appropriate for larger ones. In any case, I found it convenient to try different values from run to run, so it is a reference. A population is an array of genomes of this size:
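Building it is a one-liner over the random-genome generator; a self-contained sketch with a small width so it runs quickly:

```ocaml
let network_width = ref 4
let pool_size = ref (100 * !network_width * !network_width)

let rec random_ordered_pair n =
  let i = Random.int n and j = Random.int n in
  if i < j then (i, j) else random_ordered_pair n

let new_genome () =
  Array.init (!network_width * !network_width)
    (fun _ -> random_ordered_pair !network_width)

(* The gene pool: pool_size independent random genomes. *)
let new_population () = Array.init !pool_size (fun _ -> new_genome ())
```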

Will the knobs to tweak ever end? I'm almost there; but these last ones are fundamental to the performance of the program, and can be the focus of endless tweaking. The GA probabilities:

let crossover_rate = 0.6
and mutation_rate = 0.01

determine the rate of application of the GA operators to the genomes in what's called a tournament, the Darwinian showdown that forms the core of a GA. I used the so-called microbial tournament, as it is simple and easy to implement. The losing genome will have some of its genes replaced with the corresponding ones from the winner, and some mutated, according to these numbers:

In a microbial tournament between two individuals i and j in the population p, both are evaluated and deemed winner and loser. The loser is never the best individual in the population, by definition. Genes from the winner are copied to the loser's genome according to crossover_rate; then the loser's genes are mutated according to mutation_rate. The key to this kind of tournament is that the losing individual is replaced by the offspring resulting from this process, which is evaluated and, if found better than the best so far, promoted.
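The steps above can be sketched as follows; fitness and mutate_gene are stand-ins kept deliberately trivial so the fragment runs on its own (a real run would use the fitness evaluation and mutation operator described earlier):

```ocaml
let crossover_rate = 0.6
and mutation_rate = 0.01

(* Stand-in fitness: fewer productive genes is better. *)
let fitness genome =
  - (Array.fold_left (fun n (i, j) -> if i <> j then n + 1 else n) 0 genome)

(* Stand-in mutation: collapse the box onto one of its wires. *)
let mutate_gene (i, j) = if Random.bool () then (i, i) else (j, j)

(* Microbial tournament: the loser stays in place, but receives the
   winner's genes with probability crossover_rate, each gene then
   mutated with probability mutation_rate. *)
let microbial_tournament p i j =
  let w, l = if fitness p.(i) >= fitness p.(j) then i, j else j, i in
  let winner = p.(w) and loser = p.(l) in
  Array.iteri
    (fun k g ->
      let g = if Random.float 1.0 < crossover_rate then winner.(k) else g in
      let g = if Random.float 1.0 < mutation_rate then mutate_gene g else g in
      loser.(k) <- g)
    loser
```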

Now this process is repeated for pairs of individuals drawn at random from the population. It is equivalent, but faster, to partition the pool into halves, perform a random pairing between both, and do a tournament between each pair:
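A sketch of the pairing: a Fisher–Yates shuffle, then each genome in the first half meets its counterpart in the second; the tournament itself is passed in as a function so the fragment stands alone:

```ocaml
(* In-place Fisher-Yates shuffle. *)
let shuffle a =
  for k = Array.length a - 1 downto 1 do
    let r = Random.int (k + 1) in
    let t = a.(k) in
    a.(k) <- a.(r);
    a.(r) <- t
  done

(* One generation: shuffle the pool, then play element k of the first
   half against element k of the second half. *)
let full_microbial_tournament tournament p =
  shuffle p;
  let half = Array.length p / 2 in
  for k = 0 to half - 1 do
    tournament p k (k + half)
  done
```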

Now I apply the full_microbial_tournament iteratively until the best individual in the population reaches the target fitness. Whenever there's a fitness bump the individual promoted is printed out, together with some statistics. In the event of a console break, the loop is terminated cleanly. In any case, the result of the function is the number of iterations spent in the loop:
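The loop can be sketched generically: step runs one full tournament, best_fitness reports the current champion, and Sys.catch_break turns a console break into an exception that exits the loop cleanly (both step and best_fitness are hypothetical parameters standing in for the real machinery):

```ocaml
(* Iterate until the best individual reaches the target fitness or the
   user presses Ctrl-C; return the number of iterations performed. *)
let iterate target step best_fitness =
  Sys.catch_break true;
  let count = ref 0 in
  (try
     while best_fitness () < target do
       step ();
       incr count
     done
   with Sys.Break -> ());
  !count
```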

A driver function selects the target fitness given by the best known bounds for the chosen network_width, prints out some starting information, builds an initial population, and runs the iterative tournament with it:
