From its first iteration, all components of the image processing package were templated by the color type. This is not a conventional way to implement graphics libraries – most libraries abstract away the exact color type the image uses behind an OOP interface, or they simply convert all images to a single in-memory pixel format. However, for most cases, this is wasteful and inefficient – usually, the programmer already knows the exact format that an image will be in, with notable exceptions being applications in which the image data comes from user input (e.g. image editors). Instead, this library declares all image types as templates, with the type indicating the image’s color being a template parameter.

I’m rather pleased with the result of this overhaul, so I’d like to share some highlights in this article.

This method of defining static interfaces is similar to the one used by D’s std.range, e.g. for isInputRange. Instead of defining an interface close to the OOP sense, D static interfaces conventionally are defined by testing if they implement certain features. This is done by checking if operations that the type is expected to implement compile with no errors, or evaluate to a certain type. Usually the IsExpression is used for this, or alternatively the compiles trait.

Similar to std.range.ElementType, we define a template to extract the type a view uses to indicate a pixel’s colors:
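A minimal sketch of how such a static interface and the color-extracting template could be written. The member names w and h, and the Image and RGB types, are assumptions made for illustration; the package's actual definitions may differ.

```d
// A static interface: T is a view if it exposes a width, a height,
// and two-dimensional indexing. Nothing is inherited or registered.
enum isView(T) =
    is(typeof(T.init.w) : size_t) &&  // width
    is(typeof(T.init.h) : size_t) &&  // height
    is(typeof(T.init[0, 0]));         // pixel access

// The color type is whatever indexing the view evaluates to.
alias ViewColor(T) = typeof(T.init[0, 0]);

// Hypothetical color and image types, for illustration only.
struct RGB { ubyte r, g, b; }

struct Image(COLOR)
{
    int w, h;
    COLOR[] pixels;

    ref COLOR opIndex(int x, int y) { return pixels[y * w + x]; }
}

// The checks happen at compile time.
static assert( isView!(Image!RGB));
static assert(!isView!int);
static assert(is(ViewColor!(Image!RGB) == RGB));
```

Any type with the right members passes the test, whether or not its author has ever heard of isView.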

This eponymous template uses std.functional.binaryFun to accept a predicate in the form of either a string expression (which will be mixed in) or a delegate literal (lambda). Because the function has an auto return type and returns a struct declared inside the function body, Procedural is an example of a Voldemort type: a type that can be used, but not named, outside the function.
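The template described above might look roughly like the following. This is a sketch based on the description rather than the package's exact source; the parameter names x and y passed to binaryFun are an assumption.

```d
import std.functional : binaryFun;

/// Returns a view which calculates pixels on demand from a formula.
template procedural(alias formula)
{
    // Accept either a string such as "x + y" or a callable.
    alias fun = binaryFun!(formula, "x", "y");
    alias Color = typeof(fun(0, 0));

    auto procedural(int w, int h)
    {
        // Declared inside the function, so callers cannot spell
        // out the type's name: a Voldemort type.
        static struct Procedural
        {
            int w, h;

            Color opIndex(int x, int y) { return fun(x, y); }
        }
        return Procedural(w, h);
    }
}

unittest
{
    // No pixel storage is allocated; values are computed when indexed.
    auto gradient = procedural!"x + y"(4, 4);
    assert(gradient.w == 4 && gradient.h == 4);
    assert(gradient[1, 2] == 3);
}
```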

Note how the color type of the returned view is inferred from the type of the c parameter, so e.g. solid(RGB(1, 2, 3), 10, 10) will return a view with RGB pixels, even though the view’s type has no fully-qualified name.
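The inference can be illustrated with a small stand-in for solid. This version is written directly as a local struct, which is an assumption about its shape and not necessarily how the package implements it; RGB is a hypothetical color type.

```d
alias ViewColor(T) = typeof(T.init[0, 0]);

// Hypothetical color type, for illustration only.
struct RGB { ubyte r, g, b; }

/// Returns a view of the given size in which every pixel is c.
auto solid(COLOR)(COLOR c, int w, int h)
{
    static struct Solid
    {
        int w, h;
        COLOR c;

        COLOR opIndex(int x, int y) { return c; }
    }
    return Solid(w, h, c);
}

unittest
{
    auto v = solid(RGB(1, 2, 3), 10, 10);

    // COLOR was deduced from the argument; nothing was spelled out.
    static assert(is(ViewColor!(typeof(v)) == RGB));
    assert(v[7, 3] == RGB(1, 2, 3));
}
```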

Another thing we can do with this model is create views that transform other views in some way or another. We define another template mixin for the common code:
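One way such a mixin could look is sketched below. The Image, Constant and HFlip types and the hflipped helper are hypothetical, added only to exercise the mixin; the interface tests repeat the assumptions of the earlier sketches.

```d
enum isView(T) =
    is(typeof(T.init.w) : size_t) &&
    is(typeof(T.init.h) : size_t) &&
    is(typeof(T.init[0, 0]));

alias ViewColor(T) = typeof(T.init[0, 0]);

enum isWritableView(T) =
    isView!T && is(typeof(T.init[0, 0] = ViewColor!T.init));

/// Implements pixel access on top of another view. The host type
/// must supply a warp(ref int x, ref int y) coordinate transform.
mixin template Warp(V)
    if (isView!V)
{
    V src;

    ViewColor!V opIndex(int x, int y)
    {
        warp(x, y);
        return src[x, y];
    }

    // Only generated when the underlying view accepts writes.
    static if (isWritableView!V)
    ViewColor!V opIndexAssign(ViewColor!V value, int x, int y)
    {
        warp(x, y);
        return src[x, y] = value;
    }
}

// Hypothetical writable view.
struct Image(COLOR)
{
    int w, h;
    COLOR[] pixels;

    COLOR opIndex(int x, int y) { return pixels[y * w + x]; }
    COLOR opIndexAssign(COLOR c, int x, int y) { return pixels[y * w + x] = c; }
}

// Hypothetical read-only view.
struct Constant
{
    int w, h;
    int opIndex(int x, int y) { return 42; }
}

// Hypothetical transforming view built on the mixin.
struct HFlip(V)
{
    mixin Warp!V;

    @property int w() { return src.w; }
    @property int h() { return src.h; }

    void warp(ref int x, ref int y) { x = w - x - 1; }
}

auto hflipped(V)(V src) { return HFlip!V(src); }

// Writability follows the underlying view.
static assert( isWritableView!(HFlip!(Image!int)));
static assert(!isWritableView!(HFlip!Constant));
```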

Note the line static if (isWritableView!V). It indicates that the view[x, y] = c operation should only be defined if the underlying view src supports it. This way, the warped view will only be writable if the underlying view is.

With this in hand, we can implement a cropping view, which presents a rectangular portion of another view:
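A sketch of such a cropping view, under the same assumptions as before. The Warp mixin is reduced to read-only access to keep the example short, and Image is a hypothetical view with a scanline method.

```d
enum isView(T) =
    is(typeof(T.init.w) : size_t) &&
    is(typeof(T.init.h) : size_t) &&
    is(typeof(T.init[0, 0]));

alias ViewColor(T) = typeof(T.init[0, 0]);

enum isDirectView(T) =
    isView!T && is(typeof(T.init.scanline(0)) : ViewColor!T[]);

// Read-only version of the Warp mixin, for brevity.
mixin template Warp(V)
    if (isView!V)
{
    V src;

    ViewColor!V opIndex(int x, int y)
    {
        warp(x, y);
        return src[x, y];
    }
}

/// Presents the rectangle [x0, x1) x [y0, y1) of src as a view.
auto crop(V)(auto ref V src, int x0, int y0, int x1, int y1)
    if (isView!V)
{
    assert(0 <= x0 && 0 <= y0);
    assert(x1 <= src.w && y1 <= src.h);
    assert(x0 <= x1 && y0 <= y1);

    static struct Crop
    {
        mixin Warp!V;

        int x0, y0, x1, y1;

        @property int w() { return x1 - x0; }
        @property int h() { return y1 - y0; }

        void warp(ref int x, ref int y) { x += x0; y += y0; }

        // Row access is offered only if the source offers it.
        static if (isDirectView!V)
        ViewColor!V[] scanline(int y)
        {
            return src.scanline(y0 + y)[x0 .. x1];
        }
    }

    return Crop(src, x0, y0, x1, y1);
}

// Hypothetical direct view, for illustration only.
struct Image(COLOR)
{
    int w, h;
    COLOR[] pixels;

    COLOR[] scanline(int y) { return pixels[y * w .. (y + 1) * w]; }
    COLOR opIndex(int x, int y) { return scanline(y)[x]; }
}
```

No pixels are copied: the cropped view only stores the source view and four integers.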

The if (isView!V) template constraint verifies that the first argument satisfies the conditions of the isView interface.


As above, crop uses isDirectView to provide direct pixel access if the underlying image supports it. Direct pixel access is useful wherever processing pixels in bulk is faster than accessing each pixel in turn. For example, when blitting one image to another, it is much faster to use array slice copies (D’s type-safe equivalent of memcpy) than to assign each pixel individually:
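A blitting routine along these lines might look as follows. The name blitTo and the Image and Gradient types are assumptions for this sketch; the point is the static if that selects whole-row slice copies when both views are direct.

```d
enum isView(T) =
    is(typeof(T.init.w) : size_t) &&
    is(typeof(T.init.h) : size_t) &&
    is(typeof(T.init[0, 0]));

alias ViewColor(T) = typeof(T.init[0, 0]);

enum isWritableView(T) =
    isView!T && is(typeof(T.init[0, 0] = ViewColor!T.init));

enum isDirectView(T) =
    isView!T && is(typeof(T.init.scanline(0)) : ViewColor!T[]);

/// Copies every pixel of src into dst. Both must have the same size.
void blitTo(SRC, DST)(auto ref SRC src, auto ref DST dst)
    if (isView!SRC && isWritableView!DST)
{
    assert(src.w == dst.w && src.h == dst.h, "View size mismatch");
    foreach (y; 0 .. src.h)
    {
        static if (isDirectView!SRC && isDirectView!DST)
            dst.scanline(y)[] = src.scanline(y)[]; // one copy per row
        else
            foreach (x; 0 .. src.w)
                dst[x, y] = src[x, y];             // one assignment per pixel
    }
}

// Hypothetical direct, writable view.
struct Image(COLOR)
{
    int w, h;
    COLOR[] pixels;

    COLOR[] scanline(int y) { return pixels[y * w .. (y + 1) * w]; }
    COLOR opIndex(int x, int y) { return scanline(y)[x]; }
    COLOR opIndexAssign(COLOR c, int x, int y) { return scanline(y)[x] = c; }
}

// Hypothetical computed view without scanline access.
struct Gradient
{
    int w, h;
    int opIndex(int x, int y) { return x + y; }
}
```

The choice between the two loops is made at compile time, so the fast path carries no runtime check.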

The same idea as crop can be used to implement a view that tiles another, or does nearest-neighbor scaling. (More complicated scaling algorithms are better implemented in an imperative style.) The code is similar to crop, so I have not included it here.

Even though crop takes the source as a regular argument, the intended usage of the function and its siblings is as if it were a method of the source view: someView.nearestNeighbor(100, 100).tile(1000, 1000).crop(50, 50, 950, 950). This capability is provided by a language feature called “Uniform Function Call Syntax” (UFCS for short), which allows writing a.fun(b...) instead of fun(a, b...). Its biggest benefit is that it allows chaining (a.fun1().fun2().fun3() instead of fun3(fun2(fun1(a)))), which both Phobos and this package take advantage of.
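UFCS itself is easy to demonstrate in isolation. The two functions below are hypothetical and have nothing to do with images; they only show that the chained and nested spellings are the same call.

```d
// Ordinary module-level functions; neither is a member of int.
int twice(int x) { return x * 2; }
int plus(int x, int n) { return x + n; }

unittest
{
    // a.fun(b) is rewritten to fun(a, b) when a has no such member.
    assert(3.twice().plus(1) == plus(twice(3), 1));
    assert(3.twice().plus(1) == 7);
}
```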

For simple transformations which do not change the view size, we can define a helper function that simply applies a user-specified formula to each pixel’s coordinates:

warp uses a tricky method of inspecting the user-supplied formulas. The function testWarpY is declared as a template, yet with zero template arguments – this will cause the compiler to not perform semantic analysis on its body until it is instantiated. And since the function does not have an x symbol in its scope, it can only be instantiated successfully if the yExpr expression does not refer to x. The __traits(compiles, testWarpY()) static expression checks for just that. This allows us to define the direct view scanline primitive only if we can be sure that we can do so safely. Example:
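The helper and two flips defined with it could be sketched as below. This is a reconstruction from the description above, not a copy of the package's source; Image is a hypothetical direct view and the Warp mixin is reduced to read-only access.

```d
enum isView(T) =
    is(typeof(T.init.w) : size_t) &&
    is(typeof(T.init.h) : size_t) &&
    is(typeof(T.init[0, 0]));

alias ViewColor(T) = typeof(T.init[0, 0]);

enum isDirectView(T) =
    isView!T && is(typeof(T.init.scanline(0)) : ViewColor!T[]);

mixin template Warp(V)
    if (isView!V)
{
    V src;

    ViewColor!V opIndex(int x, int y)
    {
        warp(x, y);
        return src[x, y];
    }
}

/// Returns a view of src with coordinates transformed by the formulas.
template warp(string xExpr, string yExpr)
{
    auto warp(V)(auto ref V src)
        if (isView!V)
    {
        static struct Warped
        {
            mixin Warp!V;

            @property int w() { return src.w; }
            @property int h() { return src.h; }

            void warp(ref int x, ref int y)
            {
                auto nx = mixin(xExpr);
                auto ny = mixin(yExpr);
                x = nx; y = ny;
            }

            // Zero-argument template: the body is only analysed when
            // instantiated, and there is no x in scope here.
            private void testWarpY()()
            {
                int y;
                y = mixin(yExpr);
            }

            // Whole rows can be forwarded if x is untouched and the
            // y formula does not mention x.
            static if (xExpr == "x" &&
                __traits(compiles, testWarpY()) &&
                isDirectView!V)
            ViewColor!V[] scanline(int y)
            {
                return src.scanline(mixin(yExpr));
            }
        }

        return Warped(src);
    }
}

/// View of src with the x coordinate inverted.
alias hflip = warp!(q{w-x-1}, q{y});

/// View of src with the y coordinate inverted.
alias vflip = warp!(q{x}, q{h-y-1});

// Hypothetical direct view, for illustration only.
struct Image(COLOR)
{
    int w, h;
    COLOR[] pixels;

    COLOR[] scanline(int y) { return pixels[y * w .. (y + 1) * w]; }
    COLOR opIndex(int x, int y) { return scanline(y)[x]; }
}
```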

The q{...} syntax is just a fancy way to declare a string literal. This syntax is conventionally used to contain D code, which is usually later mixin’d somewhere. The expression has access to all the symbols at the mixin site – in our case, the warp and testWarpY methods of the Warped struct.

Since vflip satisfies the first two conditions needed to declare the scanline method, someView.vflip() will be a direct view if someView is. This was achieved without special-casing the vflip declaration.

Because the abstraction we use does not rely on runtime polymorphism, the compiler is free to inline calls across all transformation layers. Flipping an image horizontally twice is a no-op – and, indeed, i[5, 5] and i.hflip().hflip()[5, 5] produce identical machine code. D compilers with more advanced backends can perform more advanced optimizations: for example, if we define a flipXY function which flips the X and Y axes of a view, and a rotateCW function (to rotate an image 90° clockwise) as src.flipXY().hflip(), then four successive rotateCW calls (a full 360°) get optimized away to nothing.

Let’s move on to performing operations on the pixels themselves. std.algorithm’s flagship function is map, which returns a range which lazily applies an expression over another range. Our colorMap applies the idea to colors:
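A sketch of what such a colorMap could look like, using std.functional.unaryFun with the parameter name c. The L8, RGB, BGRX and Image types and the toBGRX function are hypothetical, added so that the example is self-contained.

```d
import std.functional : unaryFun;

enum isView(T) =
    is(typeof(T.init.w) : size_t) &&
    is(typeof(T.init.h) : size_t) &&
    is(typeof(T.init[0, 0]));

alias ViewColor(T) = typeof(T.init[0, 0]);

/// Returns a view which lazily applies pred to each pixel's color.
template colorMap(alias pred)
{
    alias fun = unaryFun!(pred, "c");

    auto colorMap(V)(auto ref V src)
        if (isView!V)
    {
        // The output color is whatever pred returns.
        alias NewColor = typeof(fun(ViewColor!V.init));

        static struct Map
        {
            V src;

            @property int w() { return src.w; }
            @property int h() { return src.h; }

            NewColor opIndex(int x, int y) { return fun(src[x, y]); }
        }
        return Map(src);
    }
}

// Hypothetical 8-bit luminance color with a complement operator.
struct L8
{
    ubyte l;

    L8 opUnary(string op)() if (op == "~")
    {
        return L8(cast(ubyte)(~cast(int) l));
    }
}

// Hypothetical color types and conversion function.
struct RGB { ubyte r, g, b; }
struct BGRX { ubyte b, g, r, x; }
BGRX toBGRX(RGB c) { return BGRX(c.b, c.g, c.r); }

// Hypothetical image type.
struct Image(COLOR)
{
    int w, h;
    COLOR[] pixels;

    COLOR opIndex(int x, int y) { return pixels[y * w + x]; }
}

alias invert = colorMap!q{~c};
```

Because the output color type is derived from the predicate's return type, img.colorMap!toBGRX() on an RGB image yields a view whose pixels are BGRX.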

With colorMap, declaring a function which inverts an image’s colors is as simple as:

alias invert = colorMap!q{~c};

colorMap does not require that the input and output colors have the same type. This allows using it for color type conversion: read("image.bmp").parseBMP!RGB().colorMap!(c => BGRX(c.b, c.g, c.r)) will present an RGB bitmap as a BGRX view.

This program draws the initial image in higher resolution, using 16-bit luminance, which is later converted to 8-bit sRGB after downscaling. The downscaling from a higher resolution is done to avoid aliasing, and the gamma conversion is required for accurate resizing.

This program’s use of the recurrence, take, and countUntil D range primitives, as well as D’s native support of complex numbers, allows a much terser implementation of an algorithm usually requiring a dozen lines to implement. (Built-in complex number support is in the process of being deprecated in favor of std.complex, though.)
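The range-based core of such a calculation can be sketched on its own. This version substitutes std.complex for the built-in complex types, and the function name escapeTime is an assumption; it is not the program described above, only an illustration of the recurrence, take and countUntil combination.

```d
import std.algorithm.searching : countUntil;
import std.complex : Complex, complex, sqAbs;
import std.range : recurrence, take;

/// Index of the first term of z[0] = 0, z[n] = z[n-1]^2 + c whose
/// magnitude exceeds 2, or -1 if none of the first `limit` terms does.
ptrdiff_t escapeTime(Complex!double c, size_t limit)
{
    return recurrence!((z, n) => z[n - 1] * z[n - 1] + c)(complex(0.0, 0.0))
        .take(limit)                        // bound the infinite sequence
        .countUntil!(z => sqAbs(z) > 4.0);  // |z| > 2, without the sqrt
}

unittest
{
    // c = 0 stays at the origin forever.
    assert(escapeTime(complex(0.0, 0.0), 100) == -1);

    // c = 1 yields 0, 1, 2, 5: the term at index 3 is the first to escape.
    assert(escapeTime(complex(1.0, 0.0), 100) == 3);
}
```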


The templated approach promises great performance advantages. As a simple benchmark, this program downscales a directory of images by 25%:

The D program runs about 4-5 times quicker. Of course, it’s not a fair comparison: even though both use 16-bit color depth, gamma correction, multithreading, and target the same CPU architecture, the D program consists of code specifically optimized for this task. Short of some sort of JIT, this cannot be matched by generic image-processing libraries.

The graphics package is available on GitHub. Thanks to David Ellsworth for his input to this article.


8 thoughts on “Functional image processing in D”

This was a very fun and interesting read. I’ll leave a comment on here because I know what it’s like to offer a decent comment system and then have nobody use it.

Do you intend to add parseJPG and toJPG functions? Having these functions would nicely round off the library into making it very useful for image processing for say, a website. I have a few websites I work on where conversion to and from .GIF, .PNG, and .JPG happens a lot through the Python library PIL. It would be nice to see this implemented in D in a memory-efficient manner. It looks like you are most of the way there already.

Image formats such as JPG or PNG are not trivial to implement, and I’m not sure much can be gained from reimplementing them in D. I’ll most likely add modules to the package which wrap existing image libraries, e.g. libpng or libjpeg. There is one such module in the package already, for SDL_image.

This was very interesting from a library design standpoint. My understanding of D is very basic and outdated; I didn’t even know that interfaces could work like that. It reminds me of Golang’s philosophy of dealing with interfaces and implementations.
Overall a very good read, thanks!

I’m curious which plugin you’re using for D code snippets. I’ve tried some on my D blog in the past with disappointing results (and not enough regexfu to do anything about it). What you have here looks good.