Menu 1

Tag Archives: C++

Warning: This post contains personal thoughts about my research work in image processing. I’ll discuss about some of the issues I’m facing as an active developer of two open-source image processing frameworks (namely CImg and G’MIC). So keep in mind this will be a bit self-centered. There are high chances you find all this really boring if you are not a developer of image processing software yourself (and even if so). Anyhow, feel free to give your impressions after the reading!

1. Context and issues

In imaging science, image processing is processing of images using mathematical operations by using any form of signal processing for which the input is an image, such as a photograph or video frame.

That’s what Wikipedia says about image processing. Selecting and ordering those mathematical operations is what actually defines algorithms, and implementing ready-to-use and interesting image processing algorithms is actually one of my goals, as well as making them available for interested users afterwards.

After all those years (>10) as a researcher in the image processing field (and like most of my colleagues), I can say I’ve already implemented a lot of these different algorithms. Mostly in C++ as far as I’m concerned. To be more precise, an important part of my work is even to design (and hopefully publish) my own image processing methods. Most of the time of course, my trials end up with clunky, ineffective and slow operators which give nothing interesting else than knowing the approach is not good enough to be followed. Someone who says everything he tries works right the first time is a liar. Step by step, I try to refine/optimize my prototypes or sometimes even take a completely different direction. Quickly, you realize that it is crucial in this job not to waste time when doing algorithm prototyping because the rate of success is in fact very low.

That’s actually one of the reason why I’ve started the G’MIC project. It was primarily designed as a helper to create and run custom image processing pipelines quickly (from the shell, basically). It saves me time, everyday. But the world of image processing algorithms is broad and sometimes you need to experiment with very low-level routines working at a pixel scale, trying such weird and unexpected stuffs that none of the “usual” image processing algorithms you already have in your toolbox can be of use as it is. Or it is used in a so diverted way that it gets hard to even think about using it adequately. In a word, your pixel-level algorithm won’t be expressed as a simple pipeline (or graph if you want to call it so) of macro-scale image processing operators. That’s the case for instance with most of the well known patch-based image processing algorithms (e.g. Non-Local-Means, or PatchMatch and plenty of variants), where each pixel value of the resulting image is computed from (a lot of) other pixel values whose spatial locations are sometimes not evenly distributed (but not random as well!).

Until now, when I was trying to implement this kind of algorithms, I was resigned to go back coding them in C++: It is one language I feel comfortable with, and I’m sure it will run fast enough most of the time. Indeed, computation time is often a bottleneck in image processing. Some of my colleagues are using scripting languages as Matlab or Python for algorithm prototyping. But they often need some tricks to avoid writing explicit code loops, or need to write at least some fast C/C++ modules that will be compiled and run from those higher-level interfaces, to ensure they get something fast enough (even for prototyping, I’m definitely not talking about optimized production code here!).

But, I’m not really satisfied with my C++ solution: Generally, I end up with several small pieces of C++ sources I need to compile and maintain. I can hardly re-use them in a bigger pipeline, or redistribute them as clean packages without a lot of extra work. Because they are just intended to be prototypes: They often have only basic command-line interfaces and thus cannot be directly integrated into bigger and user-friendly image processing frameworks. Making a prototype algorithm really usable by others requires at least to wrap it as a plug-in/module for [..copy the name of your favorite image processing tool or language here..]. This generally represents a lot of boring coding work, that may even require more time and efforts than writing the algorithm itself! And I don’t talk about maintenance. If you’ve ever tried to maintain a 10-year old C++ prototype code, lost in one of your sub-sub-sub-sub folder in your $HOME, you know what I mean. I’d definitely prefer a simpler solution that let me spend more time on writing the algorithm itself than packaging it or making it usable. After all, the primary purpose of my work is to create cool algorithms, not really coding user interfaces for them. On the other way, I am a scientist and I’m also happy to share my discoveries with users (and possibly get feedback from them!). How to make those prototyped algorithms finally usable without spending too much time on making them usable ? 🙂

Ideally, I’d like something that could nicely integrate into G’MIC (my favorite framework for doing image processing stuffs, of course 🙂 ). Even if at the end, those algorithms run a bit slower than they are in C++. One could suggest to make them as Octave or Scilab scripts/modules. But I’m the developer of G’MIC, so of course, I’d prefer a solution that help to extend my own project.

So finally, how could I code prototypes for new algorithms working at a pixel-level and make them readily available in G’MIC ? This question has worried me for a long time.

2. Algorithm code viewed as a complex math expression

This command fills each pixel of a given image with the value evaluated from a “mathematical expression”. A mathematical expression being a quite vague concept, it appears you can already write some complex formulas. For instance, typing this on the command line:

Fig.1. One synthetic color image obtained by one application of the G’MIC command -fill.

Of course, the specified expression can refer to pixels of an existing input image. And so, it can modify the pixels of an image as well, as in the following example:

$ gmic leno.png -fill "(abs(i(x+1,y)-i(x-1,y)))^0.25"

which computes gamma-corrected differences of neighboring pixels along the X-axis, as shown below:

Fig.2.1. Original image leno.png

Fig.2.2. Result of the -fill command described above.

(As an aside, let me tell you I’ve recently received e-mails and messages from people who claim that using the image of our beloved Lena to illustrate an article or a blog post is “sexist” (someone even used the term “pornographic”…). I invite you reading the Lena story page if you don’t know why we commonly use this image. As I don’t want to hurt the over-sensibility of these people, I’ll be using a slight variation I’ve made by mixing a photograph of the blessed Jay Leno with the usual image of Lena. Let me call this the Leno image and everyone will be happy (but seriously, get a life!)).

So, as you can imagine, the command -fill already allows me to do a lof of complex and exotic things on images at a pixel level. Technically speaking, it uses the embedded math parser I’ve written for the CImg Library, a C++ open-source image processing library I’ve been developing since 1999 (and on which G’MIC is based). This math parser is quite small (around 1500 lines of C++ code) and quite fast as well, when it is applied on a whole image. That’s mainly because:

1. It uses parallelization (thanks to the use of OpenMP directives) to evaluate expressions on blocs of image pixels in a multi-threaded way.

2. Before being evaluated, the given math expression is pre-compiled on-the-fly by CImg, into a sequence of bytecodes. Then, the evaluation procedure (which is done for the whole image, pixel by pixel) requires only that bytecode sequence to be interpreted, which is way faster than parsing the input mathematical expression itself.

Anyway, I thought the complexity of the pixel-level algorithms I’d like to implement was really higher than just the evaluation of a mathematical formula. But wait… What is missing actually ? Not much more than loops and updatable variables… I already had variables (though non-updatable) and conditionals. Only loops were really missing. That looks like something I could try adding during my summer holidays, isn’t it ? 😉 So, that is where my efforts were focused on during these last weeks: I’ve added new functions to the CImg math parser that allow users to write their own loops in mathematical expressions, namely the functions dowhile(expr,_cond), whiledo(cond,expr) and for(init,cond,expr_end,_expr_body). Of course, it made me also review and re-implement large parts of the math parser code, and I took this opportunity to optimize the whole thing. A new version of the math parser has been made available for the release of G’MIC 1.6.5.2 at the end of August. I’m still working on this expression evaluator in CImg and new improvements and optimizations are ready for the upcoming version 1.6.6.0 of G’MIC(soon to be released).

3. A toy example: Julia fractals

So, what can we do now with these new math features in G’MIC ? Let me illustrate this with a toy example. The following custom G’MIC command renders a Julia fractal. To test it, you just have to copy/paste the following lines in a regular text file user.gmic:

and invoke the new command -juliar_expr it defines by typing this in the terminal:

$ gmic user.gmic -julia_expr

Then, you’ll get this 1024×1024 color image:

Fig.3. Rendering of a Julia fractal only by filling an image with a complex math expression, containing an iteration loop.

As you see, this custom user command -julia_expr is very short and is mainly based on the invokation of the -fill command of G’MIC. But the coolest thing of all happens when we look at the rendering time of that function. The timing measure has been performed on an ASUS laptop with a dual-core HT i7 2Ghz. This is what I get:

Edit : This post has been edited, on 09/20/2015 to reflect the new timings due to math parser optimization done after the initial post of this article.

Less than 0.7 second to fill a 1024×1024 image where each of the 1,048,576 pixels may require up to 256 iterations of a computation loop ? Definitely not bad for a prototyped code written in 5 minutes and which does not require compilation to run! Note that all my CPUs have been active during the computation. Trying the same G’MIC code on my machine at work (a powerful 4x 3-core HT Xeon 2.6Ghz) makes the same render in 0.176 second only!

But of course, one could say:

Why not using the native G’MIC command -mandelbrot instead (here, native means hard-coded as a C++ function) ? It is probably way faster!

That’s 12 times faster, than the previous command -julia_expr run on my laptop, indeed! A bit reassuring to know that C++ compiled into assembly code is faster than CImg home-made bytecode compiled on the fly 😉

But the point is: Suppose now I want to slightly modify the rendering of the fractal, i.e. I don’t want to display the maximum iteration anymore for each pixel (variable iter), but the latest value of the variable zi before the divergence test occurs. Look how simple this is to create a slightly modified command -julia_expr2 that does exactly what I want to do. I have the full control on what the function does at a pixel level:

and this modified algorithm renders the image below (still in 0.7 second of course):

Fig.4. Slightly modified version of the Julia fractal by displaying another variable zi in the rendering algorithm.

Without these new loop features introduced in the math parser, I would have been forced to do one of these two things in G’MIC, to get the same result:

Either, add some new options to the native command -mandelbrot to allow this new type of visualization. This basically means: Writing new pieces of C++ code, compiling a new version of G’MIC with these added features, package it and release it to make this available for everyone. Even if I already have some decent release scripts, this implies a lot of extra-work and packaging time. This is not like the user can get the new feature in a few minutes (if you’ve already used the filter update mechanism present in the G’MIC plug-in for GIMP, you know what I mean). And I don’t speak about all the possibilities I couldn’t think of (but the user will obviously need one day 🙂 ), when adding such new display options to a native command like -mandebrot…

Or, write a new G’MIC custom script able to compute the same kind of result. This would be indeed the best way to make it available for other people quickly. But here, as the algorithm is very specific and works at a pixel level, doing it as a pipeline of macro-operators is quite a pain. It means I would have to use 3 nested -repeat...-done loops (which are basically the loops commands used in G’MIC pipelines) and it would probably take ages to render, as a G’MIC pipeline is always purely interpreted without any pre-compilation steps. Even by using multi-threading, it would have been a nightmare to compute here.

In fact, the quite long math expression we are using in command -julia_expr2 defines one complex algorithm as a whole, and we know it will be pre-compiled into a sequence of bytecodes by CImg before being evaluated for each of the 1024×1024=1,048,576 pixels that compose the image. Of course, we are not as fast as a native C++ implementation of the same command, but at the same time, we gain so much flexibility and genericity in the stuffs we can do now, that this disadvantage is easily forgiven. And the processing time stays reasonable. For fast algorithm prototyping, this feature seems to be incredibly nice! I won’t be forced to unsheathe my C ++ compiler every time I want to experiment some very specific image processing algorithms working at the pixel level.

4. A more serious example: the “Non-Local Means”

The Non-Local-Means is a quite famous patch-based denoising/smoothing algorithm in image processing, introduced in 2005 by A. Buades (beware, his home page contains images of Lena, please do not click if you are too sensitive!). I won’t enter in all the implementation details, as several different methods has been proposed in the literature just for implementing it. But one of the most simple (and slowest) technique requires 4 nested loops per image pixel. What a good opportunity to try writing this “slow” algorithm using the G’MIC-fill function! It took me less than 10 minutes, to be honest:

which results on these two images displayed on the screen, the noisy version (left), and the denoised one using the Non-Local-Means algorithm (right). Of course, the timing may differ from one machine to another. I guess my 3 seconds run here is seemly (tested on my powerful PC at the lab). It still takes less than 20 seconds on my laptop. A crop of the results are presented below. The initial Leno image is a 512×512 RGB image, and the timing has been measured for processing the whole image, of course.

Fig.5.1. Crop of a noisy version of the Leno image, degraded with gaussian noise, std=20.

Fig.5.2. Denoised version using the NL-means algorithm (custom command -nlmeans_expr).

Here again, you could argue that the native G’MIC command -denoise does the same thing and is faster to run. It is, definitely. Jérome Boulanger (one very active G’MIC contributor) has written a nice custom command -nlmeans that implements the NL-means with a smarter algorithm (avoiding the need for 4 nested loops per pixel) which runs even faster (and it is already available in the plug-in for GIMP). But that’s not the point. What I show here is I’m now able to do some (relatively fast) prototyping of algorithms working at a pixel level in G’MIC, without having to write and compile C++ code. But the best of all is about integration: if the algorithm appears to be interesting/effective enough, I can add it to the G’MIC standard library in a few minutes, and quickly create a filter for the GIMP plug-in as well. Let say it right away, probably 5 minutes after I’ve finished writing the first version of the algorithm, the plug-in users should be able to get it and use it on their own images (and give positive/negative feedback to help for future improvements). That’s what I call a smooth, quick and painless integration! And that is exactly the kind of algorithms I couldn’t implement before as custom G’MIC commands running at a decent speed.

To me, it clearly opens exciting perspectives to quickly prototype and integrate new custom image processing algorithms into G’MIC in the future!

5. The Vector Painting filter

In fact, this has happened earlier than expected. I’ve been able to add one of my latest image filter (named Vector painting) in the G’MIC plug-in for GIMP recently. It has been somehow unexpected, because I was just doing some debugging for improving the CImg math expression evaluator. Briefly, suppose you want to determine for each pixel of an image, the discrete spatial orientation of the maximal value variation, with an angular precision of 45°: For each pixel centered in a 3×3 neighborhood, I want to estimate which pixel of the neighborhood has the highest difference with the center pixel (measured as the squared difference between the two pixel values). To make things simpler, I’ve considered doing this on the image luminance only instead of using all the RGB color channels. At the end, I transform each pixel value into a label (an integer in range [1,8]) that represents one of the possible 45°-orientations of the plane. That was typically the kind of problems that would require the use of custom loops working at a pixel level, so something I couldn’t do easily before the loop feature introduced in my math parser (or I would have done the prototype in C++).

The solution to this problem was surprisingly easy to write. Here again, it didn’t take much more than 5 minutes of work:

We get this result (after re-normalization of the label image in range [0,255]). Keep in mind that each pixel of the resulting image is an integer label originally in range [1,8]. And by the way, the computation time is ridiculous here (178ms for this 512×512 image).

Fig.6. Each pixel of the Leno image is replaced by a label saying about which of its 3×3 neighbors is the most different from the central pixel.

It actually looks a bit ugly. But that’s not surprising, as the original image contains noise and you’ll get a lot of small random variations in flat regions. As a result, the labels you get in those regions are noisy as well. Now, what happens when we blur the image before computing the labels? That should regularize the resulting image of labels as well. Indeed:

$ gmic user.gmic leno.png --blur 1% -foo

returns this:

Fig.7. Each pixel of the blurred Leno image is replaced by a label saying about which of its 3×3 neighbors is the most different.

That’s interesting! Blurring the input image creates larger portions of constant labels, i.e. regions where the orientation of the maximal pixel variations is the same. And the original image contours keep appearing as natural frontiers of these labelled regions. Then, a natural idea would be to replace each connected region by the average color it overlays in the original color image. In G’MIC, this can be done easily with the command -blend shapeaverage.

$ gmic user.gmic leno.png --blur 1% -foo[-1] -blend shapeaverage

And what we get at the end is a nice piecewise constant abstraction of our initial image. Looks like a “vector painting”, no ? 😉

Fig.8. Result of the “shape average” blending between the original Leno image, and its map of labels, as obtained with command -foo.

As you may imagine, changing the amplitude of the blurring makes the result more or less abstract. Having this, It didn’t take too much time to create a filter that could be run directly from the G’MIC plug-in interface for GIMP. That’s the exact code I wrote to integrate my initial algorithm prototype in G’MIC and make it usable by everyone. It was done in less than 5 minutes, really:

Here again, that is how I conceive things should be done properly: 1. I create a quick algorithm prototype to transform an image into something else. 2. I decide that the algorithm is cool enough to be shared. 3. I add few lines to make it available immediately in the G’MIC image processing framework. What a gain of time compared to the time it would have taken by doing this in C++!

6. Comparison with ImageMagick’s -fx operator

While working on the improvement of my math expression evaluator in CImg, I’ve been wondering if what I was doing was not already existing in ImageMagick. Indeed, ImageMagick is one of the most well established open-source image processing framework, and I was almost sure they had already cope with the kind of questions I had for G’MIC. And of course, they had 🙂

So, they have a special operator -fx expression in convert that seems to be equivalent to what the G’MIC command -fill expression does. And yes, they probably had it for years, long before G’MIC even existed. But I admit I’ve almost completely stopped using ImageMagick tools when I’ve started developing my own C++ image processing library CImg, years ago. All the information you need to use this -fx operator in convert can be found on this documentation page, and even more examples on this page. Reading these pages was very instructive: I’ve noticed some interesting functions and notations they have in their own expression parser that I didn’t already have in mine (so, I’ve added some of them in my latest version of CImg!). Also I was particularly interested by this quote from their pages:

As people developed new types of image operations, they usually prototype it using a slow “-fx” operator first. When they have it worked out that ‘method’ is then converted into a new fast built-in operator in the ImageMagick Core library. Users are welcome to contribute their own “-fx” expressions (or other defined functions) that they feel would be a useful addition to IM, but which are not yet covered by other image operators, if they can be handled by one of the above generalized operators, it should be reasonably easy to add it.(…). What is really needed at this time is a FX expression compiler, that will pre-interpret the expression into a tighter and faster executable form. Someone was going to look into this but has since disappeared.

So it seems their -fx operator is quite slow as it re-parses the specified math expression for each image pixel. And when someone writes an interesting operator with -fx, they are willing to convert it into a C code and integrate this new built-in operator directly in the core ImageMagick library. It seems they don’t really mind adding new nativehard-coded operators into IM, maybe even for very specific/unusual operators (at least they don’t mention it). That’s interesting, because that is precisely what I’m trying to avoid in G’MIC. My impression is that it’s often acceptable to be less efficient if the code we have to write for adding one feature is smaller, easier to maintain/upgrade and does not require releasing a new version to make this particular feature available. Personally, I’d always prefer to write a G’MIC custom command (so, a script that I can directly put in the G’MIC standard library) if possible, instead of adding the same feature as a new “native” built-in command (in C++). But maybe their -fx operator was that slow it was cumbersome to use in practice? I had to try!

And I’m a bit sorry to say this, but yes, it’s quite slow (and I have tested this on my pretty fast machine with 12 HT cores at 2.6Ghz). The ImageMagick -fx operator is able to use multiple cores which is clearly a good thing, but even with that, it is cumbersome to use on reasonably big images, with complex math expressions. In a sense, that reassures me about the usefulness of having developed my own math expression compiler in CImg. This pre-compilation step of the math expression into a shorter bytecode sequence seems then to be almost mandatory. I’ve done a quick timing comparison for some simple image effects that can be achieved similarly with both expression evaluators of G’MIC and ImageMagick. Most of the examples below have been actually taken from the -fx documentation pages. I’m dividing and multiplying my image values by 255 in the G’MIC examples below because ImageMagick formulas assume that RGB values of the pixels are defined in range [0,1]. These tests have been done with a high-resolution input image (of a motorbike) with size 3072×2048, in RGB mode. I’ve checked that both the ImageMagick and G’MIC invokations render the same images.

The pre-compilation of the math expressions clearly makes a difference!

I would be really interested to compare the expression evaluators for more complex expressions, as the one I’ve used to compute Julia fractals with G’MIC for instance. I don’t have a deep knowledge of the ImageMagick syntax, so I don’t know what would be the equivalent command line. If you have any idea on how to do that, please let me know! I’d be interested also to get an idea on how Matlab is performing for the same kind of equations.

7. Conclusion and perspectives

What I conclude from all of this ? Well, I’m actually pretty excited by what the latest version of my expression evaluator integrated in G’MIC / CImg can finally do. It looks like it runs at a decent speed, at least compared to the one used in ImageMagick (which is definitely a reference project for image processing). I had also the idea of comparing it with GraphicsMagick, but I must admit I didn’t find the same -fx operator in it. And I didn’t find something similar (Maybe you could help teaching me how it works for GraphicsMagick?).

I’ve been already able to propose one (simple) artistic filter that I find interesting (Vector painting), and I’m very confident that these improvements of the math expression evaluator will open a lot of new possibilities for G’MIC. For allowing the design of new filters for everyone of course, but also to make my algorithm prototyping work easier and faster.

Could it be the beginning of a new boost for G’MIC? What do you think?

After writing my last article on the Easy (user-guided) foreground extraction algorithm, I realized that you could maybe think I was exaggeratedly arguing the whole algorithm can be re-coded very quickly from scratch. After all, I’ve just illustrated how the things work by using G’MIC command lines, but there is already a lot of image processing algorithms implemented in G’MIC, so it is somehow a biased demonstration. So let me just give you the corresponding C++ code for the algorithm. Here again, you may find I cheat a little bit, because I use some functions of a C++ image processing library (CImg) that actually doe most of the hard implementation work for me. But:
1. You can use this library too in your own code. The CImg Library I use here works on multiple platforms and has a very permissive license, so you probably don’t have to re-code the key image processing algorithms by yourself. And even if you want to do so, you can still look at the source code of CImg. It is quite clearly organized and function codes are easy to find (the whole library is defined in a single header file CImg.h).
2. I never said the whole algorithm could be done in few lines, only that it was easy to implement 🙂 So, dear C++ programmer, here is the simple prototype you need to implement the algorithm:

To compile it, using g++ on Linux for instance, you have to type something like this in a shell (I assume you have all the necessarily headers and libraries installed):

$ g++ -o foo foo.cpp -Dcimg_use_png -lX11 -lpthread -lpng -lzsses

Now, execute it:

$ ./foo

and you get this kind of result:

Once you have the resulting image of propagated labels, it is up to you to use it the way you want to split the original image into foreground/background layers or keep it as a mask associated to the color image. The point is that even if you add the code of the important CImg methods I’ve used here, i.e. CImg::get_blur(), CImg::get_gradient() and CImg::watershed(), you won’t probably exceed about 500 lines of C++ code. Yes, instead of a single line of G’MIC code. Now you see why I’m also a big fan of coding image processing stuffs directly in G’MIC instead of C++ ;). A last remark before ending this addendum: Unfortunately, the label propagation algorithm itself is hardly parallelizable. It is based on a priority queue and basically, the choice of a new label to set depends on how the latest label has been set. This is a sequential algorithm by nature. So, when processing large images, a good idea is to do the computations and preview the foreground extraction result only on a smaller preview of the original image. Then compute the full-res mask only once, after all key points have been placed by the user. Well that’s it, happy coding!