Generic: Min and Max Redivivus

By Andrei Alexandrescu, April 01, 2001

C++ is a remarkably expressive language, and a large part of that expressiveness is due to templates. Still, there are some things that are hard to do well in C++, seemingly much harder than they ought to be. Nothing illustrates this state of affairs as clearly as the simple min and max functions from the Standard C++ library. These functions have traditionally been implemented with macros, and macros are generally frowned upon by C++ gurus. But to get the equivalent functionality without macros requires dozens of lines of hard-to-read code. In this installment, Andrei Alexandrescu presents a couple template implementations of min and max, answering a challenge laid down by Scott Meyers back in 1995. These implementations illustrate just how hard it is to beat the simple macros. This leads Andrei to speculate about what is wrong with programming languages today and whether we have been led astray in adopting a mindset that views everything as an "object."

April 2001 C++ Experts Forum/Generic<Programming>

In January 1995, Scott Meyers, in his C++ Report article entitled "min, max, and more" [1], makes a challenge to the C++ community. After a careful analysis of the macro-based implementation of min and max and a comparison with the (back then) state-of-the-art template-based implementations, he concludes:

What, then, is the best way  the correct way  to implement max? In the immortal words of Tevye, "I'll tell you: I don't know." Faced with the above analysis, I increasingly find myself telling people that the macro approach may well be best, and I hate macros. If you know of a superior way to implement max, please let me know.

To the best of my knowledge, the challenge is as valid today as it was six years ago, and this article takes the challenge. But before I start, let's wrap up the previous installment of Generic<Programming> [2].

Volatile Substance Abuse?

I have received much input following my February column "Generic<Programming>: volatile  Multithreaded Programmer's Best Friend" [2]. As fate sometimes has it, I received most of the kudos in form of private email, while the gripes went to the Usenet newsgroups comp.lang.c++.moderated and comp.programming.threads. The ensuing debates have been fiery and lengthy, and if you have an interest in the subject, you may want to check them out. The thread is entitled "volatile, was: memory visibility between threads."

I know I learned a lot from that thread. For one thing, the Widget example in the opening of the article is irrelevant. To make a long story short, there are systems (such as the POSIX-compliant ones) on which the volatile modifier is not needed, and there are other systems on which adding volatile will not help, leaving the program incorrect.

The most important problem with volatile correctness is that it relies on POSIX-like mutexes, and there are multiprocessor systems on which mutexes are not enough  you have to use memory barriers.

Another more philosophical problem is that strictly speaking, casting volatile away off a variable is illegal, even if it's you who added the volatile qualifier yourself to apply volatile correctness! As Anthony Williams notes, a system could conceivably store volatile data in a different storage than non-volatile data so that casting addresses would behave erratically.

Yet another critique was that volatile correctness, while it can solve race conditions at a lower level, cannot properly detect higher-level, logical race conditions. For example, say you have an mt_vector class template that emulates an std::vector, but has properly synchronized member functions. Consider:

The intent is to remove the last element in a vector, if any. The code above acts perfectly kosher in a single-threaded environment. If, however, you use mt_vector in a multithreaded program, the code might throw an exception even though empty and pop_back are properly synchronized. So the coherence of the low-level data (vec) is properly preserved, yet the higher-level operation is erratic.

At any rate, after all the discussion, I maintain my recommendation of volatile correctness as a valuable tool for detecting race conditions on systems supporting POSIX-like mutexes. But if you work on a multiprocessor system sporting memory-access reordering, you may want to peruse your compiler's documentation first. You know who you are.

And finally, Kenneth Chiu mentions a very interesting paper at
http://theory.stanford.edu/~freunds/race.ps, paper entitled, guess what,
"Type-Based Race Detection for Java." The paper describes how a small number
of additions to Java's type system make it possible for the compiler, in
conjunction with the programmer, to detect race conditions at compile time.

Eppur si muove.

Min and Max

The macro works very nicely in that it's completely generic (supports any expressions for which operator< and operator?: make sense). Unfortunately, min always evaluates one of it arguments twice, and this can lead to a lot of confusion. It just looks too much like a function in usage, and it doesn't behave like one. (For an extended discussion on the problems of min and of macros in general, refer to Herb's discussion [3].)

A simple and effective template-based solution, present in the C++ Standard library, looks like this:

As you see, this solution puts const everywhere (arguments and result), which is one of its problems. Imagine you want to do this: Increase the minimum of floating-point values a and b by two. Then you would want to write:

double a, b;
...
min(a, b) += 2;

This works nicely with the macro-based min, but not with the templated one, because you cannot modify a const object. As Scott notes, adding a second version:

MaxResult<int, short int> supports two conversions: to int& and to short int&. Consequently, the compiler is equally motivated to invoke any of the two overloads of Fun and, like Buridan's donkey, dies in between two equally attractive options. Again, the macro-based solution passes this test, too  the code invokes Fun(int) as you would expect.

Quest for a Type

What would genuinely solve the problem would be a device that, given two types L and R, computes the proper type of min(L, R). For example, if L is char and R is int, the result should be int, and so on. Assuming we have such a device (let's call it MINTYPE), we can write the definitive solution to min as four functions:

That works, provided you write an awful lot of code. There are 14 arithmetic types, and you have to write specializations of MinTraits for all combinations thereof. Then you have to add the const variants in. There are tricks you can do to simplify this task, like using, well, um, macros, but it's still not a very elegant solution.

Even then, the solution is incomplete. You have to take pointers and user-defined classes into account. Plus, what about calling Min for base and derived classes? Consider you define a class Shape and define operator< to order Shape objects by their area.

Now really, wouldn't it be nice if Min invoked above would magically figure out that Rectangle derives from Shape and return a reference to Shape? That would make a lot of sense because a reference to Rectangle is automatically convertible to a reference to Shape.

But... by the time you start to wish this, you dream of more than what the macro could give. The macro doesn't work correctly in the example above, because the expression:

shape < frame ? shape : frame

converts both parts to the same type, so it is equivalent to:

shape < frame ? shape : Shape(frame)

which doesn't do what we want. (Instead, it does a very bad thing called slicing.)

This article implements Min so that you get every nice thing you could have possibly gotten from the macro-based version, plus more. Better yet, the implementation has a reasonable size  about 80 lines of code in all (Max included, too). Interested? Reheat that coffee in the microwave and let's talk.

Loki

Okay, I lied. There are 80 lines of code only if you don't count the library. More specifically, the code uses Loki, a generic library that's featured in my upcoming book [5]. Among other things, Loki provides advanced type manipulation means. The tools in Loki used by Min are:

builds a typelist containing three types and stores it in FloatingPointTypes. Given a typelist such as FloatingPointTypes and an arbitrary type T, you can find out on what position, if any, T is in that typelist by using the compile-time algorithm Loki::TL::IndexOf. For example:

Loki::TL::IndexOf<FloatingPointTypes, double>::value

evaluates to 1. If the type is not found in the typelist, the result is -1.

The second tool we'll use is the Select class template, which is thoroughly described in [7]. In short, Select allows you to select one of two types, based upon a compile-time Boolean constant. For example:

defines SmallInt to be the smallest integral type among wchar_t and short int.

TypeTraits is a class template that makes all kind of deductions about a type, such as "Is this type a pointer? To what does it point to?" etc. We'll use only the NonConstType type definition inside TypeTraits. TypeTraits<T>::NonConstType is a typedef that removes the const qualifier, if any, from T.

Last, but not least, we'll use the Conversion class described in [7], a class that detects whether an arbitrary type can be implicitly converted to another. Conversion is the cornerstone to implement the magic mentioned above regarding Shape and Rectangle.

The MinMaxTraits Class Template

To simplify type computation, I established a simple linear hierarchy on arithmetic types. Basically I put all arithmetic types in a specific order, and I postulated that the type of Min's result is the type that's toward the bottom of the list. Here's the list by the way (ignore the const for now):

In essence, unsigned types come after signed types, larger types come after smaller types, and floating-point types come after integral types. For example, if you pass Min a long int and a double, the result will have type double, because double is after long int in the ArithTypes list.

Now the general algorithm for figuring out Min's result type, if you pass it two non-reference types L and R, is as follows:

Assume the Result is R.

If an R can be implicitly converted to an L, then change Result to L.

If L and R are arithmetic types and R comes after L in Private::ArithTypes above, change Result to R. This step takes care of all the math conversions.

If an L& can be automatically converted to an R& without the conversion involving a temporary, then change Result to R&. This step ensures that calls such as Min(frame, shape) return a Shape&.

If an R& can be automatically converted to an L& without the conversion involving a temporary, then change Result to L&. This step ensures that calls such as Min(shape, frame) return a Shape&.

You can see MinMaxTraits' implementation in the downloadable code. The hardest part is to figure out the "without the conversion involving a temporary" part in the algorithm above.
In essence, T is convertible to U without a temporary if a
reference to non-constT is convertible to a reference to non-constU.

The Min and Max Overloads

There are four Min and four Max overloads, corresponding to the four combinations of const and non-const argument types. To avoid the slicing problem discussed in the Shape/Rectangle example above, Min has a body that's slightly different from the classic a < b ? a : b. Here it is:

Supports both const and non-const arguments (including mixing the two in a single call). Thanks to the four overloads, Min supports any combinations of const and non-const arguments.

Supports arguments of different types (where that makes sense). Min does support arguments of different types and actually has a fair amount of intelligence that's inaccessible to both the macro and the simple templated solution: Min disambiguates between various arithmetic types like a champ and takes initiative in performing conversions that make sense. The conversion selection process (based upon Private::ArithTypes) is under the control of the library writer.

Min works properly with pointers (even pointers pointing to different, but related, types, such as Shape* and Rectangle*). This is due to the first step in the algorithm.

A remarkable feature of Min is that it deduces its result type by using an
algorithm that you can configure, as opposed to staying inside a predefined
type system. If you find the algorithm unsatisfactory, you can tweak it to
do pretty much what you want, including semantics-directed typing. For
example, the minimum between an unsigned char and an unsigned int will
always have the type unsigned char, because unsigned char's range is
included in unsigned int's range. You can achieve such "smart" typing by
changing the type deduction algorithm.

It would all be so nice, but there's a little detail worth mentioning. Sadly, Min doesn't work with any compiler I have access to. In fairness,
each compiler chokes on a different piece of code. I know the code is
correct because a loosely-defined reunion of compilers would compile it, but
then I haven't seen a working example yet. So if you have access to a modern
compiler and could give the code a try, please let me know.

Look Ahead in Anger

These days I'm reading The Age of Spiritual Machines by Ray Kurzweil [8]. Kurzweil argues, and rather convincingly, that in the 2020s you'll be able to buy a machine with the power of a human brain for $1,000.

Well I can't repress a smile when thinking of how people  or maybe myself, hopefully only a bit older and a lot wiser  will look at this article in 20 years. "Amazing, in 2001 these guys were having trouble implementing generically the min and max no-brainers in the most popular programming language of the time. Hah, it took this guy an entire article and some esoteric techniques to get min and max right."

Maybe min and max aren't important? I'd argue to the contrary. Minimum and maximum are two simple concepts present both in math and in the real life. If a programming language is unable to express simple concepts of math and life, then there is something deeply wrong with that language. "Mom, I don't care about vocabulary and grammar. I just want to write poetry!" If you have to throw in a couple of temporaries and write "a < b ? a : b" when you actually think "min(expr1, expr2)", it means that you have a serious problem: you work with a machine that's able to compute the minimum of any two expressions, but is unable to express the concept of minimum.

There is something wrong here, isn't there? And C++ is not the only one to blame. Java and C#, two newer and supposedly superior languages, are utterly impotent at expressing min and max because, you know, min and max are not objects.

Maybe in the future this period will be called the "object frenzy." Who knows... but I can't help asking with chagrin: "Quo Vadis, Programmatorae?"

Acknowledgements

Many thanks are due to all participants in the volatile-related thread on the Usenet, especially Dave Butenhof, Kenneth Chiu, James
Kanze, Kaz Kylheku, Tom Payne, and David Schwartz.

Andrei Alexandrescu is a Development Manager at RealNetworks Inc. (www.realnetworks.com), based in Seattle, WA, and author of the acclaimed book Modern C++ Design. He may be contacted at www.moderncppdesign.com. Andrei is also one of the featured instructors of The C++ Seminar (www.gotw.ca/cpp_seminar).

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task.
However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Video

This month's Dr. Dobb's Journal

This month,
Dr. Dobb's Journal is devoted to mobile programming. We introduce you to Apple's new Swift programming language, discuss the perils of being the third-most-popular mobile platform, revisit SQLite on Android
, and much more!