Boolean variables are stored as 8-bit integers with the value 0 for false and 1 for true.
Boolean variables are overdetermined in the sense that all operators that have Boolean
variables as input check if the inputs have any other value than 0 or 1, but operators that
have Booleans as output can produce no other value than 0 or 1. This makes operations
with Boolean variables as input less efficient than necessary.

Is this still true today, and on which compilers? Can you please give an example? The author states:

The Boolean operations can be made much more efficient if it
is known with certainty that the operands have no other values than 0 and 1. The reason
why the compiler doesn't make such an assumption is that the variables might have other
values if they are uninitialized or come from unknown sources.

Does this mean that if I take a function pointer of type bool(*)(), for example, and call it, then operations on the result produce inefficient code? Or is that the case when I access a boolean by dereferencing a pointer or reading from a reference, and then operate on it?

Hi Johannes - long time no see! But I think your first quote is wrong; it's something like "Boolean variables are overdetermined in the sense that ..."
– Neil Butterworth Nov 11 '17 at 23:43

IIRC, gcc and clang will sometimes emit code that depends on a bool being 0 or 1, not just any non-zero value. (even if that bool was generated by external code they can't analyze, like a function arg.) I'll see if I can cook up an example later.
– Peter Cordes Nov 11 '17 at 23:51

Isn't the code the way it is because of short-circuit evaluation?
– alain Nov 12 '17 at 0:03

@alain Reading from the variable on the right-hand side has no side effects, so short-circuit evaluation isn't an issue here, I think.
– Johannes Schaub - litb Nov 12 '17 at 0:05

3 Answers

TL:DR: current compilers still have bool missed-optimizations when doing stuff like (a&&b) ? x : y. But the reason is not that they don't assume 0/1; they just suck at this.

Many uses of bool are for locals, or in inline functions, so booleanizing to a 0/1 can optimize away, letting the compiler branch (or cmov or whatever) on the original condition. Only worry about optimizing bool inputs / outputs when a bool does have to get passed/returned across something that doesn't inline, or actually stored in memory.
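For example (a hypothetical illustration of that point, not code from the original post): a bool that never escapes the function typically never materializes as a 0/1 byte:

    // Hypothetical example: 'gt' never needs to exist as a 0/1 byte.
    // gcc/clang at -O2 typically compile this to cmp + cmov, working
    // directly on the flags set by the comparison.
    int max_of(int x, int y) {
        bool gt = x > y;    // booleanization optimizes away
        return gt ? x : y;
    }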

Possible optimization guideline: combine bools from external sources (function args / memory) with bitwise operators, like a&b. MSVC and ICC do better with this. IDK if it's ever worse for local bools. Beware that a&b is only equivalent to a&&b for bool, not for integer types: 2 && 1 is true, but 2 & 1 is 0, which is false (see the sketch below). Bitwise OR doesn't have this problem.
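A quick sketch of that pitfall (my example, with made-up names):

    #include <cstdio>

    int main() {
        int a = 2, b = 1;
        // For bool operands, & and && agree; for integer operands they don't:
        std::printf("%d\n", a && b);  // prints 1: both operands are nonzero
        std::printf("%d\n", a & b);   // prints 0: 0b10 & 0b01 == 0
    }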

IDK if this guideline will ever hurt for locals that were set from a comparison within the function (or in something that inlined). E.g. it might lead the compiler to actually make integer booleans instead of just using comparison results directly when possible. Also note that it doesn't seem to help with current gcc and clang.

Yes, C++ implementations on x86 store bool in a byte that's always 0 or 1 (at least across function-call boundaries where the compiler has to respect the ABI / calling convention which requires this.)

Compilers do sometimes take advantage of this, e.g. for bool->int conversion even gcc 4.4 simply zero-extends to 32-bit (movzx eax, dil). Clang and MSVC do this, too. C and C++ rules require this conversion to produce 0 or 1, so this behaviour is only safe if it's always safe to assume that a bool function arg or global variable has a 0 or 1 value.
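As a sketch of the kind of code being discussed (my example; the asm comment shows what compilers targeting the x86-64 System V calling convention typically emit):

    // bool -> int conversion: the compiler trusts the 0/1 ABI guarantee.
    int boolToInt(bool b) {
        return b;   // typically just:  movzx eax, dil
                    // no test/setne re-booleanization is emitted
    }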

Even old compilers typically did take advantage of it for bool->int, but not in other cases. Thus, Agner is wrong about the reason when he says:

The reason why the compiler doesn't make such an assumption is that the variables might have other values if they are uninitialized or come from unknown sources.

MSVC CL19 does make code that assumes bool function args are 0 or 1, so the Windows x86-64 ABI must guarantee this.

In the x86-64 System V ABI (used by everything other than Windows), the changelog for revision 0.98 says "Specify that _Bool (aka bool) is booleanized at the caller." I think even before that change, compilers were assuming it, but this just documents what compilers were already relying on. The current language in the x86-64 SysV ABI is:

3.1.2 Data Representation

Booleans, when stored in a memory object, are stored as single byte objects the value of which is always 0 (false) or 1 (true). When stored in integer registers (except for passing as arguments), all 8 bytes of the register are significant; any nonzero value is considered true.

The second sentence is nonsense: the ABI has no business telling compilers how to store things in registers inside a function, only at boundaries between different compilation units (memory / function args and return values). I reported this ABI defect a while ago on the github page where it's maintained.

3.2.3 Parameter passing:

When a value of type _Bool is returned or passed in a register or on the stack, bit 0 contains the truth value and bits 1 to 7 shall be zero.¹⁶

(footnote 16): Other bits are left unspecified, hence the consumer side of those values can rely on it being 0 or 1 when truncated to 8 bit.

The language in the i386 System V ABI is the same, IIRC.

Any compiler that assumes 0/1 for one thing (e.g. conversion to int) but fails to take advantage of it in other cases has a missed optimization. Unfortunately such missed-optimizations still exist, although they are rarer than when Agner wrote that paragraph about compilers always re-booleanizing.

(Clang's or dil, sil / mov eax, edi is silly: it's guaranteed to cause a partial-register stall on Nehalem or earlier Intel when reading edi after writing dil, and it has worse code size from needing a REX prefix to use the low-8 part of edi. A better choice might be or dil,sil / movzx eax, dil if you want to avoid reading any 32-bit registers in case your caller left some arg-passing registers with "dirty" partial registers.)

ICC emits the same code even for bool bitwise_or(bool a, bool b) { return a|b; }. It promotes to int (with movzx), and uses or to set flags according to the bitwise OR. This is dumb compared to or dil,sil / setne al.

For bitwise_or, MSVC does just use an or instruction (after a movzx on each input), but at least it doesn't re-booleanize.
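For reference, here is that function spelled out from the inline mention above:

    // The answer above suggests  or dil, sil / setne al  as better
    // x86-64 SysV output; ICC instead movzx-promotes each input and
    // ORs the 32-bit copies.
    bool bitwise_or(bool a, bool b) {
        return a | b;
    }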

Missed optimizations in current gcc/clang:

Only ICC/MSVC were making dumb code with the simple function above, but a function like the following still gives gcc and clang trouble (the original code block didn't survive the copy; this is reconstructed from the (a&&b) ? x : y expression in the TL:DR):
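    int select(bool a, bool b, int x, int y) {
        return (a && b) ? x : y;
    }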

Looks simple enough; you'd hope that a smart compiler would do it branchlessly with one test/cmov. x86's test instruction sets flags according to a bitwise AND. It's an AND instruction that doesn't actually write the destination. (Just like cmp is a sub that doesn't write the destination).

But even the daily builds of gcc and clang on the Godbolt compiler explorer make much more complicated code, checking each boolean separately. They know how to optimize bool ab = a&&b; if you return ab, but even writing it that way (with a separate boolean variable to hold the result, as shown below) doesn't manage to hand-hold them into making code that doesn't suck.
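For concreteness, that variant spelled out (my rendering of the description above, not the answer's original code):

    int select2(bool a, bool b, int x, int y) {
        bool ab = a && b;   // separate variable for the combined condition
        return ab ? x : y;  // still compiled with two separate tests
    }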

@Mgetz: static performance analysis is pretty trivial here because the compilers are making code that does the same work + more (so worse throughput, latency, and uop count), except for branching vs. branchless. Two separate test/cmov are always worse than one test/cmov, the way compilers are using them. See agner.org/optimize to learn more about the pipelines of modern x86 CPUs. See also some of my answers, like stackoverflow.com/questions/45113527/….
– Peter Cordes Nov 13 '17 at 16:24

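(The code this answer was describing didn't survive the copy; what follows is a reconstruction of the kind of char -> bool conversion it refers to, with typical gcc/clang x86-64 output in the comment.)

    bool charToBool(char c) {
        return c;   // test  dil, dil
                    // setne al
                    // ret
    }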
Here, the char argument is checked for zero vs. nonzero, and the bool value is set to 0 or 1 accordingly.

So I think it is safe to say that the compiler handles bool so that it always contains 0 or 1. It never checks its validity.

About efficiency: I think bool handling is optimal. The only case I can imagine where this approach is not optimal is char->bool conversion. That operation could be a simple mov if bool values weren't restricted to 0/1. For all other operations, the current approach is equally good or better.

EDIT: Peter Cordes mentioned ABI. Here's the relevant text from the System V ABI for AMD64 (the text for i386 is similar):

Booleans, when stored in a memory object, are stored as single byte objects the value of which is always 0 (false) or 1 (true). When stored in integer registers (except for passing as arguments), all 8 bytes of the register are significant; any nonzero value is considered true.

So for platforms which follow SysV ABI, we can be sure that a bool has a 0/1 value.

I searched for an ABI document for MSVC, but unfortunately I didn't find anything about bool.

This answer and geza's answer are both misleading because they fail to look at the big picture. When you declare a function as bool foo(bool a, bool b), the implementation of the function will assume that a and b are valid bools, i.e. can only have the value 0 or 1. It's the caller's responsibility to convert other types to bool. So if you look at the code that the compiler generates to call the function, you will see the testl+setne code (assuming that the caller is starting with some non-bool type like int).
– user3386109 Nov 12 '17 at 0:59

@user3386109: I don't consider it misleading. The code shows that a function dealing with bools doesn't expend effort worrying about such conversions, and by extension, if the caller has accepted bool arguments and passed them through, retrieved/passed bool return values from called functions, etc., it won't be constantly and redundantly expending effort on paranoid conversions that are safe for non-0/1 values, which is IMHO the worrying implication of Agner's statements. Of course some conversions must happen when they're actually functionally required.
– Tony Delroy Nov 12 '17 at 1:32