I can easily imagine this wasn't for the question itself but for spawning a very useful reference guide. Considering I got here when I needed a bit of information, anyway.
– Joey van HummelAug 11 '13 at 21:21

@glglgl /Jonathon it has enough significance for C++ to be tagged as such. It's a historic question with lots of traffic and the C++ tag will help fellow interested programmers find it through a google search.
– Luchian GrigoreNov 4 '13 at 18:57

I'm assuming you know how to turn on/off individual bits in a byte? (E.g., using hex?)
– RastaJediApr 19 '16 at 19:30

27 Answers
Setting a bit

Use the bitwise OR operator (|) to set a bit.

number |= 1UL << n;

That will set the nth bit of number. n should be zero if you want to set the 1st bit, and so on up to n-1 if you want to set the nth bit.

Use 1ULL if number is wider than unsigned long; promotion of 1UL << n doesn't happen until after evaluating 1UL << n, where it's undefined behaviour to shift by the width of a long or more. The same applies to all the rest of the examples.
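As a minimal sketch of the width point, assuming number is a 64-bit value: the helper name set_bit64 is made up for illustration. Building the constant as a 64-bit value keeps the shift defined for n from 0 to 63.

```c
#include <stdint.h>

/* Hypothetical helper: return number with bit n (0-based) set.
   UINT64_C(1) makes the shifted constant 64 bits wide, so n may be 0..63. */
static inline uint64_t set_bit64(uint64_t number, unsigned n)
{
    return number | (UINT64_C(1) << n);
}
```

With 1UL on a platform where long is 32 bits, the same call with n = 40 would be undefined behaviour.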

Clearing a bit

Use the bitwise AND operator (&) to clear a bit.

number &= ~(1UL << n);

That will clear the nth bit of number. You must invert the bit string with the bitwise NOT operator (~), then AND it.

Toggling a bit

The XOR operator (^) can be used to toggle a bit.

number ^= 1UL << n;

That will toggle the nth bit of number.

Checking a bit

You didn't ask for this, but I might as well add it.

To check a bit, shift number right by n, then bitwise AND it:

bit = (number >> n) & 1U;

That will put the value of the nth bit of number into the variable bit.
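The clear, toggle and check expressions above can be wrapped in small functions. This is only a sketch: the names bit_clear, bit_toggle and bit_check are invented here, and n is assumed to be smaller than the width of unsigned long.

```c
/* Hypothetical helpers mirroring the expressions above.
   n is 0-based and must be less than the width of unsigned long. */
static inline unsigned long bit_clear(unsigned long number, unsigned n)
{
    return number & ~(1UL << n);      /* AND with a mask that has only bit n cleared */
}

static inline unsigned long bit_toggle(unsigned long number, unsigned n)
{
    return number ^ (1UL << n);       /* XOR flips bit n, leaves the rest alone */
}

static inline unsigned bit_check(unsigned long number, unsigned n)
{
    return (unsigned)((number >> n) & 1UL);   /* 0 or 1 */
}
```

Returning the new value instead of modifying an argument keeps the helpers usable on any unsigned variable that fits in unsigned long.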

Changing the nth bit to x

Setting the nth bit to either 1 or 0 can be achieved with the following on a 2's complement C++ implementation:

number ^= (-x ^ number) & (1UL << n);

Bit n will be set if x is 1, and cleared if x is 0. If x has some other value, you get garbage. x = !!x will booleanize it to 0 or 1.

To make this independent of 2's complement negation behaviour (where -1 has all bits set, unlike on a 1's complement or sign/magnitude C++ implementation), use unsigned negation.
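A sketch of what unsigned negation could look like, assuming number fits in unsigned long. The helper name assign_bit is hypothetical. Subtracting from 0UL in unsigned arithmetic yields either 0 or all ones regardless of how the implementation represents negative integers.

```c
/* Hypothetical helper: return number with bit n set to x.
   Any nonzero x is treated as 1. n must be less than the width of unsigned long. */
static inline unsigned long assign_bit(unsigned long number, unsigned n, int x)
{
    unsigned long bit = (unsigned long)(x != 0);   /* booleanize to 0 or 1 */
    unsigned long all = 0UL - bit;                 /* 0 or all ones, well defined for unsigned */
    return number ^ ((all ^ number) & (1UL << n));
}
```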

I would like to note that on platforms that have native support for bit set/clear (e.g., AVR microcontrollers), compilers will often translate 'myByte |= (1 << x)' into the native bit set/clear instructions whenever x is a constant, e.g. (1 << 5), or const unsigned x = 5.
– AaronSep 17 '08 at 17:13

1 is an int literal, which is signed. So all the operations here operate on signed numbers, which is not well defined by the standards. The standards do not guarantee two's complement or arithmetic shift, so it is better to use 1U.
– Siyuan RenDec 10 '13 at 8:53

+1. Not that std::bitset is usable from "C", but as the author tagged his/her question with "C++", AFAIK, your answer is the best around here... std::vector<bool> is another way, if one knows its pros and its cons
– paercebalSep 19 '08 at 18:16

Maybe nobody mentioned it because this was tagged embedded. In most embedded systems you avoid STL like the plague. And boost support is likely a very rare bird to spot among most embedded compilers.
– LundinAug 18 '11 at 19:47

@Martin It is very true. Besides specific performance killers like STL and templates, many embedded systems even avoid the standard libraries entirely, because they are such a pain to verify. Most of the embedded branch is embracing standards like MISRA, which require static code analysis tools (any software professional should be using such tools btw, not just embedded folks). Generally people have better things to do than run static analysis through the whole standard library - if its source code is even available to them on the specific compiler.
– LundinAug 19 '11 at 6:26

@Lundin: Your statements are excessively broad (thus useless to argue about). I am sure that I can find situations where they are true. This does not change my initial point. Both of these classes are perfectly fine for use in embedded systems (and I know for a fact that they are used). Your initial point about STL/Boost not being used on embedded systems is also wrong. I am sure there are systems that don't use them, and even in the systems that do use them they are used judiciously, but saying they are not used is just not correct (because there are systems where they are used).
– Martin YorkAug 19 '11 at 6:41

I've always found using bitfields is a bad idea. You have no control over the order in which bits are allocated (from the top or the bottom), which makes it impossible to serialize the value in a stable/portable way except bit-at-a-time. It's also impossible to mix DIY bit arithmetic with bitfields, for example making a mask that tests for several bits at once. You can of course use && and hope the compiler will optimize it correctly...
– R..Jun 28 '10 at 6:17

Bit fields are bad in so many ways, I could almost write a book about it. In fact I almost had to do that for a bit field program that needed MISRA-C compliance. MISRA-C enforces all implementation-defined behavior to be documented, so I ended up writing quite an essay about everything that can go wrong in bit fields. Bit order, endianness, padding bits, padding bytes, various other alignment issues, implicit and explicit type conversions to and from a bit field, UB if int isn't used and so on. Instead, use bitwise operators for fewer bugs and portable code. Bit fields are completely redundant.
– LundinAug 18 '11 at 19:19

Like most language features, bit fields can be used correctly or they can be abused. If you need to pack several small values into a single int, bit fields can be very useful. On the other hand, if you start making assumptions about how the bit fields map to the actual containing int, you're just asking for trouble.
– FerruccioAug 18 '11 at 19:35

@endolith: That would not be a good idea. You could make it work, but it wouldn't necessarily be portable to a different processor, or to a different compiler or even to the next release of the same compiler.
– FerruccioMar 8 '12 at 21:02

@Yasky and Ferruccio getting different answers to a sizeof() for this approach should illustrate the problems with compatibility not just across compilers but across hardware. We sometimes fool ourselves that we've solved these issues with languages or defined runtimes but it really comes down to 'will it work on my machine?'. You embedded guys have my respect (and sympathies).
– Kelly S. FrenchDec 8 '16 at 16:11

Alternately you could make a clearbits() function instead of &= ~. Why are you using an enum for this? I thought those were for creating a bunch of unique variables with hidden arbitrary value, but you're assigning a definite value to each one. So what's the benefit vs just defining them as variables?
– endolithDec 20 '11 at 15:09

@endolith: The use of enums for sets of related constants goes back a long way in C programming. I suspect that with modern compilers the only advantage over const short or whatever is that they are explicitly grouped together. And when you want them for something other than bitmasks you get the automatic numbering. In C++ of course, they also form distinct types, which gives you a little extra static error checking.
– dmckeeDec 22 '11 at 1:15

You'll get into undefined enum constants if you don't define a constant for each of the possible values of the bits. What's the enum ThingFlags value for ThingError|ThingFlag1, for example?
– Luis ColoradoSep 30 '14 at 10:55

If you use this method please keep in mind that enum constants are always of signed type int. This can cause all manner of subtle bugs because of implicit integer promotion or bitwise operations on signed types. thingstate = ThingFlag1 >> 1 will for example invoke implementation-defined behavior. thingstate = (ThingFlag1 >> x) << y can invoke undefined behavior. And so on. To be safe, always cast to an unsigned type.
– LundinDec 14 '15 at 9:25
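A minimal sketch of the "cast to an unsigned type" advice. The flag names come from the comments above, but their values, the extra ThingFlag2 constant and the thing_* helper names are assumptions made for illustration only.

```c
/* Flag names follow the comments above; the values are assumed. */
enum ThingFlags {
    ThingFlag1 = 0x01,
    ThingFlag2 = 0x02,
    ThingError = 0x08
};

/* Keep the state in an unsigned variable and convert each enum constant
   to unsigned before it takes part in bitwise arithmetic. */
static inline unsigned thing_set(unsigned state, enum ThingFlags f)
{
    return state | (unsigned)f;
}

static inline unsigned thing_clear(unsigned state, enum ThingFlags f)
{
    return state & ~(unsigned)f;
}

static inline int thing_test(unsigned state, enum ThingFlags f)
{
    return (state & (unsigned)f) != 0;
}
```

Because the combined state lives in an unsigned variable rather than in an enum ThingFlags object, a combination such as ThingError|ThingFlag1 never has to correspond to a named enumerator.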

@Lundin: As of C++11, you can set the underlying type of an enumeration, e.g.: enum My16Bits: unsigned short { ... };
– Aiken DrumMar 15 '16 at 15:01

The common expression that you seem to be having problems with in all of these is "(1L << (posn))". All this does is create a mask with a single bit on
and which will work with any integer type. The "posn" argument specifies the
position where you want the bit. If posn==0, then this expression will
evaluate to:

0000 0000 0000 0000 0000 0000 0000 0001 binary.

If posn==8, it will evaluate to:

0000 0000 0000 0000 0000 0001 0000 0000 binary.

In other words, it simply creates a field of 0's with a 1 at the specified
position. The only tricky part is in the BitClr() macro where we need to set
a single 0 bit in a field of 1's. This is accomplished by using the 1's
complement of the same expression as denoted by the tilde (~) operator.

Once the mask is created it's applied to the argument just as you suggest,
by use of the bitwise and (&), or (|), and xor (^) operators. Since the mask
is of type long, the macros will work just as well on char's, short's, int's,
or long's.
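The macros themselves are not shown here, so the following is only a sketch of what macros of this kind might look like. BitClr is the one name mentioned in the text; BitSet, BitFlp and BitTst are assumed names. This sketch also uses 1UL rather than the 1L described, so that the mask arithmetic stays unsigned.

```c
/* Hypothetical value-returning macros: each builds a single-bit mask
   and combines it with arg. None of them modifies arg; the caller
   assigns the result if the change should be stored. */
#define BitSet(arg, posn)  ((arg) | (1UL << (posn)))     /* force bit on  */
#define BitClr(arg, posn)  ((arg) & ~(1UL << (posn)))    /* force bit off */
#define BitFlp(arg, posn)  ((arg) ^ (1UL << (posn)))     /* invert bit    */
#define BitTst(arg, posn)  (((arg) & (1UL << (posn))) != 0)
```

A typical use would be v = BitSet(v, 8); as with any function-like macro, arguments with side effects should be avoided.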

The bottom line is that this is a general solution to an entire class of
problems. It is, of course, possible and even appropriate to rewrite the
equivalent of any of these macros with explicit mask values every time you
need one, but why do it? Remember, the macro substitution occurs in the
preprocessor and so the generated code will reflect the fact that the values
are considered constant by the compiler - i.e. it's just as efficient to use
the generalized macros as to "reinvent the wheel" every time you need to do
bit manipulation.

Unconvinced? Here's some test code - I used Watcom C with full optimization
and without using _cdecl so the resulting disassembly would be as clean as
possible:

2 things about this: (1) in perusing your macros, some may incorrectly believe that the macros actually set/clear/flip bits in the arg, however there is no assignment; (2) your test.c is not complete; I suspect if you ran more cases you'd find a problem (reader exercise)
– DanOct 18 '08 at 1:51

-1 This is just weird obfuscation. Never re-invent the C language by hiding away language syntax behind macros, it is very bad practice. Then some oddities: first, 1L is signed, meaning all bit operations will be performed on a signed type. Everything passed to these macros will return as signed long. Not good. Second, this will work very inefficiently on smaller CPUs as it enforces long when the operations could have been on int level. Third, function-like macros are the root of all evil: you have no type safety whatsoever. Also, the previous comment about no assignment is very valid.
– LundinAug 18 '11 at 19:14

This will fail if arg is long long. 1L needs to be the widest possible type, so (uintmax_t)1 . (You might get away with 1ull)
– M.MFeb 6 '15 at 23:51

Did you optimize for code-size? On Intel mainstream CPUs you'll get partial-register stalls when reading AX or EAX after this function returns, because it writes the 8-bit components of EAX. (It's fine on AMD CPUs, or others that don't rename partial registers separately from the full register. Haswell/Skylake don't rename AL separately, but they do rename AH.).
– Peter CordesNov 10 '17 at 21:38

It's good to read but one should be aware of possible side effects. Using BITOP(array, bit++, |=); in a loop will most likely not do what the caller wants.
– foraidtJul 13 '10 at 8:27

Indeed. =) One variant you might prefer is to separate it into 2 macros, 1 for addressing the array element and the other for shifting the bit into place, ala BITCELL(a,b) |= BITMASK(a,b); (both take a as an argument to determine the size, but the latter would never evaluate a since it appears only in sizeof).
– R..Jul 13 '10 at 9:19

@R.. This answer is really old, but I'd probably prefer a function to a macro in this case.
– PC LudditeOct 23 '15 at 17:08
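A function-based version could look like the sketch below. It assumes the bit array is stored as unsigned char, and the bitarr_* names are invented for this example. Since each argument is evaluated exactly once, a call such as bitarr_set(a, bit++) in a loop behaves as the caller expects.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helpers for a bit array stored in unsigned char cells.
   The caller must ensure that b is within the array. */
static inline void bitarr_set(unsigned char *a, size_t b)
{
    a[b / CHAR_BIT] |= (unsigned char)(1u << (b % CHAR_BIT));
}

static inline void bitarr_clear(unsigned char *a, size_t b)
{
    a[b / CHAR_BIT] &= (unsigned char)~(1u << (b % CHAR_BIT));
}

static inline int bitarr_test(const unsigned char *a, size_t b)
{
    return (a[b / CHAR_BIT] >> (b % CHAR_BIT)) & 1u;
}
```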

Minor: the 3rd (size_t) cast seems to be there only to ensure some unsigned math with %. Could be (unsigned) there.
– chuxSep 27 '17 at 17:58

The (size_t)(b)/(8*sizeof *(a)) could unnecessarily narrow b before the division. Only an issue with very large bit arrays. Still an interesting macro.
– chuxSep 27 '17 at 18:00

Pretty much everything about bit-fields is implementation-defined. Even if you manage to find out all details regarding how your particular compiler implements them, using them in your code will most certainly make it non-portable.
– LundinAug 18 '11 at 19:50

@Lundin - True, but embedded system bit-fiddling (particularly in hardware registers, which is what my answer relates to) is never going to be usefully portable anyway.
– RoddyAug 19 '11 at 20:13

Not between entirely different CPUs perhaps. But you most likely want it to be portable between compilers and between different projects. And there is a lot of embedded "bit-fiddling" that isn't related to the hardware at all, such as data protocol encoding/decoding.
– LundinAug 20 '11 at 9:35

...and if you get in the habit of using bit fields doing embedded programming, you'll find your X86 code runs faster, and leaner too. Not in simple benchmarks where you have the whole machine to crush the benchmark, but in real-world multi-tasking environments where programs compete for resources. Advantage CISC - whose original design goal was to make up for CPUs faster than buses and slow memory.
– user1899861Feb 15 '13 at 22:26

As this is tagged "embedded" I'll assume you're using a microcontroller. All of the above suggestions are valid & work (read-modify-write, unions, structs, etc.).

However, during a bout of oscilloscope-based debugging I was amazed to find that these methods have a considerable overhead in CPU cycles compared to writing a value directly to the micro's PORTnSET / PORTnCLEAR registers which makes a real difference where there are tight loops / high-frequency ISR's toggling pins.

For those unfamiliar: In my example, the micro has a general pin-state register PORTn which reflects the output pins, so doing PORTn |= BIT_TO_SET results in a read-modify-write to that register. However, the PORTnSET / PORTnCLEAR registers take a '1' to mean "please make this bit 1" (SET) or "please make this bit zero" (CLEAR) and a '0' to mean "leave the pin alone". so, you end up with two port addresses depending whether you're setting or clearing the bit (not always convenient) but a much faster reaction and smaller assembled code.
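The difference between the two access styles can be sketched as follows. The struct layout, register widths and function names are placeholders, not the definitions from any real device header; on real hardware the registers would sit at fixed addresses given in the vendor's documentation.

```c
#include <stdint.h>

/* Hypothetical register block; names and widths are placeholders. */
typedef struct {
    volatile uint8_t PORT;      /* current output pin state            */
    volatile uint8_t PORTSET;   /* write 1 bits to drive pins high     */
    volatile uint8_t PORTCLR;   /* write 1 bits to drive pins low      */
} gpio_regs;

/* Read-modify-write: the CPU loads PORT, ORs in the mask, stores it back. */
static inline void pin_high_rmw(gpio_regs *g, uint8_t mask)
{
    g->PORT |= mask;
}

/* Single store: 1 bits act on their pins, 0 bits leave pins alone. */
static inline void pin_high_set(gpio_regs *g, uint8_t mask)
{
    g->PORTSET = mask;
}

static inline void pin_low_clr(gpio_regs *g, uint8_t mask)
{
    g->PORTCLR = mask;
}
```

When this struct lives in ordinary RAM the stores are just stores; only on a device whose peripheral implements set/clear registers does writing to PORTSET or PORTCLR change the pins.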

Micro was Coldfire MCF52259, using C in Codewarrior. Looking at the disassembler / asm is a useful exercise as it shows all the steps the CPU has to go through to do even the most basic operation. We also spotted other CPU-hogging instructions in time-critical loops - constraining a variable by doing var %= max_val costs a pile of CPU cycles every time round, while doing if(var > max_val)var-=max_val uses only a couple of instructions. A good guide to a few more tricks is here: codeproject.com/Articles/6154/…
– John UJun 19 '12 at 17:33
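The two ways of constraining a variable are not interchangeable in general, which the following sketch tries to make explicit. The function names are hypothetical. The subtract form matches the modulo form only while var stays below 2 * max_val, for example a counter that advances by less than max_val per iteration, and it needs >= rather than > to wrap at exactly max_val.

```c
/* Hypothetical helpers comparing the two ways of wrapping a counter. */
static inline unsigned wrap_mod(unsigned var, unsigned max_val)
{
    return var % max_val;          /* general, but may cost a division */
}

static inline unsigned wrap_sub(unsigned var, unsigned max_val)
{
    if (var >= max_val)            /* compare and subtract, no division */
        var -= max_val;
    return var;                    /* equals var % max_val only if var < 2 * max_val */
}
```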

Even more importantly, the helper memory-mapped I/O registers provide a mechanism for atomic updates. Read/modify/write can go very badly if the sequence is interrupted.
– Ben VoigtFeb 22 '15 at 2:16

Keep in mind that all port registers will be defined as volatile and therefore the compiler is unable to perform any optimizations on code involving such registers. Therefore, it is good practice to disassemble such code and see how it turned out on assembler level.
– LundinDec 14 '15 at 9:42

Notes:
This is designed to be fast (given its flexibility) and non-branchy. It results in efficient SPARC machine code when compiled with Sun Studio 8; I've also tested it using MSVC++ 2008 on amd64. It's possible to make similar macros for setting and clearing bits. The key difference of this solution compared with many others here is that it works for any location in pretty much any type of variable.

If you're doing a lot of bit twiddling you might want to use masks which will make the whole thing quicker. The following functions are very fast and are still flexible (they allow bit twiddling in bit maps of any size).

It's up to you to ensure that the bit number is within the range of the bit map that you pass. Note that on little-endian processors, bytes, words, dwords, qwords, etc. map correctly onto each other in memory (the main reason that little-endian processors are 'better' than big-endian processors, ah, I feel a flame war coming on...).

Don't use a table for a function that can be implemented with a single operator. TQuickByteMask[n] is equivalent to (1<<n). Also, making your arguments short is a very bad idea. The / and % will actually be a division, not bitshift/bitwise and, because signed division by a power of 2 cannot be implemented bitwise. You should make the argument type unsigned int!
– R..Jun 28 '10 at 6:24

What's the point with this? It only makes the code slower and harder to read? I can't see a single advantage with it. 1u << n is easier to read for C programmers, and can hopefully be translated into a single clock tick CPU instruction. Your division on the other hand, will be translated to something around 10 ticks, or even as bad as up to 100 ticks, depending on how poorly the specific architecture handles division. As for the bitmap feature, it would make more sense to have a lookup table translating each bit index to a byte index, to optimize for speed.
– LundinAug 18 '11 at 19:32

As for big/little endian, big endian will map integers and raw data (for example strings) in the same way: left-to-right msb to lsb throughout the whole bitmap. While little endian will map integers left to right as 7-0, 15-8, 23-16, 31-24, but raw data is still left-to-right msb to lsb. So how little endian is better for your particular algorithm is completely beyond me, it seems to be the opposite.
– LundinAug 18 '11 at 19:42

@R.. A table can be useful if your platform can't shift efficiently, like old Microchip MCUs, but of course then the division in the sample is absolutely inefficient
– jebNov 18 '11 at 11:28

set_bit Atomically set a bit in memory
clear_bit Clears a bit in memory
change_bit Toggle a bit in memory
test_and_set_bit Set a bit and return its old value
test_and_clear_bit Clear a bit and return its old value
test_and_change_bit Change a bit and return its old value
test_bit Determine whether a bit is set

Note: Here the whole operation happens in a single step. So these all are guaranteed to be atomic even on SMP computers and are useful
to keep coherence across processors.
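The functions listed above are kernel interfaces. As a rough userspace analogue, and not the kernel's implementation, C11 <stdatomic.h> offers atomic fetch-and-modify operations that can express the same idea. The my_* names below are invented for this sketch.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical analogue of test_and_set_bit: atomically set bit nr in an
   array of words and report whether it was already set. */
static inline bool my_test_and_set_bit(unsigned nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << (nr % WORD_BITS);
    unsigned long old  = atomic_fetch_or(&addr[nr / WORD_BITS], mask);
    return (old & mask) != 0;
}

/* Hypothetical analogue of test_and_clear_bit. */
static inline bool my_test_and_clear_bit(unsigned nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << (nr % WORD_BITS);
    unsigned long old  = atomic_fetch_and(&addr[nr / WORD_BITS], ~mask);
    return (old & mask) != 0;
}

/* Hypothetical analogue of test_bit. */
static inline bool my_test_bit(unsigned nr, atomic_ulong *addr)
{
    return ((atomic_load(&addr[nr / WORD_BITS]) >> (nr % WORD_BITS)) & 1UL) != 0;
}
```

Each update is one atomic read-modify-write on the containing word, so two threads setting different bits of the same word cannot lose each other's update.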

Note there is nothing "special" about this code. It treats a bit like an integer - which technically, it is. A 1 bit integer that can hold 2 values, and 2 values only.

I once used this approach to find duplicate loan records, where loan_number was the ISAM key, using the 6-digit loan number as an index into the bit array. Savagely fast, and after 8 months, proved that the mainframe system we were getting the data from was in fact malfunctioning. The simplicity of bit arrays makes confidence in their correctness very high - vs a searching approach for example.
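The duplicate check described here can be sketched with a packed bit array indexed by the loan number. This is an illustration under assumptions, not the code the answer refers to: the names MAX_LOAN, seen and seen_before are made up, and loan numbers are assumed to lie in 0..999999.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_LOAN 1000000u   /* 6-digit loan numbers: 0..999999 (assumed) */

/* One bit per possible loan number, all initially zero. */
static unsigned char seen[(MAX_LOAN + CHAR_BIT - 1) / CHAR_BIT];

/* Hypothetical helper: mark loan_number as seen and return true if it
   had already been marked, i.e. it is a duplicate.
   The caller must ensure loan_number < MAX_LOAN. */
static bool seen_before(uint32_t loan_number)
{
    unsigned char mask  = (unsigned char)(1u << (loan_number % CHAR_BIT));
    unsigned char *cell = &seen[loan_number / CHAR_BIT];
    bool duplicate = (*cell & mask) != 0;
    *cell |= mask;
    return duplicate;
}
```

The whole table for a million keys takes about 125 KB, and each lookup is one index, one mask and one test, with no searching involved.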

std::bitset is indeed implemented as bits by most compilers
– galinetteNov 17 '14 at 20:03

@galinette, Agreed. The header file #include <bitset> is a good resource in this regard. Also, the special class vector<bool> for when you need the size of the vector to change. The C++ STL, 2nd Edition, Nicolai M. Josuttis covers them exhaustively on pgs 650 and 281 respectively. C++11 adds a few new capabilities to std::bitset, of special interest to me is a hash function in unordered containers. Thanks for the heads up! I'm going to delete my brain-cramp comment. Already enough garbage out on the web. I don't want to add to it.
– user1899861Nov 17 '14 at 21:08

This uses at least a whole byte of storage for each bool. Maybe even 4 bytes for C89 setups that use int to implement bool
– M.MFeb 6 '15 at 23:55

@MattMcNabb, you are correct. In C++ the size of the int type necessary to implement a boolean is not specified by the standard. I realized this answer was in error some time ago, but decided to leave it here as people are apparently finding it useful. For those wanting to use bits galinette's comment is most helpful as is my bit library here ... stackoverflow.com/a/16534995/1899861
– user1899861Feb 12 '15 at 7:23

@RocketRoy: Probably worth changing the sentence that claims this is an example of "bit operations", then.
– Ben VoigtFeb 22 '15 at 2:20

To address a common coding pitfall when attempting to form the mask:

1 is not always wide enough

What problems happen when number is a wider type than 1?

x may be too great for the shift 1 << x, leading to undefined behavior (UB). Even if x is not too great, ~ may not flip enough most-significant bits.
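The second problem, ~ not flipping enough high bits, can be shown with a small sketch. It assumes a 64-bit number and a 32-bit unsigned int; both function names are hypothetical.

```c
#include <stdint.h>

/* Mask built in the type of number: the shift and the ~ both cover
   all 64 bits, so only bit x is cleared. Valid for x in 0..63. */
static inline uint64_t clear_bit_wide(uint64_t number, unsigned x)
{
    return number & ~((uint64_t)1 << x);
}

/* For contrast: the mask is formed as an unsigned int. After ~ and
   conversion to uint64_t its upper bits are zero, so the AND also
   wipes every bit of number above the width of unsigned int.
   Valid only for x below the width of unsigned int. */
static inline uint64_t clear_bit_narrow(uint64_t number, unsigned x)
{
    return number & ~(1u << x);
}
```

With all 64 bits set and x = 3, the wide version clears one bit while the narrow version leaves only the low 32 bits (minus bit 3) standing.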

Interesting look at an old question! Neither number |= (type_of_number)1 << x; nor number |= (number*0 + 1) << x; is appropriate to set the sign bit of a signed type... As a matter of fact, neither is number |= (1ull << x);. Is there a portable way to do it by position?
– chqrlieSep 27 '17 at 22:27

@chqrlie IMO, the best way to avoid setting the sign bit and risking UB or IDB with shifts is to use unsigned types. Highly portable shift signed code is too convoluted to be acceptable.
– chuxSep 27 '17 at 22:33

This code is broken. (Also, why do you have ; after your function definitions?)
– melpomeneFeb 10 '18 at 20:11

@melpomene The code is not broken, I did test it. Do you mean that it will not compile or that the result is wrong? About the extra ';' I don't remember, those can be removed indeed.
– Joakim L. ChristiansenFeb 25 '18 at 15:51
