I cannot help but feel stupid when I look at this. Did I miss a class or two in college, or is this not something I am supposed to just get? I can do simple bit-wise operations (like ANDing, ORing, XORing, shifting), but come on, how does someone come up with code like that above?

How good does a well-rounded programmer need to be with bit-wise operations?

On a side note, what worries me is that the person who answered my question on StackOverflow answered it in a matter of minutes. If he could do that, why did I just stare like a deer in the headlights?

What type of development work do you do (or want to do, if you aren't doing it right now)? I don't see this being useful in web development, but I have seen a lot of bitwise operations in embedded systems work.
–
Thomas Owens♦Oct 28 '11 at 10:21

If I'm hiring someone to do user interface development or web development, bit manipulation isn't something I'd ask about because, chances are, they will never see it. However, I would expect someone working with network protocols, embedded systems, and device driver work to be familiar with it.
–
Thomas Owens♦Oct 28 '11 at 11:14

13 Answers

I would say that as a well-rounded developer, you need to understand the operators and bitwise operations.

So, at a minimum, you should be able to figure out the code above after a bit of thinking.

Bitwise operations tend to be rather low level, so if you work on websites and LOB software, you are unlikely to use them much.

Like other things, if you don't use them much, you wouldn't be conversant in them.

So, you shouldn't worry about someone being able to figure it out very quickly, as they (probably) work with this kind of code a lot. Possibly writing OS code, driver code or other tricky bit manipulation.

+1: Bitwise operations are an important bit of knowledge (no pun intended) for any developer, but they're only really really crucial in specific situations now. If you've never come across them in your day-to-day then having a general knowledge is better than slaving over them. Keep that brain space free.
–
Nicholas SmithOct 28 '11 at 13:21

You should also understand when you would use them and not shy away from their use if they are the correct solution for the problem at hand.
–
user606723Oct 28 '11 at 14:21

To add to @user606723's comment: there are really only a few commonly encountered places where bitwise operations are used, namely hashing (and things related to it) and extracting or setting the individual color channels of an RGB value stored in an int. CPU info, for instance, can also be read by checking bit flags returned from a specific register, but that involves asm and usually has higher-level wrappers if needed.
–
TC1Oct 28 '11 at 17:46
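The RGB case from the comment above can be sketched roughly as follows. This is a minimal illustration only: it assumes the common packed layout 0xRRGGBB, and the class and method names are made up for the example.

```java
final class RgbDemo {
    // Assumed layout: red in bits 16-23, green in bits 8-15, blue in bits 0-7.
    static int red(int rgb)   { return (rgb >>> 16) & 0xFF; }
    static int green(int rgb) { return (rgb >>> 8) & 0xFF; }
    static int blue(int rgb)  { return rgb & 0xFF; }

    // Replaces only the green channel: clear its 8 bits, then OR in the new value.
    static int withGreen(int rgb, int green) {
        return (rgb & ~(0xFF << 8)) | ((green & 0xFF) << 8);
    }

    public static void main(String[] args) {
        int orange = 0xFF8000;
        System.out.println(red(orange) + " " + green(orange) + " " + blue(orange)); // 255 128 0
        System.out.println(Integer.toHexString(withGreen(orange, 0x40)));           // ff4000
    }
}
```

Shift-then-mask for reading and clear-then-OR for writing are the two idioms that cover most of this kind of code.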

If you understand how to solve problems like "determine if bits 3 and 8 are set," "clear bit 5" or "find the integer value represented by bits 7-12" you have enough of an understanding of bitwise operators to check the Can Twiddle Bits box on the "well-rounded" checklist.
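Those three problems might look like this in Java. This is a sketch, assuming bits are numbered from 0 at the least significant end (the text does not say which convention it means); the class and method names are hypothetical.

```java
final class BitBasics {
    // "Determine if bits 3 and 8 are set": build a mask and require every mask bit.
    static boolean bits3And8Set(int x) {
        int mask = (1 << 3) | (1 << 8);
        return (x & mask) == mask;
    }

    // "Clear bit 5": AND with a mask that has every bit set except bit 5.
    static int clearBit5(int x) {
        return x & ~(1 << 5);
    }

    // "Find the integer value represented by bits 7-12": shift the field down
    // to bit 0, then keep its six bits (7 through 12 inclusive).
    static int bits7To12(int x) {
        return (x >>> 7) & 0x3F;
    }

    public static void main(String[] args) {
        System.out.println(bits3And8Set(0x108));                    // true
        System.out.println(Integer.toHexString(clearBit5(0xFF)));   // df
        System.out.println(bits7To12(0x1F80));                      // 63
    }
}
```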

What's in your example comes from Hacker's Delight, a compilation of high-performance algorithms for manipulating small bits of data like integers. Whoever wrote that code originally didn't just spit it out in five minutes; the story behind it is more likely that there was a need for a fast, branch-free way to count bits and the author had some time to spend staring at strings of bits and cooking up a way to solve the problem. Nobody's going to understand how it works at a glance unless they've seen it before. With a solid understanding of bitwise basics and some time spent experimenting with the code, you could probably figure out how it does what it does.

Even if you don't understand these algorithms, just knowing they exist adds to your "roundedness" because when the time comes to deal with, say, high-performance bit counting, you know what to study. In the pre-Google world, it was a lot harder to find out about these things; now it's keystrokes away.

The user that answered your SO question may have seen the problem before or has studied hashing. Write him and ask.

+1 on at least being aware of these things. It is a good thing to know a little about a lot. If people in the industry start talking about stuff like this, you don't want to be the guy in the room who has not the slightest clue what is being discussed.
–
maple_shaft♦Oct 28 '11 at 12:39

From your example there are some things you should absolutely know without really thinking.

i = i - ((i >>> 1) & 0x55555555);

You should recognize the bit pattern 0x555... as an alternating bit pattern 0101 0101 0101 and that the operators are offsetting it by 1 bit (to the right), and that & is a masking operation (and what masking means).

i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);

Again a pattern; this one is 0011 0011 0011. It is also shifting by two this time and masking again. The shifting and masking follow a pattern you should recognize...

i = (i + (i >>> 4)) & 0x0f0f0f0f;

The pattern solidifies. This time it's 00001111 00001111 and, of course, we're shifting by 4 this time. Each time we shift by the width of each run of ones in the mask.

return i & 0x3f;

Another bit pattern: 3f is a block of zeros followed by a larger block of ones.
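Putting the quoted lines together gives something like the sketch below. The walkthrough above skips two lines between the third step and the return; the two folding steps marked "assumed" are filled in on the assumption that the routine has the same shape as the standard library's Integer.bitCount, so treat them as a reconstruction rather than a quote. The class name is hypothetical.

```java
final class BitCountSketch {
    static int bitCount(int i) {
        i = i - ((i >>> 1) & 0x55555555);                 // each 2-bit pair now holds its own count (0..2)
        i = (i & 0x33333333) + ((i >>> 2) & 0x33333333);  // each 4-bit group holds its count (0..4)
        i = (i + (i >>> 4)) & 0x0f0f0f0f;                 // each byte holds its count (0..8)
        i = i + (i >>> 8);                                // assumed step: add neighbouring bytes
        i = i + (i >>> 16);                               // assumed step: add the two 16-bit halves
        return i & 0x3f;                                  // the total is at most 32, so six bits suffice
    }

    public static void main(String[] args) {
        System.out.println(bitCount(0b01001101)); // 4
        System.out.println(bitCount(-1));         // 32
    }
}
```

Comparing the result against Integer.bitCount for a handful of inputs is a quick way to convince yourself that it works before working out why.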

All these things should be obvious at a glance if you are "Well Rounded". Even if you don't ever think you will use it, you will probably miss some opportunities to vastly simplify your code if you don't know this.

Even in a higher-level language, bit patterns are used to store MUCH larger amounts of data in smaller fields. This is why you always see limits of 127/128, 63/64 and 255/256 in games: you have to store so many of these things that without packing the fields you would be forced to use as much as ten times the amount of memory. (The extreme case is storing vast numbers of booleans in an array: packing them could save 32-64 times the memory you would use if you didn't think about it, since most languages implement a boolean as a word, which will often be 32 bits.) Those who don't feel comfortable at this level will resist opportunities to store data like this simply because they are scared of the unknown.

They will also shy away from things like manually parsing packets delivered over the network in a packed format, something that is trivial if you aren't afraid. This could take a game requiring a 1k packet down to requiring 200 bytes; the smaller packet will slide through the network more efficiently, bring down latency and enable higher interaction speeds (which may enable entire new modes of play for a game).
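As a rough illustration of the kind of packing described above, here is a sketch that squeezes four values into 16 bits. The field layout (7 bits of health, 6 bits of ammo, 2 bits of team, 1 alive flag) and all names are invented for the example and do not come from any particular game or protocol.

```java
final class PackedState {
    // Hypothetical layout, low bits first:
    //   bits 0-6   health (0..127)
    //   bits 7-12  ammo   (0..63)
    //   bits 13-14 team   (0..3)
    //   bit  15    alive flag
    static int pack(int health, int ammo, int team, boolean alive) {
        return (health & 0x7F)
             | ((ammo & 0x3F) << 7)
             | ((team & 0x3) << 13)
             | ((alive ? 1 : 0) << 15);
    }

    static int health(int packed)    { return packed & 0x7F; }
    static int ammo(int packed)      { return (packed >>> 7) & 0x3F; }
    static int team(int packed)      { return (packed >>> 13) & 0x3; }
    static boolean alive(int packed) { return ((packed >>> 15) & 1) == 1; }

    public static void main(String[] args) {
        int packed = pack(100, 42, 2, true);
        System.out.println(health(packed) + " " + ammo(packed) + " " + team(packed) + " " + alive(packed));
    }
}
```

The field widths are exactly where limits such as 127 and 63 come from: a 7-bit field cannot hold 128.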

I happened to recognize the code because I've seen it before in software for manipulating video frames. If you regularly worked with things like audio and video CODECs, networking protocols, or chip registers you would see a lot of bitwise operations and it would become second nature to you.

You shouldn't feel bad if your work happens to not coincide with those domains very often. I know bitwise operations well, but I slow way down on the rare occasions I need to write a GUI, because of all the quirks with layouts and weighting and expanding and such that I'm sure are second nature to others. Your strengths are wherever you have the most experience.

The main things you should be aware of are how integers are represented (in general a fixed-length bit vector whose length is platform-dependent) and what operations are available on them.

The main arithmetic operations + - * / % can be used without understanding the representation, though knowing it can be handy for micro-optimizations (most of the time the compiler will be able to take care of those for you).

The bit manipulation set | & ~ ^ << >> >>> requires at least a passing understanding of the representation to be able to use it.

However, most of the time you will only use them to pass bit flags to a method: ORing flags together, passing an int and then ANDing out the settings is more readable than passing several (up to 32) booleans in a long parameter list, and it allows the set of possible flags to change without changing the interface.

Not to mention that booleans are generally stored separately in bytes or ints instead of being packed together the way flags are.
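A minimal sketch of that flag-passing style, with hypothetical flag names and a made-up method; the point is only the OR-on-the-way-in, AND-on-the-way-out shape.

```java
final class TextStyle {
    // Each flag occupies its own bit, so any combination fits in one int.
    static final int BOLD      = 1 << 0;
    static final int ITALIC    = 1 << 1;
    static final int UNDERLINE = 1 << 2;

    // One int parameter instead of three booleans; adding a fourth flag later
    // does not change this signature.
    static String describe(int flags) {
        StringBuilder sb = new StringBuilder();
        if ((flags & BOLD) != 0)      sb.append("bold ");
        if ((flags & ITALIC) != 0)    sb.append("italic ");
        if ((flags & UNDERLINE) != 0) sb.append("underline ");
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(describe(BOLD | UNDERLINE)); // bold underline
    }
}
```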

As for the code snippet, it does a parallel count of the bits. This allows the algorithm to run in O(log(n)), where n is the number of bits, instead of the O(n) of the naive loop.

The first step is the hardest to understand, but if you start from the requirement that it has to replace the bit sequence 0b00 with 0b00, 0b01 with 0b01, 0b10 with 0b01 and 0b11 with 0b10, it becomes easier to follow.

So for the first step, i - ((i >>> 1) & 0x55555555), if we take i to be equal to 0b00_01_10_11 then the output should be 0b00_01_01_10.

(Note that 0x5 is equal to 0b0101.)

If we take i = 0b00_01_10_11, then 0b00_01_10_11 - (0b00_00_11_01 & 0b01_01_01_01) is 0b00_01_10_11 - 0b00_00_01_01, which in turn becomes 0b00_01_01_10.

They could have done (i & 0x55555555) + ((i >>> 1) & 0x55555555) for the same result, but this is one additional operation.
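The claim that the two forms agree is easy to check mechanically. The sketch below, with hypothetical names, evaluates both forms of the first step and can be used to replay the worked example.

```java
final class PairCount {
    // The form used in the snippet: one shift, one AND, one subtraction.
    static int subtractForm(int i) {
        return i - ((i >>> 1) & 0x55555555);
    }

    // The more obvious form: mask the low bit of each pair, mask the high bit
    // of each pair (shifted down), and add them. One extra AND.
    static int addForm(int i) {
        return (i & 0x55555555) + ((i >>> 1) & 0x55555555);
    }

    public static void main(String[] args) {
        int i = 0b00_01_10_11;
        System.out.println(Integer.toBinaryString(subtractForm(i))); // 10110
        System.out.println(subtractForm(i) == addForm(i));           // true
    }
}
```

The subtraction works because a pair with bits hl has the value 2h + l, and subtracting h leaves h + l, which is the number of set bits in the pair.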

Everyone should understand basic bit-wise operations. It is the composition of the basic operations to perform tasks in an optimized, robust way that takes lots of practice.

Those who work with bit manipulation every day (like embedded folks) are, of course, going to develop strong intuition and a nice bag of tricks.

How much skill should a programmer who does not do low-level stuff have with bit-wise manipulation? Enough to be able to sit down with a stanza such as you pasted and work through it slowly like it was a brain teaser or puzzle.

By the same token, I'd say that an embedded programmer should understand as much about http as a web dev understands about bit-wise manipulation. In other words, it is "OK" to not be a wiz at bit manipulation if you're not using it all the time.

Actually in some cases an embedded programmer has to understand more about http than a web developer (I do both). Doing web development, you can usually count on some type of framework. As an embedded developer working with internet-connected devices, I have had to code an http stack from scratch.
–
tcrosleyOct 28 '11 at 16:54

@tcrosely, yes, you're absolutely correct. Perhaps a better example than "http" would have been something like "ORM" or "JEE". The main point is one generally can't have mastery over some subject matter unless they practice it regularly.
–
AngeloOct 28 '11 at 17:07

I agree, and I have never had to deal with either ORM or JEE (just JME back when it was called J2ME).
–
tcrosleyOct 28 '11 at 17:30

The important thing is to know that the obvious algorithm for any task is not necessarily the best. There are a lot of cases where knowing of the existence of an elegant solution to a particular problem is what's important.

If you went to a decent university you should have been required to take a class in Discrete Mathematics. You would have learned binary, octal, and hexadecimal arithmetic and logic gates.

On that note, it is normal to feel confused by that. If it is any consolation, since I primarily write web applications I rarely need to look at or write code like this, but because I understand binary arithmetic and the behavior of the bitwise operators I can eventually figure out what is going on here, given enough time.

As a programmer of mobile phones I had to deal with this sort of thing. It is reasonably common where the device has not much memory, or where transmission speed is important. In both cases, you seek to pack as much information as possible into a few bytes.

I don't recall using bitwise operators in 5 years or so of PHP (maybe that's just me), nor in 10 years or so of Windows programming, although some lower-level Windows stuff does pack bits.

You say "I cannot help but feel stupid when I look at this". DO NOT - feel angry.

You have just met the output of a cowboy programmer.

Does he know nothing of writing maintainable code? I sincerely hope that he is the one who has to come back to this in a year and try and remember what it means.

I do not know if you cut comments or if there were none, but this code would not pass code review where I was s/w QA manager (and I have been a few times).

Here's a good rule of thumb: the only "naked integers" permissible in code are 0 and 1. All other numbers should be #defines, consts, enums, etc., depending on your language.

If those 3 and 0x33333333 had said something like NUM_WIDGET_SHIFT_BITS and WIDGET_READ_MASK, the code would have been easier to read.

Shame on whoever put this out in an open source project, but even for personal code, comment well, use meaningful defines/enums and have your own coding standards.

I'd consider hex constants to also be permissible. 0xFF00 is much more readable (to me) than 0b1111111100000000. I don't want to have to count to determine the number of bits that have been set.
–
Kevin VermeerOct 28 '11 at 16:14

made perfect sense to me in about as long as would take to dictate the code aloud. The events described in bitCount are immediately clear, but it takes a minute to work out why it actually counts bits. Comments would be great, though, and would make understanding what the code does only slightly harder than the hash problem.

It's important to make the distinction between reading and understanding the code. I can interpret the bitCount code, and read off what it does, but proving why it works or even that it works would take a minute. There's a difference between being able to read code smoothly and being able to grok why the code is the way it is. Some algorithms are simply hard. The what of the hash code made sense, but the comment explained why the what was being done. Don't be discouraged if a function using bitwise operators is hard to understand; they're often used to do tricky mathematical stuff which would be hard no matter the format.

An analogy

I'm used to this stuff. One subject that I'm not used to is regex. I deal with them occasionally on build scripts, but never in daily development work.

I know how to use the following elements of a regex:

[] character classes

The *, ., and + wildcards

The start of string ^ and end of string $

The \d, \w, and \s character classes

The /g flag

This is enough to craft simple queries, and many of the queries I see don't stray far from this.

Anything not on this list, I reach for a cheat sheet. Anything, that is, except {} and () - The cheat sheet won't be enough. I know just enough about these guys to know that I'm going to need a whiteboard, a reference manual, and maybe a coworker. You can pack some crazy algorithms into a few short lines of regex.

To design a regex which requires or suggests anything that's not in my list of known elements, I'm going to list out all the classes of inputs that I expect to recognize and put them in a test suite. I'm going to craft the regex slowly and incrementally, with lots of intermediate steps, and commit these steps to source control and/or leave them in a comment so I can understand what was supposed to happen later when it breaks. If it's in production code, I'm going to make sure that it gets reviewed by someone with more experience.

Is this where you're at with bitwise operators?

So you want to be well rounded?

In my estimation, if you're able to interpret what code like this does by pulling out a piece of paper or going to the whiteboard and running through the operations manually, you qualify as well-rounded. To qualify as a good well-rounded programmer in the area of bitwise operations you should be able to do four things:

Be able to read and write common operations fluidly
For an applications programmer, common operations with bitwise operators include the basic operators of | and & to set and clear flags. This should be easy. You should be able to read and write stuff like

Be able to read more complex operations with some work
Counting bits really fast in O(log(n)) time without branches, ensuring that the number of collisions in hashCodes can differ by a bounded amount, and parsing email addresses, phone numbers, or HTML with a regex are hard problems. It's reasonable for anyone who's not an expert in these areas to reach for the whiteboard; it's unreasonable to be unable to begin working to understand.

Be able to write some complex algorithms with a lot of work
If you're not an expert, you shouldn't expect to be able to do complex and difficult stuff. However, a good programmer should be able to get it done by working at it continuously. Do this enough, and you'll soon be an expert :)

If you want to learn these sorts of micro-optimizations I'd suggest that book; it's fun, but unless you are doing very low-level bit programming often you probably won't understand it, and most of the time your compiler will be able to do many of these sorts of optimizations for you.

It also helps to rewrite all the hexadecimal numbers in binary to understand these sorts of algorithms and work through them on a test case or two.

Explanation by example. Data are sequences of bits. Let's count the bits in the byte 01001101, having the following operations available:
1. We can check the value of the last bit.
2. We can shift the sequence.

01001101 -> last bit is 1, total=1. shifts

10100110 -> last bit is 0, total=1. shifts

01010011 -> last bit is 1, total=2. shifts

10101001 -> last bit is 1, total=3. shifts

11010100 -> last bit is 0, total=3. shifts

01101010 -> last bit is 0, total=3. shifts

00110101 -> last bit is 1, total=4. shifts

10011010 -> last bit is 0, total=4. shifts

Our answer: 4.

This was not hard, was it? The big deal with bitwise operations is that there are limited things we can do. We can't access a bit directly. But we can, for instance, know the value of the last bit by comparing it to the MASK 00000001, and we can make every bit be the last one with shift operations. Of course, the resultant algorithm will look scary to those not used to it. Nothing to do with intelligence.
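The procedure traced above can be written as a short loop. This is a sketch with hypothetical names; note that the trace wraps each bit around to the front (a rotation), whereas the loop below uses a plain unsigned shift, which gives the same count and lets it stop as soon as no set bits remain.

```java
final class NaiveBitCount {
    static int countBits(int value) {
        int total = 0;
        while (value != 0) {
            total += value & 1; // check the last bit against the mask 00000001
            value >>>= 1;       // shift so the next bit becomes the last one
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countBits(0b01001101)); // 4
    }
}
```

The unsigned shift >>> matters here: with the signed shift >>, a negative input would keep shifting in ones and the loop would never terminate.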

I wouldn't say you need it unless the work you're doing is related to:

Audio processing

Video processing

Graphics

Networking (particularly where packet size is important)

Huge amounts of data

Storing permissions in Unix-style flags is another use for it, if you have a particularly complex permissions model for your system, or really want to cram everything into a single byte, at the expense of readability.
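A sketch of the Unix-style case, assuming the usual nine-bit rwxrwxrwx mode written in octal (for example 0754). The class, constants and method are hypothetical names for illustration.

```java
final class Permissions {
    // Permission bits within one three-bit group.
    static final int READ = 4, WRITE = 2, EXECUTE = 1;
    // Which three-bit group to look at, counted from the low end of the mode.
    static final int OTHER = 0, GROUP = 1, OWNER = 2;

    // Shift the requested group down to the low three bits, then test the permission bit.
    static boolean allows(int mode, int who, int permission) {
        return ((mode >>> (who * 3)) & permission) != 0;
    }

    public static void main(String[] args) {
        int mode = 0754; // a leading 0 makes this an octal literal in Java: rwxr-xr--
        System.out.println(allows(mode, OWNER, WRITE)); // true
        System.out.println(allows(mode, GROUP, WRITE)); // false
    }
}
```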

Aside from those areas I would count it as a big plus if a developer/senior developer could demonstrate bit shifting, and using | & and ^ as it shows an interest in the profession which you could say leads to more stable and reliable code.

As far as not 'getting' the method on first sight, as mentioned you need an explanation of what it's doing and some background. I wouldn't say it's related to intelligence but how familiar you are with working with hexadecimal on a day-to-day basis and recognising problems certain patterns can solve.