How about a basic example. When a user logs in, set a bit in a bitset at the location of that user’s ID number. If you have a bitset allocated for each day, you can tell for any given day how many users logged in by looking at the cardinality of the bitset. Want to see if a particular user logged in on a particular day? Just check the location in the bitset for that user’s ID for the day in question for a 1 value. You can also perform more advanced logging, taking the union of multiple sets, or the disunion, to determine various statistics.

The theory behind it is simple and sound. It’s faster than hitting an RDBMS for values that are binary in nature, and the ability to apply basic set theory to your bitsets to analyze your metrics is quite powerful.

I began to use this method and the code examples on the Spool blog to create metrics in a variety of systems, not to mention create silly stuff like prime number tables. It only took a few implementations to realize that the code examples, taken at face value, don’t really work.

The Problem with BitSet.valueOf() and BitSet.toByteArray()

The heart of the problem lies in the output of Java’s default BitSet.valueOf() method. Here is one of the examples on the page:

If you use the Jedis setbit method to set all of your individual bits, then read the entire set out with BitSet.valueOf(), Java reads the bytes as though they were right to left, whereas Redis stores the values in a straight line. Left to right. The bit sex, as it is called, is reversed in this case, and you can’t possibly get an accurate bitset out of Redis if you retrieve it and convert it using plain ol’ BitSet.valueOf(). You have to have a ‘tweener method to flip the bit sex for you.

You might also think, though it isn’t in the examples, that simply performing a BitSet.toByteArray() would create a byte array appropriate for storage in Redis to be read back via redis.getbit(); Not so. Java uses its native byte order for each call. This confuses things greatly, because if you set the bitset using BitSet.toByteArray() and read it back using BitSet.valueOf(), everything looks correct. Try to read a bit out of this array and be prepared for a surprise.

And you can see the gist I created to read bytes out and put bytes in, retaining the integrity both ways, and show how things do and don’t work:

You can run that locally to get something like a code story…

So, whether or not the guys over at Spool have an unfortunately named helper method that looks exactly like the native Java one, or use some other methodology to maintain bit order in their bitsets I can’t say. It goes without saying that you should check and double-check any ol’ method you pull off the streets.