I stayed a couple of days extra to enjoy Prague, and Holly flew out from Helsinki to revisit Prague where she’s been once before—for our honeymoon.

I dragged my D750 and the two lenses with me and made to sure to take photos (almost) all the time. The gallery contains only a handful of the approximately 1100 raw files. Of those, I selected 11 for this blahg post.

Prague offers a lot to see. The few photos I selected for this blahg post don’t show anywhere near enough of it. There are more photos in the gallery, but even those are merely highlights of what one can see in Prague. If you haven’t been to Prague yet, I highly recommend a trip.

Last month at work I got to try to optimize a function that takes a number and rounds it up to the next power of 2. The previous implementation used a simple loop. I didn’t dive into obscure bit twiddling, but rather used a helper function that is already in the codebase. Yes, I let the compiler do the heavy lifting of turning easy to understand code into good machine code. The x86 binary that gcc 6.3 produced has an interesting idiom, and that’s why I’m writing this entry.

The first 6 instructions contain the prologue and deal with num being zero or one—both cases produce the result 1. The remaining 6 instructions make up the epilogue and are where the calculation happens. I’m going to ignore the first half of the function, since the second half is where the interesting things happen.

First, we get the number of leading zeros in num - 1 and stash the value 32 in a register:

This xors the number of leading zeros (i.e., 0–31) with 31. To decipher what this does, I find it easier to consider the top 27 bits and the bottom 5 bits separately.

operand

binary

0x1f

00000000 00000000 00000000 000

11111

edx

00000000 00000000 00000000 000

xxxxx

The xor of the top bits produces 0 since both the constant 31 and the register containing any of the numbers 0–31 have zeros there.

The xor of the bottom bits negates them since the constant has ones there.

When combined, the xor has the same effect as this C expression:

out = (~in) & 0x1f;

This seems very weird and useless, but it is far from it. It turns out that for inputs 0–31 the above expression is the same as:

out = 31 - in;

I think it is really cool that gcc produced this xor instead of a less optimal multi-instruction version.

The remainder of the disassembly just subtracts and shifts to produce the return value.

Why xor?

I think the reason gcc (and clang for that matter) produce this sort of xor instruction instead of a subtraction is very simple: on x86 the sub instruction’s left hand side and the destination must be the same register. That is, on x86 the sub instruction works as:

x -= y;

Since the destination must be a register, it isn’t possible to express out = 31 - in using just one sub.

Anyway, that’s it for today. I hope you enjoyed this as much as I did.

This is the first of hopefully many posts related to interesting pieces of code I’ve stumbled across in the dovecot repository.

Back in 1999, C99 added the bool type. This is old news. The thing I’ve never seen before is what amounts to:

struct foo {
bool a:1;
bool b:1;
};

Sure, I’ve seen bitfields before—just never with booleans. Since this is C, the obvious thing happens here. The compiler packs the two bool bits into a single byte. In other words, sizeof(struct foo) is 1 (instead of 2 had we not used bitfields).

The compiler emits pretty compact code as well. For example, suppose we have this simple function:

It’s been a hectic couple of weeks, and so this post is a bit delayed. Oh well.

A couple of months ago, I decided that it was time for me to move on work-wise. As a result, four weeks ago, I joined Dovecot Oy (a part of Open-Xchange).

As you may have guessed from the name of the company, I get to spend my time making the Dovecot email server code better, more featureful, and otherwise more excellent. It is certainly a significant but fun change—going from kernel hacking on a fairly unknown operating system to hacking on the world’s most popular IMAP server. Not a day goes by where I’m not surprised just how much functionality is in the Dovecot codebase, or when I get to consult an RFC related to some IMAP extension I didn’t even know existed.

So, with this said, you should expect to see some posts related to Dovecot, Dovecot code, and email in general.

As you may or may not know, Nexenta has a small office in Massachusetts, and so I end up in Lowell a couple of times a week. Lowell is a decent size city with a history of textile production. As a result, the city is peppered with old mills, most of which have been converted to office and apartment buildings and a couple serve as museums.

The Wannalancit mill is one of the mills that ended up turning into an office building. (It is connected to the adjacent Suffolk mill so I often forget that technically they are separate buildings. You’ve been warned.)

The thing that makes this mill more interesting is that the National Park Service maintains an operational water turbine in the basement. The turbine turns a large flywheel (I am guessing it is about 5 m in diameter).

The turbine itself is in the “basement” along with other goodies. The basement is not very well lit, but the D750 performed quite well even at ISO 5000–8000.

The turbine is geared to the flywheel.

Finally, here is the turbine (inside the red-brown metal object in the background) and the governor (green machine in the foreground) controlling the amount of water entering the turbine and therefore the amount of energy getting stored in the flywheel.

The park service shows up in the mornings to “turn-on” the turbine.

A close-up of the governor:

The table on the left was used for repairing of the 2 cm thick leather belts. I got the impression that the four section cabinet housed containers of oil used to lubricate various parts around the mill.

There are more than three times as many photos in the full gallery. Enjoy!

Recently, I blahgd about GCC optimizing code interestingly. There, I mentioned a couple of bugs I’ve stumbled across. I’m going to talk more about them in this post.

ddi_get_time

It all started when I got assigned a bug at work. “The installer hangs while checking available disks.” That’s the extent of the information I was given along with a test system. It didn’t take long to figure that devfsadm -c disk was waiting on a kernel thread that didn’t seem to be making any progress:

The function of interest here is ibdm_ibnex_port_settle, but before I talk about it I need to mention that the ibdm kmod stashes a ddi_get_time timestamp of when the HCA attached. Now, ibdm_ibnex_port_settle calls ibdm_get_waittime to get a delay to feed to cv_reltimedwait. The delay is (more or less) calculated as: ddi_get_time() - hca_attach_time. This works fine as long as ddi_get_time continues incrementing at a constant rate (1 sec/sec).

You may already see where this is going. The problem is that ddi_get_time returns a Unix timestamp based on the current time-of-day clock. If the TOD setting changes for whatever reason (daylight saving time adjustments, NTP, etc.), the value returned by ddi_get_time may change non-monotonically. This makes it unsuitable for calculating timeouts and wait times. Converting ibdm_get_waittime to use a monotonic clock source (like gethrtime or ddi_get_lbolt) fixes this bug. (Illumos bug 4777)

Things get a bit worse. While figuring out what ddi_get_time does, I noticed that the man page actively encouraged developers to use it for timeouts. (Illumos bug 4776)

NANOSEC

While developing the series of fixes mentioned in the previous section, I ran into the fact that NANOSEC was defined as 1000000000. This made it an int — a 32-bit signed integer (on both ILP32 and LP64).

If NANOSEC (defined this way) is used to convert seconds to nanoseconds (by multiplying), the naive approach will fail with quantities larger than 2 seconds. For example (hrtime_t is a 64-bit signed int):

hrtime_t convert(int secs)
{
return (secs * NANOSEC);
}

Since both secs and NANOSEC are integers, the compiler will compute the product and then sign extend the result to 64-bits. If you look around the Illumos codebase, you’ll see plenty of places that cast or use ULL or LL suffix to make the compiler do the right thing. Why not just change the definition of NANOSEC to include a LL suffix releaving the users of this tedious (and error prone!) duty? Well, now you know what Illumos bug 4809 is about. :)

So, I changed the definition and rebuilt everything. Then, using wsdiff (think: recursive diff that understands how to compare ELF files) I found two places where the before and after binaries differed for non-trivial reasons. (I define a trivial reason as “the compiler decided to use registers differently, but the result is the same”.) Each non-trivial difference implies that there was an expression that changed — it used to be busted!

The first difference was in ZFS (Illumos bug 4810). There, spa_async_tasks_pending miscalculated a timeout making the condition always true.

The second difference was in in.mpathd. 4811). This daemon has a utility function to convert a struct timeval into a hrtime_t. You can read more about it in my previous post.

Before the NANOSEC change, I would have needed casts to fix this. With the change in definition, I don’t have to change a thing! And that’s how a one liner closed three bugs at the same time:

In case you missed it, back in mid-2011 I discoveredIllumos and OpenIndiana. At that point, I already missed hacking on the (Linux) kernel. Based on my blahg posts [1,2], it shouldn’t surprise you that it didn’t take long before I wanted to hack on the Illumos kernel…and so I did.

If you ever contributed to an open source project in your free time while employed full-time, you understand that there’s only so much time you can devote to the open source project and therefore there is only so much you can do.

A couple of months ago, I decided to explore the possibility of working full-time on Illumos. There are only a handful of companies that visibly participate in the Illumos ecosystem, but their use of Illumos is pretty varied (from public clouds to virtualized databases to SAN/NAS appliances). As of this past Tuesday (Monday was a holiday), I’m at Nexenta. At least for now, I’m working remotely (from Ann Arbor) with the fine folks in the Lowell office. It feels great to work on open source again.

Recently, I looked into various build system in hopes of finding one that sucks less than the custom built (not by me) one I had the “pleasure” of dealing with for about a month. The requirements were that it had to work on Windows, Linux, and Mac OS X. Of all the possibilities, CMake looked really promising. After successfully convincing the others that CMake was better for the project, most of the code got switched over. I have to say, CMake is nice.

Since then, I have replaced Makefiles in several of my projects (e.g., my blogging system) with CMakeLists.txt. Not only is the console output cleaner, but it does proper dependencies, some sanity checks, and in general simplifies one’s life. I am actually considering switching HVF build system to it.

Very interesting…over the past month, it dawned on me that distance of 500 miles came up a number of times. For example, my commute to work is 500 miles/week. Another example is the distance between Long Island and Ottawa is just under 500 miles. There are others but they are somewhat boring :) The Ottawa number is not something completely random…you see, I’m heading to the airport in little over an hour to fly to Ottawa for the Ottawa Linux Symposium 2007! OLS 2007, here I come!