Stuff

I just opened Firefox and got a dialog stating that a software update to 1.5.0.5 had been downloaded and was ready to install. And, of course, I immediately groaned. Why?

Because somewhere between the last two software updates, my arrow keys and page-up/page-down started to intermittently fail. The apostrophe (') would also bring up Find when I typed it in text boxes. I use the keyboard a lot when web browsing, so for me this is REALLY ANNOYING. It got so bad that I was seriously considering switching to Internet Explorer 7 beta, but quickly squashed that idea once I found a viable workaround -- to create a New Window, close it, and click on the page. And even with this, I would still want to go back to 1.5.0.2 if it weren't for the security issues.

If you want to know why people are reluctant to patch, it's simple: patching breaks stuff. Ask anyone who tried Windows NT Service Pack 2 or 4. Nobody wants to keep using broken software, but they'll continue doing so if their workflow is disrupted every time an update is installed. The risk of regressions increases when non-critical changes are included in the patch. For instance, let's take the release notes for 1.5.0.5:

What's new: Improvements to product stability. That's good. Several security fixes -- that's really good. Added changes to Frisian locale (fy-NL)... huh? Why is this in a security update that's being delivered through the automatically-installed-and-tell-later channel? Why couldn't this have waited and is it worth the regression risk?

Now, I can't blame the Mozilla team for accidentally letting a bug through, especially since reproducibility is really bad and it's been sporadically appearing and disappearing according to Bugzilla history. Certainly, making a locale change isn't the worst abuse of a security update that I've seen -- releasing "Windows Genuine Advantage Notifications" as a critical update was a really f#*$&ing stupid idea. Still, when I am asked to download a security update, I want it to hold only security fixes, and software vendors need to recognize that patching involves risk to the user even if it does fix serious security issues.

(And before someone posts a you-should-fix-it-since-it's-open-source comment, I tried. After trawling all over the wiki to get the randomly placed build tools for Win32 that aren't in the source archive, I gave up after I got "nsidl.exe Failed -- Error 57" eight levels deep in recursive calls to "make" within a 200MB source code tree. I can't deal with a build system like that.)

The native debugger in Visual Studio has long had an underadvertised feature called autoexp.dat, which is a file in the Packages\Debugger folder that allows you to control several aspects of the debugger. The features you can control in autoexp.dat include: the string that is displayed for types in the variable panes, which functions the debugger will skip when stepping, and string names for COM HRESULTs. The first is the most interesting and useful, but unfortunately it doesn't support complex expressions. If you try to access more than one field in an expression:

MyType=size=<m_last - m_first, i>

...the debugger simply displays ??? instead. You can get around these limitations by writing an expression evaluator plugin, which can read any of the debuggee's memory, but in the end you're still limited to outputting only a single short string.
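For reference, here's what working single-field [AutoExpand] rules look like. The tagPOINT entry is the style of rule that ships in the stock file; MyType and m_size are made-up names for illustration:

```
[AutoExpand]
; shows a POINT in the watch window as "x=10 y=20"
tagPOINT =x=<x> y=<y>
; one field per <> expression works fine; <m_last - m_first,i> does not
MyType =size=<m_size,i>
```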

In Visual Studio 2005, a powerful feature has been added to autoexp.dat in the form of the [Visualizer] section. This section too contains mappings from types to display form, but these have a whole language for evaluating the object -- and unlike the regular [AutoExpand] templates, you can also affect the contents when the object is expanded. This means you can actually view the contents of an STL set, not just see raw red/black tree nodes.

Now, the [Visualizer] section is undocumented, and there's a comment at the top of the section that says DO NOT MODIFY. For those of you who are new to Windows programming, that means "edit at will." The problem is then deciphering the visualizer language.
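To give a taste, here is the rough shape of a rule, modeled from memory on the std::vector entry that ships in the VS2005 file. The _Myfirst/_Mylast member names belong to that particular STL implementation, so treat the details as approximate:

```
[Visualizer]
std::vector<*>{
    preview (
        ; one-line summary for the value column: "[size](elements)"
        #( "[", $e._Mylast - $e._Myfirst, "](",
           #array(expr: ($e._Myfirst)[$i], size: $e._Mylast - $e._Myfirst),
           ")" )
    )
    children (
        ; what appears when you expand the node: the actual elements
        #array(expr: ($e._Myfirst)[$i], size: $e._Mylast - $e._Myfirst)
    )
}
```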

Now, I like assembly language, and the asm code is indeed correct compared to the original high-level code. However, it's a perfect example of a bad use of assembly language to optimize. Looking at it, there are a ton of missed opportunities, such as changing the divide to a shift, removing the and operation that's a no-op, and so on. Well, except that the statement immediately before the block is this:

mov seed, 1

Basically, the fragment above computes two constant expressions -- and no, there isn't a branch target in between. And actually, due to a bug, the code wouldn't actually work otherwise. (Hint: Errant mov.) Makes you wonder why the coder didn't just use this:

mov s1, 1
mov s2, 0

The rest of the translated function, by the way, was equally faithful and awful. The worst part was that the original source had a big comment block indicating how the algorithm could be easily sped up by an order of magnitude, which was of course ignored.

If you're going to go to assembly language for speed, your job is to optimize based on knowledge that the compiler lacks, such as restricted values in data, and not to act as a really slow compiler with no optimizer. Merely translating a routine verbatim into asm (and in this case, really bad asm) is a total waste of time.

A few days ago, I finally solved a mystery that had been annoying the heck out of me for years.

Ever since I moved to Windows XP, I had been seeing a weird problem where occasionally, when a program that I had been working on had crashed with an access violation, the Visual C++ 6.0 debugger would stop responding after I dismissed the exception dialog. Soon thereafter, nearly everything else would also lock up, except for the Alt+Tab popup and console windows (particularly command prompts). The GUI programs weren't completely dead, but they ran really slowly, to the point that I could wait over ten minutes just for Visual Studio to redraw. CPU load was not the problem, or else the laptop fans would have gone on. Killing the debuggee process didn't work; the command would go through, but nothing would happen. I only knew three solutions to the problem: spamming Shift+F5 into the debugger (took way too long), killing the debugger process using TASKKILL /F /PID (lost work), and logging off (took too long and lost work). Very frustrating.

At times I thought that DirectShow or Spy++ were to blame, since the problem seemed to occur more often when those were involved... but I couldn't nail anything down. I also thought that it was an issue with Visual C++ 6.0, since it seemed to happen less frequently with Visual Studio .NET 2003, but I had it happen on that version too. I even dragged out the kernel debugger at one point and hard broke into the system when it happened, but couldn't see anything out of the ordinary. So, basically, it was one of those rarely occurring but intensely annoying bugs that I couldn't resolve.

Then... it happened with Visual Studio 2005. Target process, then debugger, and finally the whole system frozen. What was unusual this time was that the app that broke was HTML Help, since that's what I have VS launch when I compile VirtualDub's help file... and it hadn't crashed! By chance I thought to attach NTSD to devenv.exe (ntsd -pv -p <pid>), which worked since NTSD is a console-mode app... and after dumping the thread stacks and running Sysinternals Process Explorer veeeerrrryyy sloooowwly I finally figured out what had been pissing me off all this time.

Now that we've covered averaging bitfields, how do we efficiently alpha blend with a factor other than one-half?

Alpha blending is normally done using an operation known as linear interpolation, or lerp:

lerp(a, b, f) = a*(1-f) + b*f = a + (b-a)*f

...where a and b are the values to be blended, and f is a blend factor from 0-1 where 0 gives a and 1 gives b. To blend a packed pixel, you could just expand all channels of the source and destination pixels and do the blend in floating point, but it really hurts to see this, since the code turns out nasty on practically any platform. Unless you've got a platform that gets pixels in and out of a floating-point vector really easily, you should use integer math for fast alpha blending.
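As an illustration (not VirtualDub's actual code), here's one common way to do the integer blend on packed 32-bit ARGB pixels, processing two channels per multiply. The function name and the 0..256 factor convention are my choices:

```c
#include <stdint.h>

/* Blend two packed ARGB8888 pixels with an integer factor f in 0..256,
   where f=0 gives a and f=256 gives b. The pixel is split into R+B and
   A+G halves so each 8-bit channel has 8 zero bits above it, giving the
   weighted products room to accumulate without crossing into the
   neighboring channel. */
uint32_t lerp_argb(uint32_t a, uint32_t b, uint32_t f) {
    uint32_t rb = ((a & 0x00FF00FFu) * (256 - f) +
                   (b & 0x00FF00FFu) * f) >> 8;
    uint32_t ag = ((a >> 8) & 0x00FF00FFu) * (256 - f) +
                  ((b >> 8) & 0x00FF00FFu) * f;
    return (rb & 0x00FF00FFu) | (ag & 0xFF00FF00u);
}
```

Each per-channel sum is at most 255*256, which fits in 16 bits, so the two channels in each half never interfere.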

If you need to average two sets of bitfields using larger word operations, there is a trick derived from a hardware algorithm for constructing adders that can come in handy. The basic idea comes from the design of a carry-save adder, which splits apart carry and sum signals to allow multiple values to be added with only one carry propagation pass at the end:

a+b = ((a & b) << 1) + (a ^ b)

(& is the C operator for bitwise AND, ^ is bitwise exclusive OR, and << is a left shift.)

To convert this into an average, propagate through the necessary shift:

(a+b) >> 1 = (a & b) + ((a ^ b) >> 1)

...and to apply this to bitfields, just mask off the least significant bits from each bitfield sum to avoid cross-field contamination:

(pel0+pel1) >> 1 = (pel0 & pel1) + (((pel0 ^ pel1) & 0xfefefefe)>>1)

This can be useful even if you have SIMD hardware that has built-in averaging. For instance, SSE has support for unsigned byte averages, but the above algorithm can be used to average four 565 pixels at a time by using the mask 0xf7def7def7def7de.
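To make the bitfield average concrete, here's a sketch for the byte-packed case from above (the helper name is mine):

```c
#include <stdint.h>

/* Average four packed 8-bit channels at once, rounding down.
   The AND term holds the shared (carry) bits, and the masked XOR>>1
   term is the per-byte half-sum, with the mask suppressing bits that
   would otherwise shift across byte boundaries. */
uint32_t avg_down_8888(uint32_t a, uint32_t b) {
    return (a & b) + (((a ^ b) & 0xFEFEFEFEu) >> 1);
}
```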

Note that the above algorithm always rounds down, which isn't always appropriate; MPEG motion prediction, for instance, requires rounding up. There is a trick that can be used to average up without using any additional operations.
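The entry ends on that teaser, but for completeness, the identity I believe it's pointing at swaps AND for OR: since a+b = (a|b) + (a&b), we get (a+b+1)>>1 = (a|b) - ((a^b)>>1), which uses the same number of operations. A sketch for the packed-byte case (helper name mine):

```c
#include <stdint.h>

/* Average four packed 8-bit channels at once, rounding up.
   Derivation: a+b = (a|b) + (a&b), so (a+b+1)>>1 = (a|b) - ((a^b)>>1);
   the mask again keeps shifted bits within their own byte. */
uint32_t avg_up_8888(uint32_t a, uint32_t b) {
    return (a | b) - (((a ^ b) & 0xFEFEFEFEu) >> 1);
}
```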