Stuff

I've been doing some work in the video capture routines in VirtualDub lately, and when I do that I tend to just grab the nearest video source as fodder. Usually it's the GameBridge hooked up to a PS2 running a random game, but today I grabbed the webcam, so it's time to take pictures of random stuff around the house.

Take, for instance, this knife sheath I found in the kitchen:

I apologize for the poor quality image — I never said this was a good webcam — but the homemade cardboard sheath says "knife edge here" and there is an NVIDIA logo on it. "GeForce MX" is partly visible and as such I suspect this was the most useful part of the product.

(This is my revenge for the time I got made fun of for using my ATI RADEON X1800 XL, still in its original packaging, as a writing pad.)

...with lots and lots of redundant expressions. I've always wondered why people code like this, because it's an awful practice. Is it concise? No, it's nauseatingly redundant. Is it fast? No, it repeats the same work over and over. My only plausible guess is that it's somehow easier to type for someone adept at cut-and-paste operations, and even then it seems suboptimal when you could write this instead:

I see this anti-pattern most often in C++, but that's mainly because I look at C++ code most often; I've seen it in C# and other languages too. It seems widespread enough that I can only conclude that either:

People are lazy, or

It's being taught somehow in some central venue.

The first is a virtual certainty, but the second is something I can't discount, because I've seen some horrid practices taught in avenues of learning before. Academia is notorious for being littered with brilliant researchers who stink at programming, and so perhaps picking on it is a bit too easy; on the other hand, I still have nightmares about a popular C++ data structures textbook in which the author's code called exit() from a constructor to flag an error and had "friend void main();" in a class declaration.

Probably the most irritating issue with this practice, though, is that I've heard it defended with "the optimizer will fix it." Readability issues aside, this assumes there is an optimizer at all, and in a debug build there isn't. Without an optimizer, explicitly redundant code can execute very slowly, to say nothing of being a pain to step through during debugging. It also puts unnecessary strain on the optimizer in release builds, making it re-deduce simple equivalences over and over when it could be spending those cycles on more interesting optimizations like virtual method inlining and auto-vectorization.

The third problem is that it assumes the optimizer can remove the redundancy at all. In fact, it's fairly likely that it won't, because optimizers have incomplete information and are necessarily very pessimistic. In the above case, an optimizer has to assume that the return value from GetSomeObject() may change with each call unless it has concrete information indicating otherwise — and given code visibility, aliasing, and polymorphism concerns, this determination frequently fails. If there is any valid way in which the return values from the calls would not be the same, no matter how unusual, weird, or useless the code would be, the optimizer cannot collapse the calls. If it does, that's called bad code generation, and nobody wants that.

Don't be gratuitously redundant. Hoist out common subexpressions to variables, and both humans and compilers will appreciate it.

I'm sick with a cold at the moment, so I'll blog twice today. It'll somewhat make up for my not having blogged the entire month.

Visual C++ has a compiler intrinsic called _ReturnAddress() that gives the return address of the calling function. It has existed since at least VC6, but wasn't documented until VC7. It has become my favorite intrinsic, because it enables all sorts of evil debugging hacks and is very fast. One known issue: if the calling function is inlined, the intrinsic returns the return address above the inlining site, but I've found that's actually sometimes desirable behavior.

Got a memory leak where the caller didn't provide file/line information when allocating? Overload operator new, call _malloc_dbg() with the return address as the line number, and track down the call site from the leak report. I use this at shutdown in debug builds of VirtualDub, along with DbgHelp, to print the names of the guilty functions that leaked memory without requiring specially tagged operator new() calls. I also look up the first four bytes of the memory block to check whether it's a virtual method table pointer, which then also gives the type of the leaked object.

If you are doing COM-style (AddRef/Release) reference counting, you've probably discovered what a pain it can be to track down a leak from a missing Release() call. I used to do this by capturing and manually matching the call stacks of every AddRef() and Release() on a particular class, a very slow and laborious technique. VS2005 makes this a little easier because you can sometimes use tracepoints with $CALLSTACK, but it's still slow. My new favorite technique is to maintain a global map of all AddRef() and Release() callers keyed by _ReturnAddress(), and compile call counts by call site. You can then quickly eliminate the smart pointer code, as long as its totals balance, and then determine which call sites are involved and who screwed up manual reference counting.

My latest exploit involving _ReturnAddress() was to deal with a chatty DLL that was spewing to Win32 debug output, making it unusable for anything I was trying to debug. Solution: Patch the export for OutputDebugString() at runtime and redirect it to a function that used _ReturnAddress() to look up the calling DLL. Now I can filter debug traces by DLL.

_ReturnAddress() quickly becomes inadequate for more complex scenarios because it only gives you one level of stack walk. That's often sufficient, though, and since a real stack walk is so much more expensive (and on x86, unreliable) I find it a useful shortcut.

Not only that Avery, just look at yourself using platform specific DirectX API instead of OpenGL -- how did they manage that?

Uh, well, it shouldn't be too surprising. I'm primarily a Win32/x86 programmer. To some people, it's a minor miracle anytime we write code that is in any way portable and doesn't declare at least one HRESULT variable per function.

In this specific case, it's more of an issue of practicality. I know both, I've used both, sometimes I prefer one over the other. I do like portable and multivendor APIs, but for me, deployment, ease of use, stability, and licensing are also factors.

Truth be told, I do like OpenGL better. It has a real spec, written in a precise manner, and the base API is better thought out than Direct3D, which has been hacked together over the years and still has garbage such as the infamous D3DERR_DEVICELOST. I learned a lot about 3D graphics by reading the OpenGL API and extension specs. It's portable, it makes easy things easy (like drawing one triangle), it's extensible, and it's faster since the application talks directly to the user-space IHV driver. Microsoft tried improving its API with Direct3D 10, but the documentation is still incomplete, it's still non-extensible, and it's even less portable than before since it's Vista-only.

Before anyone says anything else, though, theory and practice are a bit different. In practice, OpenGL drivers tend to have their own nice collections of bugs, and some are better optimized than others. Most games use Direct3D, so a lot of effort has been put into optimizing drivers and hardware for that API. Extensions also mean extension hell, and figuring out which extensions are widely available and work as advertised is just as frustrating as Direct3D caps hell. And then there's the shader issue — whereas Direct3D has standardized assembly and HLSL shader languages, OpenGL has gone through numerous vendor-specific combiner-style, assembly-style, and high-level language shader extensions.

Now, as for how this pertains to what I do, well, VirtualDub isn't exactly a 3D-centric application. It does have 3D display paths, and I have dabbled with hardware acceleration for video rendering as well, so here's my VirtualDub-centric take: