Stuff

I've been playing with pitch shifting again. It's not that I've gotten bored with video, but I do like to write little DSP kernels to affect music that I listen to while coding. I could do the same with video, but I don't think I could get any work done with video playing on the side. Especially if I'm reading subtitles.

As I've said before, pitch shifters (err, pitch scalers) can work in either the frequency domain or the time domain. Well, after I got the frequency-domain shifter working, I got a better idea for doing a time-domain shifter. The main problem in creating a time-domain shifter is the cross-correlation step, where you find a well-matching segment within a window to overlap with the end of the last segment you used. One way to do this is to try some sort of hierarchical approximation to zero in on a good match. Since I had just written a radix-4 FFT/IFFT, I got the idea of instead using FFT convolution to efficiently generate cross-correlation results for all positions without approximation. FFT convolution computes the circular convolution of one signal with another at every offset at once, and by sufficiently zero-padding the two signals and reversing one of them -- or equivalently, using the conjugate of its frequency representation -- you can do a full cross-correlation search in O(N log N) time, and then scan the resultant array for the best match. And it sort of worked.
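For reference, the identity behind the conjugate trick can be written out as follows. This is a sketch in assumed notation, not something taken from the shifter's code: x is the search window and y is the block to match, both real-valued and zero-padded to a common length N that is at least their combined length minus one, so that circular wraparound stays out of the results.

```latex
r[k] = \sum_{n=0}^{N-1} x[(n+k) \bmod N]\; y[n], \qquad k = 0, \dots, N-1

R[m] = \sum_{k=0}^{N-1} r[k]\, e^{-2\pi i\, mk/N} = X[m]\, \overline{Y[m]}

r = \mathrm{IFFT}\!\left( \mathrm{FFT}(x) \cdot \overline{\mathrm{FFT}(y)} \right)
```

The best match is then the index k with the largest r[k] among the offsets where the block fits entirely inside the window.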

I was still hearing some artifacts that made the convolution results suspect, so I decided to double-check the results using a brute force routine. That meant searching for a match for a block of 512 paired stereo samples in a window of 1024 samples, every 2048 samples, at 44kHz. Amusingly enough, it still ran in real-time... using regular x87 scalar floating-point code, in a Debug build with optimizations disabled, and with the laptop CPU in low-speed mode. So I just listened to some music for a while, confirmed the difference, dumped the results, fixed the minor bug that was causing the problem, and was back to speedy FFT goodness.
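A brute-force reference of this kind can be very small. The following is a minimal sketch, not the routine described above: `FindBestMatchBruteForce` is a hypothetical name, the input is treated as mono rather than paired stereo, and the score is a plain unnormalized dot product, which is an assumption about the matching metric.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference routine: returns the offset within `window` at which
// `block` correlates best, scored by the raw (unnormalized) dot product.
// Cost is O(window * block), so this is the slow cross-check, not the FFT path.
std::size_t FindBestMatchBruteForce(const std::vector<float>& window,
                                    const std::vector<float>& block)
{
    assert(!block.empty() && window.size() >= block.size());

    // Only offsets where the whole block fits inside the window are tried.
    const std::size_t positions = window.size() - block.size() + 1;
    std::size_t bestOffset = 0;
    double bestScore = 0.0;

    for (std::size_t offset = 0; offset < positions; ++offset) {
        double score = 0.0;
        for (std::size_t i = 0; i < block.size(); ++i)
            score += static_cast<double>(window[offset + i]) * block[i];

        // The first offset always seeds the running best.
        if (offset == 0 || score > bestScore) {
            bestScore = score;
            bestOffset = offset;
        }
    }
    return bestOffset;
}
```

Comparing the offset this returns against the peak position from the FFT path, block by block, is one way to localize a disagreement between the two.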

The wonderful thing about audio signal processing is that, even at 44kHz, the data rate is so low compared to video that you can get away with writing a truly awful and unoptimized version of an algorithm, and it still has a fairly decent chance of running full-speed on a modern CPU. A few years ago I might have had to wait for a WAV file to be written out, but now CPUs are so much faster that I can just launch the reference algorithm and compare. Cool.

(Read More) to see my answers to the Win32 programmer brain teasers I posed last time. If you haven't seen the questions yet, you may want to look at the original blog post again before you read the spoilers.

As usual, I might make mistakes, so if you think one of my answers is incorrect, feel free to comment.

"If I were king for a day, and I were making a programming test...."

You've probably seen interview tests for programmers before -- stuff like "how do you reverse a linked list," "why are manhole covers round," and "write strlen()." There are good reasons for such questions, even if they've been asked a thousand times before, and amazingly enough, they're still capable of weeding out candidates.

Most of these questions are sissy programming questions, though. I want a real programming test. One that determines if you have truly spent time in the Win32 trenches. One that weeds out all but the real programmers... and makes the rest of the interviewees sweat and cry Uncle.

(This is probably why I don't write tests for a living.)

I've thrown together some questions I think would challenge an average Win32 programmer. It's probably different from what others might consider "guru" questions, but oh well. How many can you answer? I'll post the answers (well, my answers) next time.

I would give a warning about a self-indulgent post ahead, but that is inherent in the term "blog."

A year ago I posted a list of RPGs that I had played or was playing, and thought I might as well update it. Besides, not like I have a place to keep such a list besides on this blog. So, the list:

Phantasy Star 4 (Genesis)

Final Fantasy IV Easytype (SNES; FF2US)

Final Fantasy IV Hardtype (SNES)

Final Fantasy V (SNES)

Final Fantasy VI (SNES; FF3US)

Final Fantasy Mystic Quest (SNES)

Chrono Trigger (SNES)

Secret of Mana 2 (SNES)

Seiken Densetsu 3 (SNES)

Romancing SaGa 3 (SNES)

Magic Knight Rayearth (SNES)

Ranma 1/2 Akanekodan (SNES)

Sailor Moon: Another Story (SNES)

Tenchi Muyo RPG (SNES)

Tales of Phantasia (SNES)

Star Ocean (SNES)

Breath of Fire I (SNES)

Breath of Fire II (SNES and GBA)

Breath of Fire III (PSX)

Final Fantasy VII (PSX)

Final Fantasy VIII (PSX)

Final Fantasy IX (PSX)

Final Fantasy Tactics (PSX)

Final Fantasy X (PS2)

Xenosaga I (PS2)

Xenosaga II (PS2)

Star Ocean: Till the End of Time (PS2)

Treasure of the Rudras (SNES)

La Pucelle Tactics (PS2)

Disgaea: Hour of Darkness (PS2)

Atelier Iris: Eternal Mana (PS2)

The Legend of Dragoon (PSX)

Final Fantasy X-2 (PS2)

Breath of Fire: Dragon Quarter (PS2)

Lord of the Rings: The Third Age (PS2)

Final Fantasy Tactics Advance (GBA)

Radiata Stories (PS2)

Atelier Iris 2: The Azoth of Destiny (PS2)

Makai Kingdom (PS2)

Grandia III (PS2)

Legend: Completed since last time, still in progress, started since last time.

Grandia III is a very good game, with a neat battle system. However, it ranks pretty high up in cheese factor with regard to RPG cliches, and it has a lot of the same voice actors as other games. Great, Violetta sounds a lot like Nel Zelpher and I have Ridley Silverlake for a mage....

I'm embarrassed to say that the main reason I haven't finished The Legend of Dragoon is that I can't find the PS1 memory card that has the save game, which was nearly at the end. In fact, up until a few weeks ago I couldn't find my PS1 at all. I think that when I get around to finishing it, though, I'll just play it on an emulator... much faster, and I can transfer my game to and from the real thing if I need to. I love having a DexDrive.

You might have noticed at this point that I spend a lot of time blogging about issues with Visual Studio. The reason is that I spend a lot of time using it -- both in a professional and hobby capacity. When it works well, I work well, and when it doesn't... well, I'm told that I swear a lot and that a Microsoft Natural Keyboard Pro is not meant to be used with a fist. I broke a plastic keyboard tray in college that way.

For the most part, I'm fairly pleased with Visual Studio 2005. However, there is one major area that still causes me problems, and that is the build system -- specifically, that it doesn't work like Visual C++ 6.0's. It's a lot better than the one from Visual Studio .NET, but still has the following flaws, IMO:

It builds all of the active projects in the solution, not just the startup project. There's a "build only startup project on run" option, but no "build only startup project on build" option. Creating new sets of solution configurations for each startup project seems needlessly redundant.

It doesn't stop when a build error occurs, so it takes longer to run through unless I happen to see the error scrolling by, and then there's lots of spew from failed dependent projects that had no chance of succeeding.

It sometimes continues linking dependent projects over and over until I wipe the exes/libs and force a clean relink. Seems to be a file timestamp problem somewhere... I thought it was just the .manifest files, which is a known problem, but I disabled those, and it seems the .pdbs are sometimes affected too. In my case, this causes one of the tool .exes to rebuild... which then causes another project to recompile its .fx files using that tool, and then recompile and relink the resultant data arrays... which then causes that static lib to be relinked into the main executable... aaahhhh!

That blog entry I wrote on Visual Basic's Dim command? Well, that arose out of me trying to write a VBA macro for VS2005 to fix the first problem. It's been suggested that SolutionBuild.BuildProject() with the active configuration will accomplish "build startup project," but it fails miserably if you have both x86 and x64 configurations -- seems that both configurations get the same name and BuildProject() can't distinguish them. Argh. It looks like to fix either the first or the second problem properly I'll need to write an Add-In, which I'm reluctant to do because (a) this should be built-in, and (b) I still have bad memories from when I tried to fix FastSolutionBuild, which crashes if you have C# projects in a solution. And I have no idea how to fix the third.

Oh well. It's still a lot better than when I was making makefiles by hand for use with Lattice C++ or Watcom C... really sucked when I got the .h dependencies wrong and files didn't build when they should have.

If you downloaded Visual C++ Express 2005 hoping to do some low-level programming, you might have run into the problem that VCExpress doesn't include an assembler. This is a problem if you need to build someone else's code and you don't want to port it to use another assembler. (Not advisable unless you plan to debug it.)

Browsing some random links, I happened upon an entry in the Visual C++ Team Blog -- which noted that the Microsoft developer division snuck out a freely downloadable goody in the past week:

In some programming languages, you must declare a variable before use, with a specific type:

ItemList count;

In other languages, you can declare a variable which can hold any type:

var item = document.getElementById('myForm');

In yet other languages, variables can be implicitly declared on use:

logOutput = DoCommand("somestring")

And then... there's Visual Basic.

Dim output As OutputWindow = env.GetOutputWindow()

Yuuuuuck! How do people stand this??

I don't ordinarily write Visual Basic code, but dug in a little bit in order to write a macro to provide the missing BuildStartupProject command that VS2005 still lacks. I'm not opposed to BASIC, but this has got to be the ugliest syntax hack ever, considering that DIM stands for "dimension" (it was originally used to declare sized arrays).

I've been meaning to put this together for a while, but only now got around to doing so.

Here is a list of all of the issues I have run into when porting VirtualDub from Visual C++ 6.0 to Visual Studio 2005. Where possible, I have annotated the issues with links to pertinent bug reports in Microsoft Connect (some are mine). A few of these issues are due to bugs that are supposed to be fixed in the upcoming Visual Studio 2005 Service Pack 1.

When it comes to installing VS2005, expect to need about 2GB of space. I have VC++ with x64 support installed (~950MB) and most of MSDN (~800MB). I still have VC6 and VS2003 installed side-by-side with no apparent conflicts.

At this point, I've basically stopped using Visual C++ 6.0 except to maintain VirtualDub 1.6.x, as VS2005 is good enough to replace it for 1.7.x and for all the little random projects that I do on the side. Well, that, and I've become addicted to F12 (Go to Definition) and to opening files by filename (Ctrl+D, >open filename).

There seems to be some confusion about whether legacy x87/MMX registers and instructions can be used in 64-bit code that runs on the x64 Edition of Windows. Part of the reason for this confusion is some early x64 ABI documentation that was published in the Windows Driver Development Kit (DDK):

The MMX and floating-point stack registers (MM0-MM7/ST0-ST7) are volatile. That is, these legacy floating-point stack registers do not have their state preserved across context switches.

This statement led many (including me) to believe that x87/MMX code could not be used at all in x64 applications, despite it being supported by the CPU in long mode, and even on other operating systems, e.g. Linux. This didn't make sense, given that FXSAVE and FXRSTOR save and restore those registers along with the SSE registers. Empirical testing confirmed that the registers are in fact saved and restored in user mode, and using them seemed to incur no ill effects. Fortunately, this was later cleared up by the SWConventions.doc file in recent Platform SDKs, and finally made official in the x64 calling conventions documentation that shipped with Visual Studio 2005:

The MMX and floating-point stack registers (MM0-MM7/ST0-ST7) are preserved across context switches. There is no explicit calling convention for these registers. The use of these registers is strictly prohibited in kernel mode code.

At this point, I believe there is no harm in using x87 or MMX code on x64. It is annoying to do so, since the compiler supports neither inline assembly nor MMX intrinsics, and so you have to use ML64 and explicit assembly, but it will work. Now, whether you should is a different question. There are some constructs that don't necessarily translate well from MMX to SSE2, and I don't see it as a given that code will run faster when rewritten to use the latter, at least without major redesign. Adapting legacy code is also not trivial since you have to rewrite it for 64-bit pointers and to have the correct function prologues and epilogues. At least, though, it does seem that you have a choice.

When it comes to high-precision timing on Windows, many have gotten used to using the CPU's time stamp counter (TSC). The time stamp counter is a 64-bit counter that was added to most x86 CPUs starting around the Pentium era, and which counts up at the clock rate of the CPU. The TSC is generally readable via the RDTSC instruction from user mode, making it the fastest, easiest, and most precise time base available on modern machines.

Alas, it is rather unsafe to use.

The first problem you quickly run into is that there is no easy way to accurately and reliably determine the clock speed of the CPU, short of perhaps doing calibration over a longish period of time. Sometimes you don't need super accuracy or only need to deal with timing ratios, in which case this doesn't matter. However, you're still screwed when you discover that on CPUs with speed switching, the speed at which the TSC counts will change when the CPU speeds up or slows down, which makes the TSC's rate swing all over the place. And if this weren't enough, the TSC is not always synchronized on dual-core or SMP systems, meaning that the reading from the TSC will jump back and forth by as much as 0.2ms as the kernel moves your thread back and forth across the CPUs. Programs which do not have adequate safety protection may be surprised when time momentarily runs backwards.
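One common defensive measure against the backwards-jump problem is to clamp readings so that callers never see a value lower than one already returned. A minimal sketch, assuming a hypothetical `MonotonicClamp` helper (not part of VirtualDub or any Windows API) that is fed raw tick values from whatever counter is in use. It only hides backward jumps; it does nothing about rate changes, and it is not thread-safe as written.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: filters raw counter readings so that the values handed
// to callers never decrease, even if the underlying counter jumps backwards.
class MonotonicClamp {
public:
    // Returns the raw reading, or the highest reading seen so far if the
    // raw reading is lower than that.
    std::uint64_t Filter(std::uint64_t raw)
    {
        if (raw > mLast)
            mLast = raw;
        return mLast;
    }

private:
    std::uint64_t mLast = 0;  // highest reading observed so far
};
```

During a backward jump the filtered time simply stalls until the raw counter catches up, which is usually easier for downstream code to tolerate than a negative interval.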

For reasons like these, Microsoft now recommends that you use QueryPerformanceCounter() to do high-precision timing. What they don't tell you, though, is that QPC() is equally broken.

The documentation for QueryPerformanceFrequency() says that not all systems have a high-performance counter. Truth be told, I've never seen a system that didn't support QPF/QPC, including ones running Windows 98, NT4, and XP. However, the timer that is used can vary widely between systems. On Win9x systems that I've seen, QPF() returns 1193181 -- which looks suspiciously like the clock rate of the venerable 8253/8254 timer. On a PIII-based Windows 2000 system, I got 3579545, which happens to be the frequency of the NTSC color subcarrier, but is probably just a factor of a common clock crystal used by some chipset timer. And I've also seen the CPU clock speed show up, or CPU clock divided by 3.
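Because the reported frequency varies this much between machines, any code that converts counter ticks to real time has to use the frequency queried at run time rather than a hard-coded constant. A minimal sketch, assuming a hypothetical `TicksToMicroseconds` helper that takes the tick delta and the frequency as plain integers (on Windows these would come from QueryPerformanceCounter() and QueryPerformanceFrequency()).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: converts a tick delta to microseconds (rounded down)
// given the counter frequency in ticks per second.
std::uint64_t TicksToMicroseconds(std::uint64_t ticks, std::uint64_t frequency)
{
    assert(frequency > 0);

    // Whole seconds and leftover ticks are converted separately so that the
    // multiply by 1000000 cannot overflow 64 bits for large tick counts.
    // (Assumes the frequency itself is far below 2^64 / 1000000.)
    const std::uint64_t seconds   = ticks / frequency;
    const std::uint64_t remainder = ticks % frequency;

    return seconds * 1000000u + (remainder * 1000000u) / frequency;
}
```

With the Win9x value above, TicksToMicroseconds(1193181, 1193181) comes out to 1000000, i.e. exactly one second.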

Some of these timers used for QPC also have bugs.

When I was looking at some anomalous capture logs from one of my systems, I noticed that the global_clock values from the capture subsystem, which were recorded in the capture log, occasionally jumped forward or backward by a few seconds compared to the video capture clock. (While video capture drivers are notoriously flaky, there were no gaps in the video and I'm pretty sure my PlayStation 2 didn't burp for three seconds.) When I tried Windows XP x64 Edition, the HAL used the CPU TSC for QueryPerformanceCounter() without realizing that Cool & Quiet would cause it to run at half normal speed. And recently, I've had the pleasure of seeing a dual-core system where use of the TSC exposed QPC-based programs to the same CPU-mismatch bug that RDTSC incurred. So, realistically, using QPC() actually exposes you to all of the existing problems of the time stamp counter AND some other bugs.

So, what to do?

I switched VirtualDub's capture subsystem from QueryPerformanceCounter() to timeGetTime(). I had to give up microsecond precision for only millisecond, but it's more reliable. If you don't really need high precision, you can use GetTickCount(), which has terrible precision on Win9x (55ms), but it's reliable, and it's fast, since it just reads a counter in memory. If you're a user suffering from this problem, you can try fixing the problem by adding /usepmtimer to the end of the boot.ini entry, which switches QPC() to use an alternate timer (usual disclaimers apply; back up data before trying; no purchase necessary; void where prohibited).

I figured out why VirtualDub causes Aero Glass to go nuts on Vista beta 2, and the actual reason turned out to be rather bizarre. Had nothing to do with window hierarchy layout.

Aero Glass uses the Desktop Window Manager (DWM) to composite the display, so certain drawing operations from programs that the DWM can't emulate will cause the DWM to disable composition until the program exits. Typical operations that kill the DWM are those that access the full display or those that appear to do so (because it can't be determined that the operation is limited to a window). Well, I'm not locking the backbuffer or doing anything naughty, but it turns out this was what was disabling the DWM:

Merely calling GetSurfaceDesc() on the primary surface causes the DWM to disable itself. Changing this to use GetDisplayMode() fixed the problem, and it turns out that even good old DirectDraw 3 will work just fine with Aero Glass as long as you use a clipper when you blit to the primary. If you don't use a clipper, bye bye DWM.

Another oddity of beta 2 is that it seems you can create a DirectDraw overlay surface, but it never shows up and you see nice color key green instead. It's not a problem with the color key, because I can disable it and I still won't see an overlay. Sigh.