(Skip to the update)
OK, so with some help from Pierre-Louis from Intel I've managed to get it working and run some performance/power tests. But let me start at the beginning: recently, pulseaudio not only switched to a more power-efficient (and otherwise improved) timing system, which as far as I understand is a callback API. It also gained the infrastructure to use ALSA devices without causing any interrupts ("period wakeup disabling"), so your CPU can stay longer in standby mode (e.g. higher "C6 residency"), saving you power and avoiding playback glitches at the same time. See here and here for more background information. As of kernel 2.6.38, the first driver, snd-hda-intel, supports this infrastructure out of the box. This combination is what I tested for power efficiency...
And the results are impressive and a bit surprising. All tests were done on my netbook (poulsbo), optimized to reduce wakes. As the player I use ogg123 -q, which was much better than e.g. mplayer -quiet in my experience. I'm running kernel 2.6.39, and I needed to get alsa-lib 1.0.24 and pulseaudio-git and compile them from source, first installing alsa-lib, then compiling and installing pulseaudio. And yes, I've got KDE running in the background, so these measurements would be more exact without it, but I think its influence is well below anything significant, as you'll see.

Below I will show the powertop output in different configurations and then draw conclusions.

Pierre-Louis let me know that with some further tweaking you can get all the way down to < 1 wake/s. I'll have to try that sometime to see how much the power consumption changes! :)

Update 20/04/11
The exact command is "echo 512 | sudo tee /proc/asound/card0/pcm0*p/sub0/prealloc" in my case. That makes the final difference in combination with paplay or a patched version of pulse that uses the maximum latency by default for all clients. While this breaks some clients' buffer management and can crash them, it works quite well with others, as you can see below. Inside src/pulse/stream.c, currently at line 139 in git, search for

then recompile and install. But make a backup of your previous pulse version, because many clients won't be ready for this yet! Of course, you may also get some power savings with better compatibility by just increasing the 250 to a higher value, e.g. 2000 (2 seconds).
Oh, and of course you need ALSA 1.0.24 (libs) and a current pulseaudio version, possibly all self-compiled (first alsa-lib, then pulse).
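
For anyone who wants to script the prealloc step, here is a minimal sketch. The function name set_prealloc and its arguments are made up for illustration and are not part of ALSA. It assumes the same /proc/asound layout as in the command above, but uses the wider glob pcm*p so that it touches every playback device of the chosen card rather than only pcm0. The optional third argument exists only so the function can be tried against a scratch directory first.

```shell
#!/bin/sh
# Sketch only: write a prealloc size (in KiB) to every playback PCM of one card.
# set_prealloc is a hypothetical helper name, not an ALSA tool.
#   $1 = size in KiB (e.g. 512)
#   $2 = card number (default 0)
#   $3 = ALSA proc root (default /proc/asound; override for dry runs)
set_prealloc() {
    size_kb=$1
    card=${2:-0}
    root=${3:-/proc/asound}
    status=1
    # pcm*p matches playback devices; capture devices end in "c".
    for f in "$root"/card"$card"/pcm*p/sub0/prealloc; do
        [ -e "$f" ] || continue          # the glob matched nothing
        echo "$size_kb" > "$f" && status=0
    done
    # 0 if at least one file was written, 1 otherwise
    return $status
}

# Example, from a root shell: set_prealloc 512 0
```

The write needs root, and in my understanding the kernel refuses it while the device is open, so it is probably best done before pulseaudio grabs the card.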

The power reduction (at first not very noticeable at just 0.1 W) is significant at 0.4 W, though it will be less with video playing or other activities on the side (TODO: test in combination with vaapi acceleration). And the improvement is very measurable. The audio device interrupts (100 per second) completely disappeared thanks to pulse. The total wakes went down from 150 to 19 (-88%, not even twice the idle value), only 0.8 of which seem to be directly related to pulse, and C6 residency increased from 7 ms to over 71 ms. Quite a success! Congrats to Pierre-Louis for writing the necessary patches and to Lennart for writing pulseaudio, which made this possible in the first place.
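
If you want to redo the arithmetic with your own powertop readings, something like the following should do. The helper names pct_change and ratio are made up for this sketch. Note that with the rounded figures quoted above it gives about -87%; the -88% in the text presumably comes from the unrounded readings, which is an assumption on my part.

```shell
# Sketch only: pct_change and ratio are hypothetical helper names.
# pct_change OLD NEW prints the relative change in percent, rounded.
pct_change() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.0f\n", (new - old) / old * 100 }'
}

# ratio OLD NEW prints how many times larger NEW is than OLD.
ratio() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", new / old }'
}

pct_change 150 19   # total wakes per second, prints -87
ratio 7 71          # C6 residency in ms, prints 10.1
```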

Update
I've uploaded a pre-packaged .deb version of a long-latency patched pulseaudio (x86, Ubuntu 9.10+); you can try it on newer Ubuntu versions as well. (If you want to compile from source, e.g. because the package didn't work, see above for the line you have to change; it's not worth creating a patch for. You also need kernel 2.6.38+.) But it's strictly for testing! It's a randomly timed development source build that may eat your machine alive! ;)


Actually the second link describes quite clearly how the latencies are affected. It depends on the audio client to pulse, but even low latency response and low power sound card delivery are not totally exclusive - the client can discard the buffer while it's in queue to reduce response times.