With my recent purchase of a 9000-series nVidia graphics card, I started thinking: has anyone investigated whether nVidia's CUDA could be useful for lossless compression? I'm not even remotely close to being a programmer, so I haven't a clue how the code works, but it seems like CUDA could be valuable for encoding/decoding. I know nVidia is already holding a contest to speed up LAME (which ends in about two weeks), so perhaps CUDA could be used to speed up lossless compressors as well. The fastest modes of several codecs are already blazing fast, approaching the limits of hard drives, but perhaps the high-compression modes could be sped up through CUDA. Maybe, if the speed-up is big enough, developers could even add more ways to gain compression while still maintaining good encoding rates. It would be pretty cool if compression levels like La's best could be done at 50x realtime or something.

If I'm not mistaken, lossless coding usually employs dictionary methods (like LZW/LZMA), which generate a lot of random memory accesses and branching operations.

Not at all!

Most lossless audio compressors use large predictive LPC filters. That operation would be well suited to a GPU, if it weren't for one small detail: because the codec must be LOSSLESS, the operations are usually integer, not floating point. It would be possible to do it in floating point too, but then the operations, rounding, and precision would all need to be PRECISELY defined. That is exactly what GPUs don't have.

Despite all the hype, there aren't that many things GPUs are actually good at.

Less impressive than I hoped, but this is only the initial version, and GPUs grow faster every day. On my GTS 250 it's approximately as fast as my C# encoder (which is fast, by the way). FlaCuda -4 achieves the same compression ratio as reference flac -8 (version 1.2.1 on a Core 2 Duo @ 3 GHz) at approximately double to triple the speed. FlaCuda -8 is as slow as flac -8, but gives an extra 0.5% of compression ratio. It would be nice if someone could thoroughly compare them on different hardware and post his/her results here.

I'm anxious to see how this would perform on the next generation of NVIDIA hardware (GT300), which is supposedly significantly faster in general computational performance than the previous architecture (G200).

What was the file size for flac -6? We should compare speed at the same compression ratio (i.e. the same output file size), not at the same compression level, because -6 for flac is much lower compression than -6 for FlaCuda. Please try comparing FlaCuda -5 vs flac -8, and compare both execution times and file sizes.

That file is a bit too small for a comparison. And it's better to compare against flac -8: the default flac compression level is very fast, and I don't think FlaCuda can beat it, at least not yet. FlaCuda is focused on higher compression.

I'm not a developer, so I don't know if it's possible, but: what about a liboil-like library for GPGPU encoding, so that *any* codec could benefit from GPU computation?

Not sure. The code I wrote is quite codec-specific. The catch is the relatively slow connection between CPU and GPU: I had to implement practically the whole FLAC algorithm on the device so that I wouldn't have to transfer intermediate values between host and GPU, only the final result.

FLAC turned out to be very convenient for the GPU, probably the most convenient. One look at, e.g., the ALAC algorithm was enough to see that it could never get the same benefit.