I've got a rather stupid idea: has anybody tried using wavelet/DCT/FFT-based compression to make a "point of reference" for lossless compression?

I mean: you compress the high frequencies using MP3-like (but NOT psychoacoustic) coding, then decode the frame, find the differences from the original uncompressed frame, and compress those differences with some strong compressor, together with that lossy frame.
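A minimal sketch of this idea, assuming nothing about any real codec: the "lossy stage" below is just a crude FFT truncation standing in for an MP3-like transform coder, and the signal is a synthetic toy frame.

```python
import zlib

import numpy as np

def lossy_stage(frame, keep=32):
    # Crude stand-in for a transform codec (no psychoacoustics):
    # keep only the first `keep` FFT coefficients and reconstruct.
    spec = np.fft.rfft(frame)
    spec[keep:] = 0
    return np.round(np.fft.irfft(spec, n=len(frame))).astype(np.int16)

rng = np.random.default_rng(0)
t = np.arange(4096)
# toy "audio" frame: two sines plus a little noise, inside 16-bit range
frame = (8000 * np.sin(0.01 * t) + 2000 * np.sin(0.5 * t)
         + rng.normal(0, 30, t.size)).astype(np.int16)

approx = lossy_stage(frame)
residual = frame - approx        # what the lossless layer must store

# lossy frame + residual reconstruct the input exactly
assert np.array_equal(approx + residual, frame)
print(len(zlib.compress(frame.tobytes(), 9)),
      len(zlib.compress(residual.tobytes(), 9)))
```

The residual has much lower amplitude than the input, which is exactly what the "strong compression" stage would try to exploit.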

I think in any case such compression would be much more effective than LA, Monkey's Audio or OptimFROG.

No gain? It's impossible! For wave data, a frequency representation is much more efficient than byte-by-byte coding. They probably used a psychoacoustic-based approach in their hybrid compression, but it should be oriented not toward the human ear but toward compression effectiveness! For example: the low and mid-range frequencies can be compressed well enough with byte-by-byte methods, so use DCT or wavelets mostly for the high frequencies.

It must give some gain anyway! I compared files compressed with MP3 at 128 kbps against the originals using WaveLab's comparison function: the differences are rather subtle (at least for 16-bit data).

And what if we used a multi-stage approach: the differences are also compressed using wavelets, then unpacked and compared with the original differences; those second-level differences are then compressed with the Huffman method (with delta compression) and/or maybe some dictionary-based method, and so on.

It is because psychoacoustic-based lossy compression has been used! It should be optimized to simplify compression of the difference data! It shouldn't be psychoacoustic; it's "compressionacoustic".

The major idea of good compression is to predict what the next byte will be and use that prediction as a point of reference, storing only the data which corrects the predicted value. I think "bad quality" audio can be used as such a point of reference very efficiently.
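This prediction-and-correction idea can be shown with the simplest possible predictor. This is only a toy sketch (real codecs such as FLAC use higher-order linear prediction): predicting "the next sample equals the previous one" and storing only corrections already makes the data far more compressible.

```python
import zlib

import numpy as np

# toy 16-bit "audio": a slowly varying waveform
t = np.arange(8192)
signal = (6000 * np.sin(0.02 * t)).astype(np.int16)

# predictor: "next sample = previous sample"; store the first sample
# raw, then only the correction to each prediction
first = signal[0]
correction = np.diff(signal)

# the decoder rebuilds the signal from the corrections alone
recon = np.concatenate(
    ([first], first + np.cumsum(correction, dtype=np.int64))
).astype(np.int16)
assert np.array_equal(recon, signal)

plain = len(zlib.compress(signal.tobytes(), 9))
predicted = len(zlib.compress(correction.tobytes(), 9))
assert predicted < plain     # the correction stream compresses far better
print(plain, predicted)
```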

Anyway, audio compression is not my profession (I'm a desktop programmer, mostly image processing and file utilities), so I hope someone who has written a lossless codec will experiment a bit with this approach.

Interesting: does any lossless codec use dictionary-based compression (i.e. like WinRAR or WinAce)?

That's mostly what I was meditating on. Though clearly my thought was to research "compressionacoustics". After all, all sound is a bunch of sines, so it may really be possible to find out which frequencies give minimal differences when extracted, re-created and compared with the original wave, at least in most cases.

There is a devil called "entropy of the source": whatever you do with the source, it is mathematically impossible to losslessly code a signal in such a way that it requires fewer bits than its own entropy. Discussing this is more like inventing a perpetuum mobile or trisecting the angle; better to forget about it.

A hybrid lossless codec, implemented perfectly, will be a percent or a few percent less efficient than a plain lossless one, due to signalling overhead.

Lossless image coders are moving toward Bayesian techniques with multiple predictors, using the history of prediction errors to determine the "weighting" of each predictor. The simplest way is to take a straight weighted average of the predictions, or to choose the prediction with the highest weight, and then use traditional entropy coding assuming a Laplacian or generalized Gaussian distribution... but in theory you could compute a compound probability distribution for the actual coefficient and use that for coding, instead of using differential coding.

If I (a complete layman, I should add) wanted to improve lossless audio coding, I would go the same way: use multiple LMS-trained IIR predictors with different adaptation rates, maybe plus a library of pretrained predictors, and use each one's history of prediction errors to weight it.
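A toy sketch of that scheme, with several simplifications I should flag: FIR instead of IIR predictors, normalized LMS (for step-size stability), and a crude inverse-error weighting over each predictor's accumulated past errors. None of this is taken from any real codec.

```python
import numpy as np

def nlms_predict(sig, order=8, mu=0.5):
    """One adaptive FIR predictor, trained sample-by-sample with
    normalized LMS (normalization keeps the step size stable)."""
    w = np.zeros(order)
    pred = np.zeros(len(sig))
    for n in range(order, len(sig)):
        x = sig[n - order:n][::-1]      # the `order` most recent samples
        pred[n] = w @ x
        err = sig[n] - pred[n]
        w += mu * err * x / (x @ x + 1e-9)
    return pred

rng = np.random.default_rng(1)
t = np.arange(20000)
sig = 5000 * np.sin(0.03 * t) + rng.normal(0, 20, t.size)

# two predictors with different adaptation rates
p_fast = nlms_predict(sig, mu=0.5)
p_slow = nlms_predict(sig, mu=0.05)

# weight each predictor by the inverse of its accumulated *past* error,
# so a decoder could recompute the same weights without side information
e_fast = np.cumsum((sig - p_fast) ** 2) + 1e-9
e_slow = np.cumsum((sig - p_slow) ** 2) + 1e-9
w_fast = (1 / e_fast) / (1 / e_fast + 1 / e_slow)

combined = np.zeros_like(sig)
combined[1:] = w_fast[:-1] * p_fast[1:] + (1 - w_fast[:-1]) * p_slow[1:]
resid = sig - combined      # this residual is what would be entropy-coded
```

Because the weights depend only on already-decoded samples, nothing extra needs to be transmitted for the mixing step.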

This process generally comes out up to 20% larger than a plain FLAC of the original file. Some samples do REALLY badly (like BigYellow from the current 128 kbit test; the compressed version is bigger than the original) and I haven't yet found a sample that does better than straight FLAC with this process. Interestingly, compression rates vary wildly with the individual sample and the LAME settings (some samples get best compression with LAME --preset insane, others with --preset 64). While this is hardly a scientifically interesting test, it was relatively interesting to write and test.

Ivan Dimkovic is right about information entropy. Lossless audio encoders are a very interesting case here, as they can losslessly compress a sound file into a smaller number of bits than its information entropy predicts. While this sounds impossible, it's not, as some of the information contained in the waveform is actually contained in the decoding algorithm. Thus lossless audio compressors will do very badly with the vast majority of waveforms, but very well with ones that represent sensical audio data.

Mmm, I don't think you can compress below the entropy. Or maybe I don't know how you compute it. But given the nature of sound, you can predict that the next sample will be somewhat like the previous one; it's this difference that needs to be encoded... Of course, with pure electronic music things get harder, since the signal doesn't necessarily have this "sound" nature...

The problem with this approach (compressing the difference) is that it's like encoding pure noise. All the "sound" nature of the signal is removed and nothing remains but noise... But as lossy codecs get closer and closer to the source at a given bitrate, this noise can be compressed into fewer bits (imagine if the difference is never bigger than 8 bits in a 16-bit source: you already get 50% compression for free). So IMO this should be investigated a bit further. Even with MP3, choosing 320 kbps rather than 64 kbps gives the lossy codec more bitrate to work with, leaving a smaller difference. That's why I think HE-AAC could be a good candidate in this case: the result is not so far from the original, compared to the bitrate.
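That "free 50%" observation can be checked in a couple of lines. This is purely synthetic: it simulates a hypothetical lossy codec that always lands within ±100 of each 16-bit sample.

```python
import numpy as np

rng = np.random.default_rng(2)
original = rng.integers(-30000, 30000, 1000).astype(np.int16)
# pretend some lossy codec lands within +/-100 of every sample
lossy = (original + rng.integers(-100, 101, original.size)).astype(np.int16)

residual = original.astype(np.int32) - lossy.astype(np.int32)
assert np.abs(residual).max() <= 100    # fits comfortably in 8 signed bits
packed = residual.astype(np.int8)       # 1 byte per sample instead of 2

print(len(packed.tobytes()) / len(original.tobytes()))   # 0.5
assert np.array_equal(lossy.astype(np.int32) + packed, original)
```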

Anyone willing to do a test with MP3 and/or Vorbis + FLAC and/or ZIP at different bitrates?

Or is there any command-line tool to compute the difference of two WAV files? (Maybe I should code one...)
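I'm not aware of a standard tool for this, but a minimal sketch of such a utility (hypothetical name diffwav.py, 16-bit PCM only) could look like:

```python
#!/usr/bin/env python3
"""diffwav.py -- write the sample-by-sample difference of two 16-bit PCM
WAV files (a hypothetical utility sketched for this thread)."""
import sys
import wave

import numpy as np

def read_wav(path):
    with wave.open(path, 'rb') as w:
        assert w.getsampwidth() == 2, "16-bit PCM only"
        params = w.getparams()
        data = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)
    return params, data

def main(a_path, b_path, out_path):
    pa, a = read_wav(a_path)
    _, b = read_wav(b_path)
    n = min(len(a), len(b))                  # tolerate length mismatch
    diff = a[:n].astype(np.int32) - b[:n].astype(np.int32)
    diff = np.clip(diff, -32768, 32767).astype(np.int16)  # clip, don't wrap
    with wave.open(out_path, 'wb') as w:
        w.setparams(pa._replace(nframes=n))
        w.writeframes(diff.tobytes())

if __name__ == '__main__' and len(sys.argv) >= 4:
    main(*sys.argv[1:4])
```

Clipping is a design choice here: a true residual coder would widen the sample format instead, since the difference of two 16-bit signals needs 17 bits in the worst case.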

QUOTE

Lossless audio encoders are a very interesting case here, as they can losslessly compress a sound file into a smaller number of bits than its information entropy predicts.

Wrong! The reason why you can usually compress a sound file is that PCM coding uses more bits than the actual entropy of the data would require.

Therefore, when you code the data in a more clever way (i.e. remove redundant data), you approach that theoretical limit (which is a hard limit, determined by the real entropy value).

QUOTE

While this sounds impossible, it's not, as some of the information contained in the waveform is actually contained in the decoding algorithm.

Wrong... this is a misconception. Even with a 10 GB algorithm which compresses a 1 MB sound file, the worst-case compression (with a clever algorithm) would be 1 MB + 1 bit, and the best-case compression (with the same algorithm) *could* be 1 bit.

A 1-bit compressed file contains just the information: "YES, it is the exact data which is known to the algorithm". Unfortunately this can only be done with one set of input data, and it will expand all other possibilities by 1 bit (the first bit, which will say "NO, this is something else").

When you take more space for the algorithm you can make it more clever, but still, all data-dependent information will be in the compressed file. Otherwise you cannot unpack the files.
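This argument is easy to make concrete. A toy version, using a whole flag byte instead of a single bit for simplicity:

```python
# the one input this "algorithm" was built around
KNOWN = b"the exact data which is known to the algorithm"

def compress(data: bytes) -> bytes:
    # "YES" flag alone for the known input; "NO" flag + raw data otherwise
    return b"\x01" if data == KNOWN else b"\x00" + data

def decompress(blob: bytes) -> bytes:
    return KNOWN if blob == b"\x01" else blob[1:]

assert decompress(compress(KNOWN)) == KNOWN
assert len(compress(KNOWN)) == 1                 # the lucky input shrinks
junk = b"any other data"
assert decompress(compress(junk)) == junk
assert len(compress(junk)) == len(junk) + 1      # everything else grows
```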

QUOTE

Thus lossless audio compressors will do very badly with the vast majority of waveforms, but very well with ones that represent sensical audio data.

Yes, because:
- sensical audio data is redundant (i.e. it contains correlations), and
- the algorithm is made to take advantage of these correlations.

Unfortunately, you cannot make a compression algorithm which is good on any (i.e. random) data, so you will expand random data by at least 1 bit, no matter how complex and clever the algorithm is.

Edit: By the way, if you could make an algorithm which reduces any data by just 1 bit, you could also use it several times, and therefore compress anything down to zero bits.
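The underlying counting (pigeonhole) argument, checked numerically for 8-bit inputs:

```python
# There are 2**n distinct n-bit files, but only 2**n - 1 files that are
# strictly shorter than n bits, so no lossless scheme can map every
# n-bit input to a shorter output without two inputs colliding.
n = 8
n_bit_files = 2 ** n
shorter_files = sum(2 ** k for k in range(n))   # lengths 0 .. n-1
print(n_bit_files, shorter_files)               # 256 255
```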

QUOTE

By the way, if you could make an algorithm which reduces any data by just 1 bit, you could also use it several times, and therefore compress anything down to zero bits.

Which would be very cool. I am really looking forward to the new LAME --preset onebit.

QUOTE

Wrong! The reason why you can usually compress a sound file is that PCM coding uses more bits than the actual entropy of the data would require.

I didn't know that and find it very interesting. Do you have any references? I am not doubting you, I would just like to know more.

QUOTE

A 1-bit compressed file contains just the information: "YES, it is the exact data which is known to the algorithm". Unfortunately this can only be done with one set of input data, and it will expand all other possibilities by 1 bit (the first bit, which will say "NO, this is something else"). When you take more space for the algorithm you can make it more clever, but still, all data-dependent information will be in the compressed file. Otherwise you cannot unpack the files.

I agree completely. However, your 1-bit file contains, obviously, precisely one bit of information (in the Shannon sense). However, the original contained more, which shows that the compressor decreased the information entropy of the original file. This information can't, by definition, disappear: it's stored in the algorithm itself. There is as much (or more) information in (Algorithm + Compressed File) as in (Original File). If there weren't, then there couldn't be a 1->1 mapping between compressed files and uncompressed files.

Suppose I copy all my MP3s (and OGGs, MPCs, etc.) into a big database indexed by their MD5 sum. Then I replace all my MP3 files with 128-bit binary files containing this sum. I have "compressed" my MP3 collection to a couple of hundred kilobytes. However, that doesn't mean they would be any good without the database (part of the algorithm), which runs to tens of gigabytes.
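As a sketch (MD5 via Python's hashlib; note that MD5 collisions are simply ignored here, which is one more reason this isn't real compression):

```python
import hashlib

database = {}   # the "algorithm" -- this is where the information lives

def store(data: bytes) -> bytes:
    key = hashlib.md5(data).digest()   # a 128-bit "compressed file"
    database[key] = data
    return key

def restore(key: bytes) -> bytes:
    return database[key]

song = b"pretend this is a whole MP3 file" * 100000
key = store(song)
assert len(key) == 16          # always 128 bits, whatever the input size
assert restore(key) == song    # but only while the database exists
```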

QUOTE

The result is not so far from the original, compared to the bitrate.

The result isn't very far, perceptually, from the original, but that doesn't mean there isn't still a huge amount of data in the difference file. Some of the frequencies in the original would not be reduced substantially in amplitude if the encoded file were subtracted from the original.

QUOTE

The result isn't very far, perceptually, from the original, but that doesn't mean there isn't still a huge amount of data in the difference file. Some of the frequencies in the original would not be reduced substantially in amplitude if the encoded file were subtracted from the original.

I know, but IMO it would be nice to run a few tests to make sure this is a completely wrong way.

QUOTE

I agree completely. However, your 1-bit file contains, obviously, precisely one bit of information (in the Shannon sense). However, the original contained more, which shows that the compressor decreased the information entropy of the original file. This information can't, by definition, disappear: it's stored in the algorithm itself. There is as much (or more) information in (Algorithm + Compressed File) as in (Original File). If there weren't, then there couldn't be a 1->1 mapping between compressed files and uncompressed files.

I'm happy to see that someone finally brought up the (Algorithm+Compressed File) problem. There is a theory of compression that goes something like this:

QUOTE

It is impossible to make a compression algorithm that can compress random data where the resulting compressed data plus the size of the algorithm is smaller than the original data.

Given this, how is anything compressed losslessly? Well, most data isn't very random. Images and sounds have distinct patterns that allow them to be compressed by programs that depend on those patterns. If you try to compress data that does not contain those patterns, your resulting file plus the size of the algorithm will always be larger than the original file.

Now, how does that apply to the poster's question? Well, if you compress data (audio data in this case) in a lossy manner, then take the difference of the lossy data from the original, the resulting file will be lacking those patterns that are needed for good compression. In fact, the data contained in the diff will be nearly random. So: bad compression.
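A quick demonstration of that with a general-purpose compressor (zlib), using random bytes as a stand-in for a noise-like diff:

```python
import os
import zlib

patterned = b"drum loop " * 2000     # redundant, like structured audio
noise = os.urandom(len(patterned))   # stands in for a near-random diff

print(len(zlib.compress(patterned, 9)))   # tiny
print(len(zlib.compress(noise, 9)))       # about the input size, or larger
```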

Found it. Here is a Wikipedia reference to the challenge. And here is the quote from the FAQ:

QUOTE

Steve Tate <srt@cs.unt.edu> suggests a good challenge for programs that are claimed to compress any data by a significant amount:

Here's a wager for you: First, send me the DEcompression algorithm. Then I will send you a file of whatever size you want, but at least 100k. If you can send me back a compressed version that is even 20% shorter (80k if the input is 100k) I'll send you $100. Of course, the file must be able to be decompressed with the program you previously sent me, and must match exactly my original file. Now what are you going to provide when... er... if you can't demonstrate your compression in such a way?

So far no one has accepted this challenge (for good reasons).

Mike Goldman <whig@by.net> makes another offer:

I will attach a prize of $5,000 to anyone who successfully meets this challenge. First, the contestant will tell me HOW LONG of a data file to generate. Second, I will generate the data file, and send it to the contestant. Last, the contestant will send me a decompressor and a compressed file, which will together total in size less than the original data file, and which will be able to restore the compressed file to the original state.

With this offer, you can tune your algorithm to my data. You tell me the parameters of size in advance. All I get to do is arrange the bits within my file according to the dictates of my whim. As a processing fee, I will require an advance deposit of $100 from any contestant. This deposit is 100% refundable if you meet the challenge.

Now, I am not saying that what Efenstor says is impossible, just not with the tools available. You would need an algorithm that produced a file very close to the original. This means that any codec using a "psychoacoustic-based approach" would need to be skipped, since the data removed by that approach would create more "random" data for the diff. With an accurate enough file, you could create a diff that was mostly zeros. Then, to compress that, you wouldn't want to use a lossless audio codec, as those are tuned to audio needs. You would need an algorithm tuned to this kind of mostly-zero diff data.

This MIGHT be an efficient way to do it, but testing would be needed to see for sure, and that is a lot of work to really do it right.
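For what it's worth, an algorithm "tuned to mostly-zero diff data" can be sketched in a few lines; this toy run-length coder treats only zero runs specially and passes every other byte through untouched:

```python
def pack_zeros(data: bytes) -> bytes:
    # a zero byte is emitted as the pair (0x00, run_length);
    # nonzero bytes pass through as-is, so decoding is unambiguous
    out, i = bytearray(), 0
    while i < len(data):
        if data[i] == 0:
            run = 1
            while i + run < len(data) and data[i + run] == 0 and run < 255:
                run += 1
            out += bytes([0, run])
            i += run
        else:
            out.append(data[i])
            i += 1
    return bytes(out)

def unpack_zeros(blob: bytes) -> bytes:
    out, i = bytearray(), 0
    while i < len(blob):
        if blob[i] == 0:
            out += bytes(blob[i + 1])   # bytes(n) is n zero bytes
            i += 2
        else:
            out.append(blob[i])
            i += 1
    return bytes(out)

diff = bytes([0] * 500 + [3] + [0] * 200 + [255] + [0] * 300)
packed = pack_zeros(diff)
assert unpack_zeros(packed) == diff
print(len(diff), len(packed))           # 1002 12
```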

QUOTE

I agree completely. However, your 1-bit file contains, obviously, precisely one bit of information (in the Shannon sense). However, the original contained more, which shows that the compressor decreased the information entropy of the original file. This information can't, by definition, disappear: it's stored in the algorithm itself. There is as much (or more) information in (Algorithm + Compressed File) as in (Original File). If there weren't, then there couldn't be a 1->1 mapping between compressed files and uncompressed files.

Suppose I copy all my MP3s (and OGGs, MPCs, etc.) into a big database indexed by their MD5 sum. Then I replace all my MP3 files with 128-bit binary files containing this sum. I have "compressed" my MP3 collection to a couple of hundred kilobytes. However, that doesn't mean they would be any good without the database (part of the algorithm), which runs to tens of gigabytes.

You're right: when we consider specific files (i.e. 1000 predefined MP3s), the size of the algorithm should be taken into account.

In the more general sense, i.e. when you need to compress any given MP3, you could make a database of all possible MP3s and then, during compression, just store the index of the received file.

Unfortunately, if you really have all possible MP3s in the database, then the index size will be equal to the file size; thus an MD5 sum will not be enough to differentiate them.

About the entropy: in fact, the entropy of a file whose contents can be completely predicted is zero.
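That limiting case is easy to see with a zeroth-order entropy estimate (this only counts byte frequencies, so it's an upper-bound illustration, not the true source entropy):

```python
import math
from collections import Counter

def entropy_bits_per_symbol(data: bytes) -> float:
    # zeroth-order Shannon entropy of a byte stream, in bits per byte
    counts = Counter(data)
    n = len(data)
    return sum(c / n * math.log2(n / c) for c in counts.values())

print(entropy_bits_per_symbol(b"\x00" * 1000))         # 0.0  fully predictable
print(entropy_bits_per_symbol(bytes(range(256)) * 4))  # 8.0  uniform bytes
```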

Note: a possibility to compress an MP3 further would be to build a table of all possible MP3s of a given length (which should produce a fairly big table). OK, now there's a 1:1 relationship between indices and elements. Then eliminate all awful-sounding possibilities from the table. Now you gain a few bits on the indices, which means further lossless compression of MP3s.

This would mean, obviously, exponentially large memory and processing requirements. In fact, it would already become infeasible with a 16-byte MP3, which doesn't even cover the full header.