How does compression software work?

How do they cram the data into a smaller file without losing any bytes? ...or do they lose some bytes?

It depends on the algorithm: some are lossless, others are lossy.

As a simple example: suppose you wanted to compress a string of ten 'a's. You could express this as a pair: the number 10, and the character 'a'. This is lossless, since given this pair, you can re-construct the string of ten 'a's.
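The pair idea above is usually called run-length encoding. Here is a minimal sketch of it in Python; the function names `rle_encode` and `rle_decode` are just made up for illustration, and real formats pack the pairs into bytes rather than keeping a list of tuples.

```python
from itertools import groupby

def rle_encode(text):
    """Collapse each run of identical characters into a (count, char) pair."""
    return [(len(list(group)), char) for char, group in groupby(text)]

def rle_decode(pairs):
    """Rebuild the original string from (count, char) pairs."""
    return "".join(char * count for count, char in pairs)

original = "a" * 10
pairs = rle_encode(original)
print(pairs)  # [(10, 'a')]

# Lossless: decoding gives back exactly what we started with.
assert rle_decode(pairs) == original
```

Note that this only helps when the data has long runs. A string like "abcabc" would come out bigger than it went in.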

Others are lossy, discarding information that is of low value, as in a picture. If three pixels have very nearly the same color value, e.g. (0,0,128), (0,1,129), (0,0,127), the human eye would perceive very little of the difference, so to improve compression the algorithm may treat them all as the same value, such as (0,0,128). You lose information, but it's not noticeable, so the decompressed image would look almost exactly like the original.
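One simple way to get that effect is to round each channel to a coarser grid (quantization). This is only a rough sketch using the three pixels from the example; the `quantize` helper and the step size of 4 are assumptions for illustration, and real image formats such as JPEG do something considerably more elaborate.

```python
def quantize(pixel, step=4):
    """Round each channel to the nearest multiple of `step` (capped at 255).

    `step` is an arbitrary illustrative choice: bigger steps throw away
    more detail and make neighbouring pixels more likely to match.
    """
    return tuple(min(255, round(channel / step) * step) for channel in pixel)

pixels = [(0, 0, 128), (0, 1, 129), (0, 0, 127)]
quantized = [quantize(p) for p in pixels]
print(quantized)  # [(0, 0, 128), (0, 0, 128), (0, 0, 128)]

# Lossy: the small differences are gone and cannot be recovered,
# but the three pixels are now identical and compress as one run.
assert len(set(quantized)) == 1
```

After this step the data still has to be compressed by a lossless method; the lossy part just makes that job easier.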

Until you can build a working general purpose reprogrammable computer out of basic components from Radio Shack, you are not fit to call yourself a programmer in my presence. This is cwhizard, signing off.

A general approach is to split the data into units (like bytes or pixels), count how often each one occurs, and then encode the more frequent units with shorter codes while the less frequent units get longer codes. Check out Huffman coding (Huffman tables).
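To make the frequency idea concrete, here is a small sketch of Huffman coding in Python. The names `huffman_codes`, `encode` and `decode` are hypothetical, and the bits are kept as a string of "0"/"1" characters for readability; a real compressor would pack them into bytes and also store the code table.

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Return a dict mapping each symbol in `data` to a bit string."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreaker, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups; their codes grow by one bit.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encode(data, codes):
    return "".join(codes[sym] for sym in data)

def decode(bits, codes):
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # no code is a prefix of another
            out.append(inverse[current])
            current = ""
    return "".join(out)

text = "aaaaaaaabbbc"
codes = huffman_codes(text)
bits = encode(text, codes)
print(len(bits), "bits instead of", 8 * len(text))  # 16 bits instead of 96
assert decode(bits, codes) == text
```

Here 'a' occurs most often and ends up with a one-bit code, while 'b' and 'c' get two bits each.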