This section of the archives stores flipcode's complete Developer Toolbox collection,
featuring a variety of mini-articles and source code contributions from our readers.

Huffman Compression
Submitted by

This is a little Huffman compressor which I made recently. I was curious about how
compression algorithms work, so I tried to implement one, and the following code
is the result. To my surprise, the algorithm that builds the Huffman tree is only
a few lines of understandable code.
Note that the routines that compress a file are meant more as an example
of how to use the Huffman tree, and are definitely not meant to be an illustration
of brilliant software design.
Note also that this implementation is not optimized and has a few nasty hacks, and
it does not beat the compression ratio of the well-known compressors out there,
but for people who want to take a quick peek at how these compression algorithms work
it might provide a nice start.
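
For readers who want to see the tree-building idea in isolation, here is a minimal sketch. It is not the code from this submission: the names Node, ByFreq, buildTree and collectCodes are made up for illustration, and it assumes a C++11 compiler. It repeatedly merges the two least frequent nodes until a single root remains, then walks the tree to read off the bit-strings.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type for this sketch (not the article's cHuffmanNode).
struct Node {
    std::uint8_t symbol;               // only meaningful for leaves
    std::uint32_t freq;                // summed frequency of the subtree
    std::unique_ptr<Node> left, right;
    bool isLeaf() const { return !left && !right; }
};

// Comparator that keeps the node with the LOWEST frequency on top of the queue.
struct ByFreq {
    bool operator()(const Node* a, const Node* b) const { return a->freq > b->freq; }
};

// Build the tree by repeatedly merging the two least frequent nodes.
std::unique_ptr<Node> buildTree(
        const std::vector<std::pair<std::uint8_t, std::uint32_t>>& table) {
    assert(!table.empty());
    std::priority_queue<Node*, std::vector<Node*>, ByFreq> pq;
    for (const auto& entry : table)
        pq.push(new Node{entry.first, entry.second, nullptr, nullptr});
    while (pq.size() > 1) {
        Node* a = pq.top(); pq.pop();
        Node* b = pq.top(); pq.pop();
        // The new branch takes ownership of both children.
        pq.push(new Node{0, a->freq + b->freq,
                         std::unique_ptr<Node>(a), std::unique_ptr<Node>(b)});
    }
    return std::unique_ptr<Node>(pq.top());
}

// Walk the tree: a step to the left appends '0', a step to the right appends '1'.
void collectCodes(const Node* node, const std::string& prefix,
                  std::map<std::uint8_t, std::string>& codes) {
    if (!node) return;
    if (node->isLeaf()) {
        // A tree with a single symbol still needs a one-bit code.
        codes[node->symbol] = prefix.empty() ? "0" : prefix;
        return;
    }
    collectCodes(node->left.get(), prefix + '0', codes);
    collectCodes(node->right.get(), prefix + '1', codes);
}
```

With the frequencies a=5, b=2, c=1, this sketch gives 'a' a one-bit code and 'b' and 'c' two-bit codes, which is the whole point: frequent symbols get short bit-strings.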

//-------------------------------------------------------------------------------------------------
class cHuffmanTree
{
// Note that it's up to the client library to store the tree
// so it can be used later for decompression. Also note that
// the tree can be rebuilt from the same frequency table, so you
// can store either the tree or the frequency table!
private:
cHuffmanNode* root;
std::vector<pair<U8,U32> > ft;

//-------------------------------------------------------------------------------------------------
bool cHuffmanTree::buildTreeDo (const std::vector<pair<U8, U32> > frequency_table)
{
// If there was already a tree, clean it up and build the new one
if (root) { freeTree (); }

U32 cHuffmanTree::getTreeR (std::string& theTree, cHuffmanNode* node)
{
// This function stores the reconstruction data for our tree.
// Note that the "1" and "0" do NOT represent Huffman codes;
// they tell whether a node is a branch or a leaf.
// Note that we don't store frequency data for reconstruction!
if (!node) return 0;

char* cHuffmanTree::getCharacterPath (cHuffmanNode* n, int c)
{
    // Get the path through the tree to find the bit-string
    // we need to encode / decode a character. Note that
    // we are using a static local buffer, i.e. NOT an example of
    // good programming!
    static char string[256];
    strcpy(string, "");
    if (gcp(string, n, c))
        return string;
    // Not found: hand back an empty string. We reuse the buffer because
    // a string literal cannot be returned as a non-const char*.
    string[0] = '\0';
    return string;
}
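
The function above hands out a pointer into a static buffer, which the comment itself flags as bad practice. As a sketch of one possible alternative, the same depth-first search can fill a std::string owned by the caller. Everything here (PathNode, leaf, branch, findPath, characterPath) is hypothetical and stands in for the submission's cHuffmanNode and gcp, whose bodies are not shown in this excerpt; it assumes C++11 and that leaves are the only nodes that carry a symbol.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal node for this sketch; branches use symbol -1.
struct PathNode {
    int symbol;
    std::unique_ptr<PathNode> left, right;
};

// Small helpers to build a tree by hand.
std::unique_ptr<PathNode> leaf(int s) {
    return std::unique_ptr<PathNode>(new PathNode{s, nullptr, nullptr});
}
std::unique_ptr<PathNode> branch(std::unique_ptr<PathNode> l,
                                 std::unique_ptr<PathNode> r) {
    return std::unique_ptr<PathNode>(new PathNode{-1, std::move(l), std::move(r)});
}

// Depth-first search for symbol c. On success, path holds the '0'/'1' steps
// from node down to the matching leaf and the function returns true.
// On failure, path is left exactly as it was on entry.
bool findPath(const PathNode* node, int c, std::string& path) {
    if (!node) return false;
    if (!node->left && !node->right) return node->symbol == c;
    path.push_back('0');                        // try the left subtree
    if (findPath(node->left.get(), c, path)) return true;
    path.back() = '1';                          // then the right subtree
    if (findPath(node->right.get(), c, path)) return true;
    path.pop_back();                            // not below this node: undo the step
    return false;
}

// Returns the bit-string for c, or an empty string if c is not in the tree.
std::string characterPath(const PathNode* root, int c) {
    assert(root != nullptr);
    std::string path;
    return findPath(root, c, path) ? path : std::string();
}
```

Returning the string by value means each call gets its own result, so two lookups can be held at the same time without one overwriting the other.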

bool cHuffmanTree::write_bit (bool bit, U8& theByte, bool reset)
{
// Grow a byte, bit by bit, and then flush it to theByte. reset will
// return the function to its default state. Returns true as
// soon as 8 bits have been written; it is then up to the caller to save
// the value in theByte.
// Note that if the function returns false, theByte does not
// contain any useful value.
static long bit_index = 7;
static U8 byte = 0x00;