Compression mechanism

I know that Huffman encoding is a popular technique for file compression, and I know that it works by encoding more frequent characters with shorter bit sequences. The problem is that you can only decode the result if you have the tree. Do you actually have to send the tree along with the data? If so, in what form? Details please.

There are 2 answers

0
Mark Adler On

Yes, you have to send a representation of the code first. The Huffman code is made canonical, so that you can just send the number of bits in the code corresponding to each symbol. Then the canonical code can be reconstructed from the lengths at the other end. You never have to send the tree.
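As a rough illustration of the point above, here is a minimal sketch of rebuilding a canonical code from nothing but the per-symbol code lengths. The function names `canonical_codes` and `decode` are made up for this example, and the codes are kept as strings of "0"/"1" for readability rather than packed bits. The sample lengths are the small example alphabet from the Deflate specification.

```python
def canonical_codes(lengths):
    """Assign canonical Huffman codes from a dict of symbol -> code length.

    Symbols with length 0 are unused and receive no code.
    """
    codes = {}
    code = 0
    prev_len = 0
    # Shorter codes come first; ties are broken by symbol order.
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        if length == 0:
            continue
        code <<= (length - prev_len)  # pad with zeros when the length grows
        codes[sym] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes


def decode(bits, lengths):
    """Decode a bit string using only the code lengths (no tree is sent)."""
    table = {code: sym for sym, code in canonical_codes(lengths).items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in table:  # the code is prefix-free, so the first hit is a symbol
            out.append(table[cur])
            cur = ""
    return "".join(out)


# The sender transmits only these lengths, not the tree.
lengths = {"A": 3, "B": 3, "C": 3, "D": 3, "E": 3, "F": 2, "G": 4, "H": 4}
codes = canonical_codes(lengths)
encoded = "".join(codes[c] for c in "FACE")
print(codes)
print(decode(encoded, lengths))
```

Because both sides apply the same assignment rule, the receiver ends up with exactly the same codes as the sender.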

The lengths can themselves be compressed, which gains further efficiency at the cost of added complexity. See the Deflate specification (RFC 1951) for an example of how Huffman codes are transmitted efficiently.
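To give a feel for how the list of lengths can be shrunk, the following is a simplified sketch in the style of Deflate's run-length scheme for code lengths: symbols 0 to 15 are literal lengths, 16 repeats the previous length 3 to 6 times, 17 repeats zero 3 to 10 times, and 18 repeats zero 11 to 138 times. The function names are hypothetical, and the sketch stores the repeat count directly instead of packing it into extra bits. In real Deflate these run-length symbols are then Huffman coded themselves, which is omitted here.

```python
def rle_code_lengths(lengths):
    """Run-length encode a list of code lengths (0..15), Deflate style.

    Returns (symbol, count) pairs; count is None for literal lengths.
    """
    out = []
    i, n = 0, len(lengths)
    while i < n:
        cur = lengths[i]
        run = 1
        while i + run < n and lengths[i + run] == cur:
            run += 1
        i += run
        if cur == 0:
            while run >= 11:          # long runs of zeros
                take = min(run, 138)
                out.append((18, take))
                run -= take
            if run >= 3:              # short runs of zeros
                out.append((17, run))
                run = 0
            out.extend([(0, None)] * run)
        else:
            out.append((cur, None))   # first occurrence is sent literally
            run -= 1
            while run >= 3:           # then "repeat previous" in chunks of 3..6
                take = min(run, 6)
                out.append((16, take))
                run -= take
            out.extend([(cur, None)] * run)
    return out


def unrle_code_lengths(pairs):
    """Inverse of rle_code_lengths."""
    out = []
    for sym, count in pairs:
        if sym <= 15:
            out.append(sym)
        elif sym == 16:
            out.extend([out[-1]] * count)
        else:                         # 17 or 18: a run of zeros
            out.extend([0] * count)
    return out


sample = [3, 3, 3, 3, 3, 2, 4, 4] + [0] * 20 + [5] * 8
packed = rle_code_lengths(sample)
print(len(sample), "lengths ->", len(packed), "symbols")
```

Alphabets with many unused symbols (long runs of zero lengths) benefit the most from this step.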

2
Thomas Mueller On

How exactly the Huffman tree is transferred depends on the compression format.

  • Static (non-adaptive) Huffman encodes the tree along with the data. The Deflate algorithm encodes only the number of bits per symbol.

  • For Adaptive Huffman, there is no need to explicitly encode the tree, because the encoder and the decoder both rebuild it (or just slightly modify it) from time to time as data is processed. The initial tree is hardcoded.
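The adaptive case can be sketched as follows. This is a deliberately naive illustration, not the incremental FGK or Vitter algorithm: both sides start from the same hardcoded counts and rebuild the whole code after every symbol. The alphabet `ALPHABET` and the function names are assumptions made up for this example, and the decoder is told the symbol count instead of using an end-of-stream marker.

```python
import heapq

ALPHABET = "abcd"  # hypothetical fixed alphabet known to both sides


def build_codes(counts):
    """Build a prefix code from symbol counts with a deterministic Huffman merge."""
    # Heap entries: (weight, unique tie breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text):
    counts = {s: 1 for s in ALPHABET}      # hardcoded initial model
    bits = []
    for ch in text:
        bits.append(build_codes(counts)[ch])
        counts[ch] += 1                    # update only after coding the symbol
    return "".join(bits)


def decode(bits, n_symbols):
    counts = {s: 1 for s in ALPHABET}      # same hardcoded initial model
    out, pos = [], 0
    for _ in range(n_symbols):
        table = {c: s for s, c in build_codes(counts).items()}
        cur = ""
        while cur not in table:            # read bits until a full code is seen
            cur += bits[pos]
            pos += 1
        out.append(table[cur])
        counts[out[-1]] += 1               # mirror the encoder's update
    return "".join(out)


text = "abacabadaaab"
bits = encode(text)
print(bits, decode(bits, len(text)))
```

The decoder stays in sync because it applies exactly the same count updates as the encoder, in the same order, so no tree or length table ever appears in the stream.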