Simple and Fast Huffman Coding (translation)
http://www.codeproject.com/cpp/huffman_coding.asp
This article describes the simplest and fastest Huffman coding implementation you are likely to find online. It does not use any external libraries such as STL or other components. It uses only simple C functions such as memset, memmove, qsort, malloc, realloc, and memcpy.
As a result, the code should be easy for anyone to understand and even to modify.
Background
Huffman compression is a lossless compression algorithm, generally used to compress text and program files. It is a variable-length coding algorithm: each symbol (for example, a character in a text file) is replaced with a bit sequence whose length depends on the symbol. Symbols that occur frequently in the file get short bit sequences, and symbols that occur rarely get longer ones.
Encoding
I wrote this code as simple C functions so that it is easy to use anywhere. You can wrap them in a class or call them directly. I also kept the interface simple: the functions take an input buffer and produce an output buffer, unlike other articles, which take input and output files.
    bool CompressHuffman(BYTE *pSrc, int nSrcLen, BYTE *&pDes, int &nDesLen);
    bool DecompressHuffman(BYTE *pSrc, int nSrcLen, BYTE *&pDes, int &nDesLen);
Important
I spent a long time making it (Huffman.cpp) run fast, and I did not use any libraries such as STL or MFC. Compression takes less than 100 ms (on a 1 GHz P3 processor).
The compression code is very simple. First, 511 Huffman nodes are declared and the first 256 are initialized with their ASCII values:
    CHuffmanNode nodes[511];
    for (int nCount = 0; nCount < 256; nCount++)
        nodes[nCount].byAscii = nCount;
Then, count how often each ASCII code occurs in the input buffer:
    for (nCount = 0; nCount < nSrcLen; nCount++)
        nodes[pSrc[nCount]].nFrequency++;
Then, sort the nodes by frequency:
    qsort(nodes, 256, sizeof(CHuffmanNode), frequencyCompare);
Now, construct the Huffman tree to get the bit sequence corresponding to each ASCII code:
    int nNodeCount = GetHuffmanTree(nodes);
Constructing the Huffman tree is very simple: put all the nodes in a queue, then replace the two nodes with the lowest frequencies with a single new node whose frequency is the sum of the two. The new node becomes the parent of the two nodes it replaces. Repeat until only one node is left in the queue (the tree root).
    // parent node
    pNode = &nodes[nParentNode++];
    // pop first child
    pNode->pLeft = PopNode(pNodes, nBackNode--, false);
    // pop second child
    pNode->pRight = PopNode(pNodes, nBackNode--, true);
    // adjust parent of the two popped nodes
    pNode->pLeft->pParent = pNode->pRight->pParent = pNode;
    // adjust parent frequency
    pNode->nFrequency = pNode->pLeft->nFrequency + pNode->pRight->nFrequency;
Here I use a nice trick to avoid any queue component. I know in advance that there are only 256 ASCII codes, yet I allocate 511 nodes (CHuffmanNode nodes[511]): the first 256 hold the ASCII codes, and the last 255 hold the parent nodes of the Huffman tree. While constructing the tree I use only one array of pointers (CHuffmanNode *pNodes[256]), plus two variables that act as the queue indexes (int nParentNode = nNodeCount, nBackNode = nNodeCount - 1).
Then, the last step of the compression is to write the code of each ASCII value to the output buffer:
    int nDesIndex = 0, nCodeLength, dwCode;
    // loop to write codes
    for (nCount = 0; nCount < nSrcLen; nCount++)
    {
        dwCode = nodes[pSrc[nCount]].dwCode;
        nCodeLength = nodes[pSrc[nCount]].nCodeLength;
        while (nCodeLength--)
        {
            if (dwCode & 1)
                SetBit(pDesPtr, nDesIndex);
            dwCode >>= 1, nDesIndex++;
        }
    }
Note: the compressed buffer must also store the nodes of the Huffman tree and their bit sequences, so that the tree can be rebuilt when decompressing (it is enough to save each ASCII value and its corresponding bit sequence).
Decompression is simpler than compression: rebuild the Huffman tree, then replace each code in the input buffer with its corresponding ASCII value. Keep in mind that the input buffer here is a bit stream containing the code of each ASCII value. So, to replace a code with its ASCII value, we walk the Huffman tree one bit at a time until we reach a leaf node, then append that leaf's ASCII value to the output buffer:
    int nDesIndex = 0;
    while (nDesIndex < nDesLen)
    {
        pNode = pRoot;
        while (pNode->pLeft)
        {
            pNode = GetBit(pSrc, nSrcIndex) ? pNode->pRight : pNode->pLeft;
            nSrcIndex++;
        }
        pDes[nDesIndex++] = pNode->byAscii;
    }
Source files: Huffman.cpp, Huffman.h
Download: http://www.codeproject.com/cpp/huffman_coding/huffman_src.zip