Simple and fast Huffman coding (translation)

xiaoxiao2021-03-06  41


http://www.codeproject.com/cpp/huffman_coding.asp

This article describes the simplest and fastest Huffman coding implementation that can be found online. It does not use any external libraries or components such as STL. It uses only simple C functions: memset, memmove, qsort, malloc, realloc, and memcpy.

As a result, the code is easy to understand and even to modify.

Background

Huffman compression is a lossless compression algorithm, generally used to compress text and program files. It belongs to the family of variable-length coding algorithms: each individual symbol (e.g., a character in a text file) is replaced with a bit sequence of a symbol-specific length. Symbols that occur frequently in the file get short bit sequences, and symbols that rarely occur get longer ones.
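To make the saving concrete, here is a minimal sketch that counts how many bits a short text needs under a small variable-length code. The three-symbol code table ('a' -> 0, 'b' -> 10, 'c' -> 11) and the function names are assumptions chosen for illustration; they are not taken from the article's source.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical prefix code for a three-symbol alphabet:
//   'a' -> 0, 'b' -> 10, 'c' -> 11
// Returns the code length in bits for one symbol (0 if it has no code).
static int CodeLength(unsigned char symbol)
{
    switch (symbol)
    {
    case 'a': return 1;   // most frequent symbol, shortest code
    case 'b': return 2;
    case 'c': return 2;
    default:  return 0;
    }
}

// Total number of bits needed to encode the text with the table above.
static int EncodedBits(const char *text)
{
    assert(text != NULL);
    int bits = 0;
    size_t length = strlen(text);
    for (size_t i = 0; i < length; i++)
        bits += CodeLength((unsigned char)text[i]);
    return bits;
}
```

With this table, the text "aaaabbc" needs 10 bits, compared with 56 bits at a fixed 8 bits per character.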

Encoding

I wrote this code as simple C functions so that it is easy to use anywhere. You can wrap them in a class or call them directly. I also use a simple interface: the functions take input and output buffers, rather than input and output files as in other articles.

bool CompressHuffman(BYTE *pSrc, int nSrcLen, BYTE *&pDes, int &nDesLen);
bool DecompressHuffman(BYTE *pSrc, int nSrcLen, BYTE *&pDes, int &nDesLen);

Important

I spent a long time making Huffman.cpp run fast, while still avoiding libraries such as STL or MFC. It compresses in less than 100 ms (P3 processor, 1 GHz).

The compression code is very simple. First, 511 Huffman nodes are declared, and the first 256 are initialized with their ASCII values:

CHuffmanNode nodes[511];
for (int nCount = 0; nCount < 256; nCount++)
    nodes[nCount].byAscii = nCount;

Then, calculate the frequency of each ASCII code in the input buffer data:

for (nCount = 0; nCount < nSrcLen; nCount++)
    nodes[pSrc[nCount]].nFrequency++;
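The counting step can be sketched on its own as follows. To keep the example self-contained it uses a plain array of 256 counters instead of the article's CHuffmanNode array; the function name CountFrequencies is an assumption.

```cpp
#include <cassert>
#include <cstring>

typedef unsigned char BYTE;

// Count how often each byte value occurs in pSrc[0 .. nSrcLen-1].
// anFrequency must have room for 256 counters; it is cleared first.
static void CountFrequencies(const BYTE *pSrc, int nSrcLen, int anFrequency[256])
{
    assert(pSrc != NULL && nSrcLen >= 0);
    memset(anFrequency, 0, 256 * sizeof(int));
    for (int nCount = 0; nCount < nSrcLen; nCount++)
        anFrequency[pSrc[nCount]]++;   // the byte value itself is the index
}
```

For the 11-byte input "abracadabra" this gives 5 for 'a', 2 each for 'b' and 'r', and 1 each for 'c' and 'd'.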

Then, sort according to the frequency:

qsort(nodes, 256, sizeof(CHuffmanNode), frequencyCompare);
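The article does not show the comparison callback, so the following is only a sketch of what frequencyCompare might look like. It assumes a descending sort (rarest symbols at the back of the array), which is consistent with the tree-building code popping nodes from the back. The reduced CHuffmanNode struct and the SortByFrequency wrapper are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdlib>

// Minimal stand-in for the article's CHuffmanNode (only the fields used here).
struct CHuffmanNode
{
    unsigned char byAscii;
    int nFrequency;
};

// qsort callback: orders nodes by descending frequency, so the rarest
// symbols end up at the back of the array.
static int frequencyCompare(const void *elem1, const void *elem2)
{
    const CHuffmanNode *pA = (const CHuffmanNode *)elem1;
    const CHuffmanNode *pB = (const CHuffmanNode *)elem2;
    if (pA->nFrequency == pB->nFrequency)
        return 0;
    return pA->nFrequency < pB->nFrequency ? 1 : -1;
}

// Hypothetical wrapper around the qsort call shown in the article.
static void SortByFrequency(CHuffmanNode *pNodes, int nCount)
{
    assert(pNodes != NULL && nCount >= 0);
    qsort(pNodes, nCount, sizeof(CHuffmanNode), frequencyCompare);
}
```

Comparing with explicit branches rather than subtracting the two frequencies avoids integer overflow for very large counts.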

Now, build the Huffman tree to obtain the bit sequence corresponding to each ASCII code:

int nNodeCount = GetHuffmanTree(nodes);

Building the Huffman tree is very simple: put all the nodes in a queue, then replace the two nodes with the lowest frequencies with a single new node whose frequency is the sum of theirs. The new node becomes the parent of the two nodes it replaces. Repeat until only one node is left in the queue (the root of the tree).

// parent node
pNode = &nodes[nParentNode++];
// pop first child
pNode->pLeft = PopNode(pNodes, nBackNode--, false);
// pop second child
pNode->pRight = PopNode(pNodes, nBackNode--, true);
// adjust parent of the two popped nodes
pNode->pLeft->pParent = pNode->pRight->pParent = pNode;
// adjust parent frequency
pNode->nFrequency = pNode->pLeft->nFrequency + pNode->pRight->nFrequency;

Here I use a neat trick to avoid any queue component. I know in advance that there are only 256 ASCII codes, but I allocate 511 nodes (CHuffmanNode nodes[511]): the first 256 hold the ASCII codes, and the remaining 255 hold the parent nodes of the Huffman tree. While building the tree, I use only an array of pointers (CHuffmanNode *pNodes[256]) and two variables that act as queue indexes (int nParentNode = nNodeCount, nBackNode = nNodeCount - 1).

Then, the last step of compression is to write the code of each ASCII value to the output buffer:
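The fixed-array approach can be sketched end to end as follows. This is not the article's GetHuffmanTree: the BuildTree function, its signature, and the insertion step that keeps the pointer array sorted are assumptions, and the article's PopNode (which also assigns code bits) is replaced by plain array accesses.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

struct CHuffmanNode   // reduced stand-in for the article's node type
{
    unsigned char byAscii;
    int nFrequency;
    CHuffmanNode *pParent, *pLeft, *pRight;
};

// Descending by frequency: the rarest symbols end up at the back.
static int frequencyCompare(const void *elem1, const void *elem2)
{
    int a = ((const CHuffmanNode *)elem1)->nFrequency;
    int b = ((const CHuffmanNode *)elem2)->nFrequency;
    return (a < b) - (a > b);
}

// Builds the tree inside nodes[511] and returns its root
// (NULL if every frequency is zero).
static CHuffmanNode *BuildTree(const int anFrequency[256], CHuffmanNode nodes[511])
{
    assert(anFrequency != NULL && nodes != NULL);
    // Zero fill; assumes all-zero bytes represent null pointers.
    memset(nodes, 0, 511 * sizeof(CHuffmanNode));
    for (int n = 0; n < 256; n++)
    {
        nodes[n].byAscii = (unsigned char)n;
        nodes[n].nFrequency = anFrequency[n];
    }
    qsort(nodes, 256, sizeof(CHuffmanNode), frequencyCompare);

    // Only symbols that actually occur take part in the tree.
    CHuffmanNode *pNodes[256];
    int nNodeCount = 0;
    while (nNodeCount < 256 && nodes[nNodeCount].nFrequency > 0)
    {
        pNodes[nNodeCount] = &nodes[nNodeCount];
        nNodeCount++;
    }

    int nParentNode = nNodeCount, nBackNode = nNodeCount - 1;
    while (nBackNode > 0)   // at least two nodes left in the queue
    {
        CHuffmanNode *pNode = &nodes[nParentNode++];
        pNode->pLeft = pNodes[nBackNode--];    // lowest frequency
        pNode->pRight = pNodes[nBackNode--];   // second lowest
        pNode->pLeft->pParent = pNode->pRight->pParent = pNode;
        pNode->nFrequency = pNode->pLeft->nFrequency + pNode->pRight->nFrequency;

        // Re-insert the parent so the queue stays sorted in descending order.
        int i = nBackNode;
        while (i >= 0 && pNodes[i]->nFrequency < pNode->nFrequency)
        {
            pNodes[i + 1] = pNodes[i];
            i--;
        }
        pNodes[i + 1] = pNode;
        nBackNode++;
    }
    return nNodeCount > 0 ? pNodes[0] : NULL;
}
```

Because at most 255 parents are ever created for 256 leaves, nParentNode never exceeds index 510, so no allocation is needed while the tree is built.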

int nDesIndex = 0, nCodeLength, dwCode;
// loop to write codes
for (nCount = 0; nCount < nSrcLen; nCount++)
{
    dwCode = nodes[pSrc[nCount]].dwCode;
    nCodeLength = nodes[pSrc[nCount]].nCodeLength;
    while (nCodeLength--)
    {
        if (dwCode & 1)
            SetBit(pDesPtr, nDesIndex);
        dwCode >>= 1, nDesIndex++;
    }
}
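The SetBit helper is not shown in the article. Below is a minimal sketch of how it and the inner loop could work, assuming that bit 0 is the least significant bit of the first byte and that the output buffer was zeroed beforehand. WriteCode is a hypothetical wrapper around the inner loop.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned char BYTE;

// Sets bit number nBitIndex in the buffer.
// Assumption: bit 0 is the least significant bit of pBuffer[0].
static void SetBit(BYTE *pBuffer, int nBitIndex)
{
    pBuffer[nBitIndex >> 3] |= (BYTE)(1 << (nBitIndex & 7));
}

// Appends the low nCodeLength bits of dwCode to the bitstream, lowest bit
// first, starting at bit position nDesIndex. Returns the new bit position.
// The buffer must be zero-filled, because only 1 bits are written.
static int WriteCode(BYTE *pDesPtr, int nDesIndex, unsigned int dwCode, int nCodeLength)
{
    assert(pDesPtr != NULL && nCodeLength >= 0 && nCodeLength <= 32);
    while (nCodeLength--)
    {
        if (dwCode & 1)
            SetBit(pDesPtr, nDesIndex);
        dwCode >>= 1, nDesIndex++;
    }
    return nDesIndex;
}
```

Writing the 3-bit code 101 followed by the 2-bit code 11 into a zeroed buffer advances the bit index to 5 and leaves the first byte equal to 0x1D.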

Note: the compressed buffer must also store the Huffman tree nodes and their bit sequences, so that the Huffman tree can be rebuilt when decompressing (saving only the ASCII value and the corresponding bit sequence is enough).

Decompression only needs to rebuild the Huffman tree and then replace each code in the input buffer with the corresponding ASCII value. Keep in mind that the input buffer here is a bitstream containing the code of each ASCII value. To replace a code with its ASCII value, we walk the Huffman tree bit by bit until we reach a leaf node, then append that node's ASCII value to the output buffer:

int nDesIndex = 0;
while (nDesIndex < nDesLen)
{
    pNode = pRoot;
    while (pNode->pLeft)
    {
        pNode = GetBit(pSrc, nSrcIndex) ? pNode->pRight : pNode->pLeft;
        nSrcIndex++;
    }
    pDes[nDesIndex++] = pNode->byAscii;
}
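The decoding loop can be tried in isolation with the sketch below. GetBit is assumed to use the same bit order as the writer (bit 0 is the least significant bit of the first byte). The reduced node struct and the DecodeSymbols wrapper are assumptions, not the article's API.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned char BYTE;

struct CHuffmanNode   // reduced stand-in: leaves have pLeft == NULL
{
    BYTE byAscii;
    CHuffmanNode *pLeft, *pRight;
};

// Reads bit number nBitIndex from the buffer (bit 0 = LSB of pBuffer[0]).
static int GetBit(const BYTE *pBuffer, int nBitIndex)
{
    return (pBuffer[nBitIndex >> 3] >> (nBitIndex & 7)) & 1;
}

// Decodes nDesLen symbols from the bitstream pSrc into pDes by walking the
// tree from the root for every symbol. Returns the number of bits consumed.
static int DecodeSymbols(const CHuffmanNode *pRoot, const BYTE *pSrc,
                         BYTE *pDes, int nDesLen)
{
    assert(pRoot != NULL && pSrc != NULL && pDes != NULL);
    int nSrcIndex = 0, nDesIndex = 0;
    while (nDesIndex < nDesLen)
    {
        const CHuffmanNode *pNode = pRoot;
        while (pNode->pLeft)   // inner node: a 1 bit goes right, a 0 bit goes left
        {
            pNode = GetBit(pSrc, nSrcIndex) ? pNode->pRight : pNode->pLeft;
            nSrcIndex++;
        }
        pDes[nDesIndex++] = pNode->byAscii;
    }
    return nSrcIndex;
}
```

With a three-leaf tree where 'a' -> 0, 'b' -> 10 and 'c' -> 11, the single input byte 0x1A decodes to "abca" and consumes 6 bits.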

Source files:

Huffman.cpp

Huffman.h

Sample download:

http://www.codeproject.com/cpp/huffman_coding/huffman_src.zip

When reposting, please cite the original address: https://www.9cbs.com/read-78717.html
