Image encoding compression


Limit Challenge: Arithmetic Coding

(Keyword: arithmetic coding. Source: DavidLove, http://www.contextfree.net/wangyg/tech/benben/chapter4.htm)

Huffman coding uses an integer number of bits to encode each symbol, and in many cases this prevents it from achieving optimal compression. Suppose a character appears with probability 80%. The ideal code length for that character is -log2(0.8) = 0.322 bits, but Huffman coding can only assign it a whole bit, a single 0 or a single 1. If that character makes up 80% of the message, the compressed result is nearly three times the ideal length; so much for the compression ratio.

But can we really output only 0.322 of a 0, or 0.322 of a 1? Do we take scissors and snip off part of a bit inside the computer's memory? Does a computer even have such a capability? Slow down, and don't be misled by the surface of the problem. For this question we just need to change our point of view and look from another angle... still stuck? How can there be half a bit? Don't worry: mathematicians worked out the magic of arithmetic coding years ago. Let us see from which angle they found the breakthrough.

Output

Something even more surprising happens with arithmetic coding: for an entire message, no matter how long, the output is a single number, a binary fraction between 0 and 1. For example, if an arithmetic coder outputs 1010001111 for some message, that string stands for the binary fraction 0.1010001111, which is approximately the decimal number 0.64.

What? First we are promised fractions of a bit, and now the coder outputs a fraction. How did arithmetic coding get so strange? Patience: the simple example below explains the basic principle. To make it easy to follow, we will temporarily describe the algorithm in decimal rather than binary; this does not affect the feasibility of the algorithm.

Suppose the only characters that can appear in a message are A, B and C, and the message we want to compress is BCCB. Before compression starts, assume we know nothing about the characters' probabilities (we are using an adaptive model). With no better information, we provisionally take the three probabilities to be equal, 1/3 each, and assign each character a share of the interval from 0 to 1 in proportion to its probability: A gets 0.0000 - 0.3333, B gets 0.3333 - 0.6667, and C gets 0.6667 - 1.0000. Graphically:

    +-- 1.0000
    |
    |   PC = 1/3
    |
    +-- 0.6667
    |
    |   PB = 1/3
    |
    +-- 0.3333
    |
    |   PA = 1/3
    |
    +-- 0.0000

Now the first character B arrives, so we turn our attention to B's interval, 0.3333 - 0.6667. Because one more B has now been seen, the probability distribution of the three characters becomes PA = 1/4, PB = 2/4, PC = 1/4. We subdivide the interval 0.3333 - 0.6667 according to this new distribution:

    +-- 0.6667
    |   PC = 1/4
    +-- 0.5834
    |
    |   PB = 2/4
    |
    +-- 0.4167
    |   PA = 1/4
    +-- 0.3333

Next we receive the character C, so we now focus on C's interval from the previous step, 0.5834 - 0.6667. After this new C, the distribution of the three characters becomes PA = 1/5, PB = 2/5, PC = 2/5. We use this distribution to subdivide 0.5834 - 0.6667:

    +-- 0.6667
    |
    |   PC = 2/5
    |
    +-- 0.6334
    |
    |   PB = 2/5
    |
    +-- 0.6001
    |   PA = 1/5
    +-- 0.5834

The next character is another C, and the distribution becomes PA = 1/6, PB = 2/6, PC = 3/6. We subdivide C's interval 0.6334 - 0.6667:

    +-- 0.6667
    |
    |   PC = 3/6
    |
    +-- 0.6501
    |
    |   PB = 2/6
    |
    +-- 0.6390
    |   PA = 1/6
    +-- 0.6334

Finally we read the last character, B. Since it is the last character, no further subdivision is needed. The interval obtained for B in this step is 0.6390 - 0.6501. Now we pick from this interval a number that is easy to write in binary, for example 0.64, convert it to the binary fraction 0.1010001111, strip the leading 0 and the point, and output 1010001111. That is the compressed form of the message, and we have just completed the simplest possible arithmetic compression.

Not hard, is it? Then how do we decompress? That is even simpler. We start from the same assumption as before, that the three characters are equally likely, and reproduce the first distribution diagram. Facing the bit stream 1010001111, we first restore the leading 0 and the point to obtain the binary fraction 0.1010001111, roughly decimal 0.64. We notice that 0.64 falls within character B's interval in this distribution, so we immediately output B, derive the new three-way distribution, and subdivide B's interval according to it, exactly as the compressor did. In the new subdivision we find that 0.64 falls within C's range, so we output C. Continuing the same way, we output all the characters and complete the decompression. (To keep the description simple, we have temporarily sidestepped the question of how to detect the end of the stream; that problem is not difficult to solve.)

Now put the tutorial down and go over the example again until you truly understand the basic principle of arithmetic compression; along the way you will surely have accumulated plenty of new questions.

Can we get close to the limit?

By now you must understand part of the idea, and you must also have many new questions. No matter; let us resolve them one by one. First, we have repeatedly stressed that arithmetic coding can in effect spend a fractional number of bits per symbol and can therefore approach the entropy limit of lossless compression. How does that follow from the description above? Arithmetic coding achieves fractional bits collectively: we truly cannot represent a single symbol in a fraction of a bit, but we can encode many symbols together and allow error only in the final position. Look back at the example: each input character adjusts the probability table and confines the output to an ever narrower interval. This narrowing of the output interval is the key to the whole question. When we input the first character B, the output interval is confined to 0.3333 - 0.6667, so we cannot yet decide whether the first output digit is 3, 4, 5 or 6; in other words, B's code length is less than one decimal digit (note that we are explaining in decimal, which is not exactly the same as binary). So we commit to no output digit for now and keep reading. Not until the third character C, when the interval has shrunk to 0.6334 - 0.6667, do we finally know that the first output digit (in decimal) is 6, while the second is still undetermined: the first three characters together cost between one and two decimal digits. After all the characters are in, the interval is 0.6390 - 0.6501, and we never did pin down the second digit exactly. Now we see that to encode all four characters we need only pick a two-digit decimal inside the final interval, and the natural choice is to output 0.64.
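To make the walkthrough concrete, here is a minimal sketch of the same order-0 adaptive coder in Python (our own illustration, not code from the original article), using exact fractions in place of the rounded decimals above:

    # The decimal walkthrough above, as code: an order-0 adaptive model over
    # the alphabet {A, B, C}, narrowing an interval for the message "BCCB".
    from fractions import Fraction

    def encode(message, alphabet="ABC"):
        counts = {s: 1 for s in alphabet}        # equal counts, so P = 1/3 each at first
        low, high = Fraction(0), Fraction(1)
        for ch in message:
            total, width = sum(counts.values()), high - low
            cum = Fraction(0)
            for s in alphabet:                   # subdivide [low, high) by the current model
                p = Fraction(counts[s], total)
                if s == ch:
                    low, high = low + cum * width, low + (cum + p) * width
                    break
                cum += p
            counts[ch] += 1                      # adaptive update after each symbol
        return low, high

    def decode(value, n_symbols, alphabet="ABC"):
        counts = {s: 1 for s in alphabet}        # mirror the encoder's model exactly
        low, high = Fraction(0), Fraction(1)
        out = []
        for _ in range(n_symbols):
            total, width = sum(counts.values()), high - low
            cum = Fraction(0)
            for s in alphabet:
                p = Fraction(counts[s], total)
                if low + cum * width <= value < low + (cum + p) * width:
                    out.append(s)                # value falls in s's subinterval
                    low, high = low + cum * width, low + (cum + p) * width
                    counts[s] += 1
                    break
                cum += p
        return "".join(out)

    low, high = encode("BCCB")
    print(float(low), float(high))               # ~0.6389 .. 0.65, as in the text
    print(decode(Fraction(64, 100), 4))          # 0.64 decodes back to "BCCB"

Running it reproduces the final interval 0.6390 - 0.6501 (up to the text's rounding) and confirms that 0.64 decodes back to BCCB.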

In this way, with an error of no more than one decimal digit, we have output the message quite accurately, coming close to its entropy. (It should be noted, to connect with the lessons below, that the example above uses an order-0 adaptive model, so its result still has a certain gap from the true entropy value.)

How long does the fraction get?

You must already have realized that if the message is long, the fraction to be output becomes very long as well; how can such a long fraction be held in memory? In fact there is no need to store the entire fraction. As the example shows, during encoding we continually gain information about the output fraction. Concretely, once we have confined the interval to 0.6390 - 0.6501, we already know the first output digit (in decimal) must be 6, so we can emit the 6 immediately, remove it from memory, and continue the compression within the interval 0.390 - 0.501. No very long fraction ever exists in memory. The same holds in binary: as compression proceeds we decide, one bit at a time, whether the next output bit is 0 or 1, output that bit, and shorten the interval representation held in memory.
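That digit-by-digit emission can be sketched as follows (a simplified decimal toy of our own; real coders do the same with bits and must also handle carry and underflow cases that this sketch ignores):

    # Emit leading decimal digits as soon as both bounds agree on them, then
    # rescale -- e.g. [0.6390, 0.6501) emits "6" and becomes [0.390, 0.501).
    from fractions import Fraction

    def emit_digits(low, high):
        digits = ""
        while True:
            d_low, d_high = int(low * 10), int(high * 10)  # first digit of each bound
            if d_low != d_high:
                return digits, low, high       # digit still undecided: keep coding
            digits += str(d_low)
            low = low * 10 - d_low             # strip the settled digit and rescale
            high = high * 10 - d_low

    digits, low, high = emit_digits(Fraction(6390, 10000), Fraction(6501, 10000))
    print(digits, float(low), float(high))     # -> 6 0.39 0.501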

How is a static model implemented?

The simple example above uses an adaptive model; how would a static model be implemented? It is actually very simple. For the message BCCB we count only two distinct characters, so the distribution is PB = 0.5, PC = 0.5. We never update this distribution during compression: every subdivision follows this fixed distribution, which with two equal probabilities means halving the interval each time. The whole compression process then reduces to the following table of interval bounds:

    Step                   Lower bound    Upper bound
    Before compression     0.0            1.0
    Input B                0.0            0.5
    Input C                0.25           0.5
    Input C                0.375          0.5
    Input B                0.375          0.4375

We end up with the interval 0.375 - 0.4375 without having determined even a single decimal digit; in other words, the entire message costs less than one decimal digit. If we redo the process in binary, we find that we come very close to the entropy of this message (some readers may already have computed it: the entropy of this message is 4 binary bits, and indeed the binary fraction 0.0110 = 0.375 lies inside the final interval).

Why use an adaptive model?

Since a static model can already approach the entropy this closely, why use an adaptive model at all? First, a static model cannot adapt to the diversity of information; for example, the distribution we just obtained cannot be reused to compress arbitrary other messages, and to decompress correctly we must spend a certain amount of space storing the static model's statistics, which pulls the result away from the entropy. Second, a static model must scan the input to gather statistics before compression starts, which consumes considerable time and makes the already slow arithmetic coding compression slower still. Most important of all, for a long message, the symbol probabilities a static model gathers describe each symbol's frequency across the whole message, whereas an adaptive model can track a symbol's frequency within some local region, or its frequency relative to a particular context. In other words, the distributions produced by an adaptive model are more favorable to compression (one could say that the message's total entropy, measured against such higher-probability, context-conditioned distributions, is smaller), so compression results based on context-adaptive models far surpass those of static models.

Orders of adaptive models

We usually use the term "order" to distinguish different adaptive models. The example in this chapter uses an order-0 adaptive model: it tracks the probability of each symbol's appearance in the input without considering any context information. If we change the model to track the probability that a symbol appears after one particular preceding symbol, it becomes an order-1 context adaptive model. Suppose, for example, that we are encoding an English text, have already encoded 10,000 English characters, the character just encoded was T, and the next character to encode is H. Suppose our counts from the encoding so far show 113 occurrences of the letter T among those first 10,000 characters, 47 of them followed by the letter H. We conclude that the frequency of H after T is 47/113, and encoding H at this frequency costs -log2(47/113) = 1.266 bits. Compare the order-0 adaptive model: if H appeared 82 times among the first 10,000 characters, its probability is 82/10000, and encoding H at that probability costs -log2(82/10000) = 6.930 bits. Such is the advantage brought by considering the context. We can extend this advantage further: suppose the two characters preceding the H to be encoded are GT, and in the text encoded so far H follows GT 80% of the time; then we need only 0.322 bits to encode the output character H. The model used at that point is called an order-2 context adaptive model. The ideal case is to use an order-3 adaptive model; combined with arithmetic coding, the compression of the information then reaches an astonishing level.
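As a quick sanity check of that arithmetic (the counts are the hypothetical figures from the example above, not measured data), a short sketch:

    from math import log2

    t_total, th_pairs = 113, 47        # 'T' seen 113 times; 47 of them followed by 'H'
    h_total, chars_seen = 82, 10_000   # 'H' seen 82 times in 10,000 characters

    print(f"order-1 cost: {-log2(th_pairs / t_total):.3f} bits")    # ~1.266
    print(f"order-0 cost: {-log2(h_total / chars_seen):.3f} bits")  # ~6.930
    print(f"order-2 cost: {-log2(0.80):.3f} bits")                  # ~0.322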
Using models of even higher order requires amounts of system space and time that are, at least at present, unacceptable, so the adaptive models in practical use are mostly of order 2 or order 3.

The role of the escape code

An arithmetic coding algorithm driven by an adaptive model must also consider how to encode a character in a context that has never appeared before.

For example, in an order-1 context model, the contexts for which probabilities must be gathered may number 256 * 256 = 65536, since any of the 256 byte values may appear after any of the 256 byte values. When we face a context that has never appeared (say the character to encode is D, the character before it is B, and until now D has never appeared after B), how do we determine the character's probability?

A relatively simple approach is, before compression starts, to assign every possible context an occurrence count of 1; then if we meet a B followed by a D for the first time during compression, we regard the count of D after B as 1 and can encode it at the corresponding probability. The problem with this method is that the characters in every context start with an artificially small frequency. In an order-1 context model, for example, every character's frequency before compression is artificially set to 1/65536, and at that frequency each character needs 16 bits to encode at the start of compression; only as compression proceeds and the genuinely frequent characters claim a larger share of the frequency table does the compression gradually improve. For an order-2 or order-3 context model the situation is even worse: we squander vast amounts of probability space on the great majority of contexts that never appear at all.

We solve this problem by introducing an "escape code". The escape code is a special mark mixed into the compressed data stream; it notifies the decompressor that the next character has never appeared in the current context and must be decoded using a lower-order context. For example, in an order-3 context model, suppose we have just encoded GHT and the next character to encode is A, which until now has never appeared after GHT. The compressor outputs an escape code, then consults the order-2 context table to see how many times A has appeared after HT; if A has appeared after HT, it is encoded using the probability from the order-2 table. Otherwise the compressor outputs another escape code and checks the order-1 table; if A is still not found there, it outputs yet another escape code and falls back to the lowest, order-0 table to see whether character A has ever appeared. If A has never appeared at all, we turn to a special "order -1" context table. That table contains all 256 symbols, each with a count of 1 that is never updated, so any symbol not found in the higher-order contexts can always fall back here and be encoded at a frequency of 1/256. The introduction of the escape code frees us from the problem of never-seen contexts, lets the model adjust rapidly to the optimal position as the input data changes, and quickly reduces the number of bits needed to encode the high-probability symbols.
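That fallback chain can be sketched as follows. This is only the model-lookup logic, not a full coder; the table layout and function name are assumptions made for illustration:

    # Sketch of the escape fallback described above. tables[k] maps a
    # length-k context string to a {char: count} table; tables[0] maps the
    # empty context "" to the plain order-0 counts. ESCAPE stands for the
    # special escape symbol mixed into the output stream.
    ESCAPE = "<esc>"

    def model_symbols(tables, context, ch, max_order=3):
        """Return the (order, symbol) pairs the coder would actually encode."""
        out = []
        for order in range(max_order, -1, -1):
            ctx = context[-order:] if order else ""
            if ch in tables.get(order, {}).get(ctx, {}):
                out.append((order, ch))       # found: code ch with this table's probability
                return out
            out.append((order, ESCAPE))      # unseen in this context: escape one level down
        out.append((-1, ch))                 # order -1: fixed table, all 256 symbols have
        return out                           # count 1, so ch costs -log2(1/256) = 8 bits

    # 'A' has never followed "GHT", "HT" or "T", but has been seen at order 0:
    tables = {3: {}, 2: {}, 1: {}, 0: {"": {"A": 5, "B": 2}}}
    print(model_symbols(tables, "GHT", "A"))
    # [(3, '<esc>'), (2, '<esc>'), (1, '<esc>'), (0, 'A')]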
The storage space problem

In implementing a high-order context model for arithmetic coding, storage is a very difficult problem, because we must maintain counts for every context that has appeared, and in a high-order context model the number of possible contexts is so large that the design of the data structure directly determines whether the implementation succeeds. In an order-1 context model it is feasible to track the occurrence counts with plain arrays, but for an order-2 or order-3 context model the array size grows exponentially, and the memory of existing computers cannot meet the requirement. A smarter approach is to store the contexts that actually appear in a tree structure. Since a high-order context is always built on a lower-order context, we can store the order-0 context table in an array, with each array element holding a pointer to the corresponding order-1 context table; the order-1 context tables in turn hold pointers to order-2 context tables, and so on, forming the whole context tree.
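One possible shape for such a tree, sketched in Python; the node layout, and the choice to key deeper levels by progressively earlier bytes, are illustrative assumptions rather than the only design:

    # Sketch of a context tree. The root's counts are the order-0 table;
    # each child, keyed by the byte immediately preceding the current one,
    # is an order-1 table, its children order-2 tables, and so on. Nodes are
    # allocated lazily, so contexts that never occur cost no memory.
    class ContextNode:
        __slots__ = ("counts", "children")
        def __init__(self):
            self.counts = {}    # symbol -> occurrences under this context
            self.children = {}  # earlier context byte -> deeper ContextNode

    root = ContextNode()

    def update(data: bytes, max_order: int = 3) -> None:
        for i, sym in enumerate(data):
            node = root
            node.counts[sym] = node.counts.get(sym, 0) + 1
            for back in range(1, max_order + 1):  # contexts of order 1..max_order
                if i - back < 0:
                    break
                node = node.children.setdefault(data[i - back], ContextNode())
                node.counts[sym] = node.counts.get(sym, 0) + 1

    update(b"that the hat")
    t_ctx = root.children[ord("t")]          # the order-1 context "after 't'"
    print(t_ctx.counts.get(ord("h"), 0))     # -> 2: 'h' followed 't' twice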

