CRC32 algorithm study notes and how to implement it in Java (1)

zhaozj, 2021-02-16

1: Description

There are not many detailed introductions to the CRC32 check algorithm on the forum. A few days ago I read Ross N. Williams' article and finally worked through the CRC32 algorithm from start to finish. I had originally meant to translate the whole article, but time is short, so I am writing up some of my own study notes instead, so that everyone can grasp the main ideas of CRC32 more quickly. My knowledge is limited, so corrections are welcome. The original article is at: http://www.repairfaq.org/filipg/link/f_crc_v31.html.

2: Basic concepts and related introductions

2.1 What is CRC

In remote data communication, data must be checked to make sure it has been transmitted reliably. The cyclic redundancy check, CRC (Cyclic Redundancy Check/Code), is an effective error-control method for a transmitted data block. CRC checking uses polynomial coding. Polynomial multiplication and division proceed as in ordinary algebra, but polynomial addition and subtraction are done modulo 2, with no carry and no borrow, so both are the same as logical XOR.

2.2 CRC calculation rules

CRC addition rules:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 0 (Note: no carry)

CRC subtraction rules:
0 - 0 = 0
0 - 1 = 1
1 - 0 = 1
1 - 1 = 0

CRC multiplication rules:
0 * 0 = 0
0 * 1 = 0
1 * 0 = 0
1 * 1 = 1

CRC division rules:

            1100001010 (Note: we do not care about the quotient.)
        ______________
10011 ) 11010110110000
        10011,,.,,....
        -----,,.,,....
         10011,.,,....
         10011,.,,....
         -----,.,,....
          00001.,,....
          00000.,,....
          -----.,,....
           00010,,....
           00000,,....
           -----,,....
            00101,....
            00000,....
            -----,....
             01011....
             00000....
             -----....
              10110...
              10011...
              -----...
               01010..
               00000..
               -----..
                10100.
                10011.
                -----.
                 01110
                 00000
                 -----
                  1110 = remainder

2.3 How to generate a CRC check code

(1) Let G(x) be of degree W. Append W zeros to the end of the data block, so that a block of m bits becomes m + W bits long; the corresponding polynomial is x^W M(x).
(2) Using modulo-2 arithmetic, divide the bit string of x^W M(x) by the bit string of G(x) to get the remainder bit string.
(3) Using modulo-2 arithmetic, subtract the remainder bit string from the bit string of x^W M(x). The result is the bit string of the data block with its CRC check code attached.

2.4 How to choose G(x)

Choosing G(x) is not easy. In general we use generator polynomials that have been tested over time on large amounts of data and found to be correct and efficient. The common ones are:

16 bits: (16, 12, 5, 0) [X25 standard]
         (16, 15, 2, 0) ["CRC-16"]
32 bits: (32, 26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, 0) [Ethernet]

3: How to implement the CRC algorithm in software

Our main problem now is how to implement CRC checking, encoding, and decoding. A hardware implementation is not possible for us at present, so we mainly consider software methods.

The following is a translation of the author's original text:

We assume there is a 4-bit register. By repeatedly shifting and performing the CRC division, the value finally left in the register is the remainder we want.

             3   2   1   0   Bits
           +---+---+---+---+
  Pop <--  |   |   |   |   | <----- Augmented message (the original data with 0s appended)
           +---+---+---+---+

        1    0   1   1   1   = The poly

(Note: the augmented message is the message followed by W zero bits.)

Based on this model, we get the simplest algorithm:

Set the register to 0.
Append W zero bits to the original data.
While (there is still unprocessed data)
Begin
    Shift the register left by one bit, reading the next data bit into bit position 0 of the register.
    If (a 1 bit was shifted out by that left shift)
        register = register XOR poly.
End
The value in the register is now the CRC remainder.

My study notes: why does this work? Look at the start of the earlier example:

            1100001010
        ______________
10011 ) 11010110110000
        10011,,.,,....
        -----,,.,,....
      -> 10011,.,,....
         10011,.,,....
         -----,.,,....
       -> 00001.,,....
          00000.,,....
          -----.,,....
           00010,,....
           00000,,....
           -----,,....
            00101,....
            00000,....

We know that the highest bit of G(x) must be 1, and that whether the quotient digit is 1 or 0 is decided by the highest bit of the dividend. We do not care about the quotient; we care about the remainder. G(x) in the example above has 5 bits. At each step, the remainder is really obtained by XORing the four bits after the dividend's highest bit with a four-bit value. What role does the highest bit of the dividend play? The two marked remainders show it. When the highest bit of the dividend is 1, the quotient digit is 1, and the remainder is the four bits after the highest bit XORed with the low four bits of G(x). When the highest bit is 0, the quotient digit is 0, and the four bits after the highest bit are XORed with 0000, so the remainder is just those four bits unchanged. In other words, if the highest bit is 0, no XOR is needed. Now we can see why the model is built this way, and the principle of the algorithm is clear.
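The register model above can be sketched in Java as follows. This is a minimal illustration, not the original author's code: the class name `SimpleCrcRegister`, the method `remainder`, and the choice to pass the message as a string of '0' and '1' characters are all assumptions made for readability. It is run on the long-division example from section 2.2 (message 1101011011, generator 10011), where the expected remainder is 1110.

```java
// Sketch of the bit-at-a-time register model; class and method names are
// hypothetical, chosen only for this illustration.
class SimpleCrcRegister {

    /**
     * Returns the remainder of the augmented message divided by the generator.
     *
     * @param messageBits the message as a string of '0' and '1' characters
     * @param poly        the generator including its top bit, e.g. 0b10011
     * @param width       W, the degree of the generator (register width in bits)
     */
    static int remainder(String messageBits, int poly, int width) {
        int mask = (1 << width) - 1;      // keeps the register W bits wide
        int topBit = 1 << (width - 1);    // the bit that pops out on a left shift
        int register = 0;
        int totalBits = messageBits.length() + width; // message plus W appended zeros

        for (int i = 0; i < totalBits; i++) {
            int nextBit = i < messageBits.length() ? messageBits.charAt(i) - '0' : 0;
            boolean poppedOne = (register & topBit) != 0;
            register = ((register << 1) | nextBit) & mask;
            if (poppedOne) {
                // The generator's top bit cancels the popped 1, so only its
                // low W bits are XORed into the register.
                register ^= poly & mask;
            }
        }
        return register;
    }

    public static void main(String[] args) {
        int r = remainder("1101011011", 0b10011, 4);
        System.out.println(Integer.toBinaryString(r)); // prints 1110
    }
}
```

Feeding the message followed by its remainder (11010110111110) through the same function gives 0, which is the check a receiver would perform.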
The following is a translation of the author's original text:

This is very inefficient. To speed it up, we make the algorithm process a larger unit of data at a time, and we move to the 32-bit CRC32. We still assume a register with 4 "bits", but each "bit" is now an 8-bit byte.

             3    2    1    0   Bytes
          +----+----+----+----+
 Pop <--  |    |    |    |    | <----- Augmented message
          +----+----+----+----+

         1<-----------32 bits----------->
         (with an implied highest bit "1")

By the same principle we get the following algorithm:

While (there is still unprocessed data)
Begin
    Examine the top byte of the register and work out the combined value (XOR) of the polys at their various offsets.
    Shift the register left by one byte, reading a new byte into the rightmost byte.
    XOR the register with the combined polys.
End

My study notes: but why does this work? Again, a simple example helps. Suppose we have:

Value in the register: 01001101
4 bits to be shifted out: 1011
Generator polynomial: 101011100

    top  register
    ---- --------
    1011 01001101
    1010 11100      (CRC XOR)
    -------------
    0001 10101101

The top 4 bits are not all 0, so the division is not finished:

    0001 10101101
       1 01011100   (CRC XOR)
    -------------
    0000 11110001

The top 4 bits are now all 0, so no further division is needed.

What happens if we follow the algorithm instead? First combine the polys at their offsets:

    1010 11100
       1 01011100
    -------------
    1011 10111100

Then XOR the combined value with the register:

    1011 10111100
    1011 01001101
    -------------
    0000 11110001

The result of working this way agrees with the result above. This also explains why the algorithm first combines the polys at their various offsets and only then XORs the result with the register. We can also see that each top value corresponds to exactly one combined value; in this example 1011 corresponds to 10111100. For the 32-bit CRC, following our model, the top byte has 8 bits, so there are 2^8 = 256 values to cover. We can therefore build a table in advance, take the top byte shifted out of the register, and look up the corresponding value in the table. This greatly improves the speed of encoding.
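The 4-bit example can be checked with a short Java sketch. The class name `ToyTableEntry` and the method `entryFor` are hypothetical; the generator 101011100, the top value 1011, and the register value 01001101 are the ones used in the example above. The sketch divides the top bits one at a time and returns the combined value that would be stored in a table for that top value.

```java
// Sketch of how one table entry arises in the 4-bit example; names are
// hypothetical and the sizes (8-bit register, 4-bit chunk) follow the example.
class ToyTableEntry {
    static final int POLY = 0b101011100; // 9-bit generator from the example
    static final int WIDTH = 8;          // register width in bits
    static final int CHUNK = 4;          // number of top bits examined at once

    /** Combined XOR value for a given 4-bit top value. */
    static int entryFor(int top) {
        int value = top << WIDTH; // the top bits followed by 8 zero bits
        for (int bit = WIDTH + CHUNK - 1; bit >= WIDTH; bit--) {
            if ((value & (1 << bit)) != 0) {
                value ^= POLY << (bit - WIDTH); // poly aligned under this 1 bit
            }
        }
        return value & 0xFF; // the top bits are now 0; keep the low 8 bits
    }

    public static void main(String[] args) {
        int entry = entryFor(0b1011);
        System.out.println(Integer.toBinaryString(entry));              // prints 10111100
        System.out.println(Integer.toBinaryString(0b01001101 ^ entry)); // prints 11110001
    }
}
```

Calling `entryFor` for every value from 0 to 15 would give the complete 16-entry table for this toy generator; the 256-entry CRC32 table is built the same way with 8 top bits and a 32-bit register.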

             3    2    1    0   Bytes
          +----+----+----+----+
   +-----<|    |    |    |    | <----- Augmented message
   |      +----+----+----+----+
   |                ^
   |                |
   |               XOR
   |                |
   |     0+----+----+----+----+
   v      +----+----+----+----+
   |      +----+----+----+----+
   |      +----+----+----+----+
   |      +----+----+----+----+
   |      +----+----+----+----+
   |      +----+----+----+----+
   +----->+----+----+----+----+
          +----+----+----+----+
          +----+----+----+----+
          +----+----+----+----+
          +----+----+----+----+
       255+----+----+----+----+

The following is a translation of the author's original text:

The above algorithm can be further optimized to:

1: Shift the register left by one byte, reading in a new byte from the original data.
2: Use the byte just shifted out of the register as an index into the table of 256 32-bit values.
3: XOR the table value into the register.
4: If there is still unprocessed data, go back to step 1.

In C this can be written as:

    r = 0;
    while (len--)
        r = ((r << 8) | *p++) ^ t[(r >> 24) & 0xFF];

But this algorithm works on original data that has already been augmented with 0s, so a loop must be added at the end to feed the W zero bits after the original data.

My study notes: be careful not to append the W zeros during preprocessing. Instead, add the following loop after the loop shown above:

    for (i = 0; i < W/8; i++)
        r = (r << 8) ^ t[(r >> 24) & 0xFF];

The loop runs W/8 times because W zero bits, taken one byte (8 bits) at a time, are W/8 zero bytes: 4 bytes for W = 32. (The original article writes W/4 here, which does not match the 4 zero bytes that a 32-bit CRC needs.)

The following is a translation of the author's original text:

1: The W/8 zero bytes at the tail only serve to make sure that all of the original data has been pushed through the register and processed by the algorithm.
2: If the initial value of the register is 0, the first 4 iterations only load the first 4 bytes of the original data into the register. (This can be seen from how the table is generated.) Even if the initial value of the register is not 0, the first 4 iterations only XOR the first 4 bytes of the original data with some constant derived from the register's initial value, and then load the result into the register.
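The table-driven loop can be sketched in Java as below. This is an illustration under stated assumptions, not the author's implementation: the class and method names are hypothetical, the register starts at 0, no bit reflection or final XOR is applied (so the result is not the value produced by `java.util.zip.CRC32`), and the generator is the Ethernet polynomial from section 2.4 written as 0x04C11DB7 with its top bit implied. Java has no unsigned 32-bit type, so the sketch uses `>>>` where the C code uses `>>`. The tail loop feeds 4 zero bytes, which is W/8 for W = 32.

```java
import java.nio.charset.StandardCharsets;

// Sketch of the table-driven algorithm on an augmented message; names are
// hypothetical. Register starts at 0, with no reflection and no final XOR.
class AugmentedTableCrc {
    static final int POLY = 0x04C11DB7; // Ethernet generator, top bit implied

    /** Builds the 256-entry table: one combined XOR value per top byte. */
    static int[] buildTable() {
        int[] table = new int[256];
        for (int top = 0; top < 256; top++) {
            int r = top << 24;
            for (int bit = 0; bit < 8; bit++) {
                r = (r & 0x80000000) != 0 ? (r << 1) ^ POLY : (r << 1);
            }
            table[top] = r;
        }
        return table;
    }

    /** Table-driven loop, followed by the loop that feeds the zero bytes. */
    static int crcTable(byte[] data, int[] t) {
        int r = 0;
        for (byte b : data) {
            r = ((r << 8) | (b & 0xFF)) ^ t[(r >>> 24) & 0xFF];
        }
        for (int i = 0; i < 4; i++) { // W/8 = 4 zero bytes for W = 32
            r = (r << 8) ^ t[(r >>> 24) & 0xFF];
        }
        return r;
    }

    /** Bit-at-a-time reference over the same augmented message. */
    static int crcBitwise(byte[] data) {
        int r = 0;
        int totalBits = (data.length + 4) * 8;
        for (int i = 0; i < totalBits; i++) {
            int index = i / 8;
            int next = index < data.length ? (data[index] >> (7 - i % 8)) & 1 : 0;
            boolean poppedOne = (r & 0x80000000) != 0;
            r = (r << 1) | next;
            if (poppedOne) {
                r ^= POLY;
            }
        }
        return r;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        int[] t = buildTable();
        System.out.println(crcTable(data, t) == crcBitwise(data)); // prints true
    }
}
```

Comparing against the bit-at-a-time version is a cheap way to catch a wrong table or a wrong number of trailing zero bytes.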

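3: (A xor B) xor C = A xor (B xor C)

Putting these together, the original algorithm can be changed to:

         +-----<Message (non augmented)
         |
         v         3    2    1    0   Bytes
         |      +----+----+----+----+
        XOR----<|    |    |    |    |
         |      +----+----+----+----+
         |                ^
         |                |
         |               XOR
         |                |
         |     0+----+----+----+----+
         v      +----+----+----+----+
         |      +----+----+----+----+
         |      +----+----+----+----+
         |      +----+----+----+----+
         |      +----+----+----+----+
         |      +----+----+----+----+
         +----->+----+----+----+----+
                +----+----+----+----+
                +----+----+----+----+
                +----+----+----+----+
                +----+----+----+----+
             255+----+----+----+----+

Algorithm:

1: Shift the register left by one byte and read a new byte from the original data.
2: XOR the byte just shifted out of the register with the new byte to get an index into the table, and obtain the corresponding table value.
3: XOR this value into the register.
4: If there is still unprocessed data, go back to step 1.

My study notes: I am still not very clear about this algorithm; perhaps it is related to the properties of XOR. Can anyone explain? Thank you.

At this point the principles and ideas of CRC32 are basically clear. Next, I want to focus on implementing it in Java according to this algorithm.

CHENSHENG913@yahoo.com.cn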
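A minimal Java sketch of the final algorithm, compared with the augmented version, may help with the open question in the study notes. The names are hypothetical and the same assumptions apply as for any plain form of this algorithm: register initialised to 0, no reflection, no final XOR, and the Ethernet generator written as 0x04C11DB7. One way to read the change: in the augmented version a data byte rides through the register for four iterations and reaches the top already XORed with whatever table values landed on it; since XOR is associative, the byte can instead be XORed in at the moment the top byte is used as the table index.

```java
import java.nio.charset.StandardCharsets;

// Sketch comparing the direct table algorithm with the augmented one; names
// are hypothetical. Register starts at 0, no reflection, no final XOR.
class DirectTableCrc {
    static final int POLY = 0x04C11DB7; // Ethernet generator, top bit implied

    static int[] buildTable() {
        int[] table = new int[256];
        for (int top = 0; top < 256; top++) {
            int r = top << 24;
            for (int bit = 0; bit < 8; bit++) {
                r = (r & 0x80000000) != 0 ? (r << 1) ^ POLY : (r << 1);
            }
            table[top] = r;
        }
        return table;
    }

    /** Augmented version: data bytes enter at the bottom, then 4 zero bytes. */
    static int crcAugmented(byte[] data, int[] t) {
        int r = 0;
        for (byte b : data) {
            r = ((r << 8) | (b & 0xFF)) ^ t[(r >>> 24) & 0xFF];
        }
        for (int i = 0; i < 4; i++) {
            r = (r << 8) ^ t[(r >>> 24) & 0xFF];
        }
        return r;
    }

    /** Direct version: each byte is XORed with the top byte to form the index. */
    static int crcDirect(byte[] data, int[] t) {
        int r = 0;
        for (byte b : data) {
            r = (r << 8) ^ t[((r >>> 24) ^ b) & 0xFF];
        }
        return r;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        int[] t = buildTable();
        System.out.println(crcDirect(data, t) == crcAugmented(data, t)); // prints true
    }
}
```

The direct version needs no trailing zero bytes, which is why it is the form usually implemented in practice.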

When reposting, please credit the original address: https://www.9cbs.com/read-16513.html
