Check Codes: A Tutorial Lecture
Old urchin (original)
When binary data passes through transmission, storage access, and similar stages, errors can occur (a 1 becomes 0, or a 0 becomes 1). The problem is how to detect such errors and how to correct them. Every method for solving this problem adds several check (redundant) bits to the original data bits.
First, code distance
The number of bit positions in which two legal codes (codewords) of a coding system differ is called the code distance between those two codewords. The minimum distance between any two codewords in the whole coding system is the code distance of the coding system.
As shown in Figure 1, three bits are used to represent eight different pieces of information. In this system the number of differing bits between two codewords ranges from 1 to 3, and the minimum is 1, so the code distance of this system is 1. If one or more bits of a codeword are inverted, the result cannot be distinguished from other valid information. For example, if 001 is transmitted and 011 is received by mistake, the receiver still accepts it as correct, because 011 is also a legal codeword in the table.
However, if four bits are used to encode the 8 codewords, the minimum distance between codewords can be increased to 2, as shown in Figure 2.
Information number    Binary codeword a2a1a0
0                     000
1                     001
2                     010
3                     011
4                     100
5                     101
6                     110
7                     111
Figure 1
Information number    Binary codeword a3a2a1a0
0                     0000
1                     1001
2                     1010
3                     0011
4                     1100
5                     0101
6                     0110
7                     1111
Figure 2
Note that any two of the eight codewords in Figure 2 differ in at least two bits. Therefore, if one bit of a codeword is inverted, it becomes an unused codeword and the receiver can detect the error. For example, if 1001 is sent and 1011 is received, the receiver knows an error has occurred, because 1011 is not a codeword (it is not in the table). However, the error cannot be corrected. Assuming only one bit is wrong, the correct codeword could be 1001, 1111, 0011 or 1010, and the receiver cannot tell which of these 4 codewords was sent. It can also be seen that in this system an even number (2 or 4) of errors cannot be detected. For a system to detect and correct one error, the minimum distance between codewords must be at least 3. With a minimum distance of 3, one error can be corrected, or two errors can be detected, but the system cannot correct one error and detect two errors at the same time. Increasing the error detection and correction capability further requires a larger minimum distance. The table in Figure 3 summarizes the error detection and correction capability for minimum distances from 1 to 7.
Code distance    Errors detected            Errors corrected
1                0                          0
2                1                          0
3                2                or        1
4                2                plus      1
5                2                plus      2
6                3                plus      2
7                3                plus      3
Figure 3
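As a minimal sketch of the definition above, the following Python computes the code distance of the codeword sets in Figures 1 and 2. The function names `hamming_distance` and `code_distance` are illustrative choices, not taken from the original lecture.

```python
from itertools import combinations

def hamming_distance(a: str, b: str) -> int:
    """Number of bit positions in which two equal-length codewords differ."""
    return sum(x != y for x, y in zip(a, b))

def code_distance(codewords) -> int:
    """Minimum distance over all pairs of codewords in the coding system."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))

# Figure 1: all eight 3-bit words are legal codewords.
figure1 = [format(i, "03b") for i in range(8)]
# Figure 2: eight 4-bit codewords (a3 makes every word even).
figure2 = ["0000", "1001", "1010", "0011", "1100", "0101", "0110", "1111"]

print(code_distance(figure1))  # 1
print(code_distance(figure2))  # 2
```

With distance 1 no error is detectable; with distance 2 a single inverted bit always lands on an unused word.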
The larger the code distance, the stronger the error detection and correction capability, but the greater the data redundancy, that is, the lower the coding efficiency. The choice of code distance therefore depends on the parameters of the particular system. The designer of a digital system must consider the probability of errors and the error rate the system can tolerate. These questions are a subject of study in their own right.
Second, parity
The parity code is a simple and widely used method for increasing the minimum distance of a binary transmission system. For example, a single parity bit raises the minimum distance of a code from 1 to 2.
A binary codeword has odd parity if it contains an odd number of 1s. For example, the codeword "10110101" contains five 1s, so it has odd parity. Similarly, an even-parity codeword contains an even number of 1s. Note that whether a word is odd equals the modulo-2 sum of all its bits, which can be computed by XORing all the bits. For an n-bit word, the oddness is given by the following formula:
Odd = a0 ⊕ a1 ⊕ a2 ⊕ ... ⊕ a(n-1)
Parity checking can be described as follows: add one check bit to each codeword so that the whole word has odd or even parity. Figure 2 does exactly this: the additional bit a3 simply makes every word even. If one bit is wrong it can therefore be detected, because the parity of the word becomes odd. Parity coding adds one check bit so that the number of 1s is odd (odd parity) or even (even parity), which gives a code distance of 2. Because the check relies only on whether the number of 1s is odd or even, an even number of bit errors cannot be detected. Take the 7-bit ASCII code of the digit 0 (0110000): if the rightmost bit is wrong and the 0 becomes 1, the receiver still sees a legal code, 0110001 (the ASCII code of the digit 1). If an odd parity bit is added on the left, the code becomes 10110000. If the rightmost bit is wrong after transmission, the code becomes 10110001; the number of 1s is now even, so it is not a legal code. However, if two bits are wrong (say the first and second from the right), the code becomes 10110011; the number of 1s is 5, still odd, and the receiver accepts it as a legal code (the ASCII code of the digit 3). Parity checking therefore cannot detect an even number of bit errors.
The parity bit can be generated by a hardware circuit (XOR gates) or by software:
Even parity bit: an = a0 ⊕ a1 ⊕ a2 ⊕ ... ⊕ a(n-1); odd parity bit: an = NOT (a0 ⊕ a1 ⊕ a2 ⊕ ... ⊕ a(n-1)).
In a typical system, a parity generator adds the parity bit to each word before transmission, and the receiver checks the parity of each word it receives. If a word does not have the expected odd or even parity, it is judged to be in error, and the system discards the word or requests retransmission.
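The ASCII example above can be reproduced in software. This is a small sketch only; the helper names `parity_bit` and `parity_ok` are hypothetical, and the parity bit is assumed to be placed on the left, as in the example.

```python
def parity_bit(bits: str, odd: bool = False) -> str:
    """Parity bit for a bit string: XOR of all bits, inverted for odd parity."""
    even_bit = bits.count("1") % 2      # modulo-2 sum (XOR) of all bits
    return str(even_bit ^ 1) if odd else str(even_bit)

def parity_ok(word: str, odd: bool = False) -> bool:
    """Check a received word that already includes its parity bit."""
    return word.count("1") % 2 == (1 if odd else 0)

ascii_zero = "0110000"                               # 7-bit ASCII code of "0"
sent = parity_bit(ascii_zero, odd=True) + ascii_zero
print(sent)                                          # 10110000
print(parity_ok("10110001", odd=True))               # one bit wrong: False
print(parity_ok("10110011", odd=True))               # two bits wrong: True (undetected)
```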
In practice, block codes that combine horizontal and vertical parity are often used.
Now consider a system that transmits a number of messages, each M bits long. If these messages are grouped into blocks of N messages each, parity can be checked across the messages of a block as well as within a single message. In Figure 4, one block of N messages is arranged as a matrix, and parity bits are chosen in the form of horizontal parity (HP) and vertical parity (VP).
                 M data bits                    Horizontal parity
                 A1   A2   ...  AM-1   AM       HP1
                 B1   B2   ...  BM-1   BM       HP2
N codewords      C1   C2   ...  CM-1   CM       HP3
                 ...                            ...
                 N1   N2   ...  NM-1   NM       HPN
Vertical parity  VP1  VP2  ...  VPM-1  VPM      HPN+1
Figure 4 Block parity code with horizontal and vertical parity
Studying Figure 4 shows that the block parity code can do more than detect many kinds of errors: when an isolated error occurs in a given row or column, the error can also be corrected.
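A rough sketch of how an isolated error is located in such a block: the failing row and the failing column intersect at the wrong bit. Even parity is assumed for both directions, and the function names are illustrative only.

```python
def add_block_parity(rows):
    """Append an even HP bit to every row, then an even VP row (including the corner bit)."""
    with_hp = [r + [sum(r) % 2] for r in rows]
    vp_row = [sum(col) % 2 for col in zip(*with_hp)]
    return with_hp + [vp_row]

def locate_single_error(block):
    """Return (row, column) of an isolated error, or None if every check passes."""
    bad_rows = [i for i, row in enumerate(block) if sum(row) % 2]
    bad_cols = [j for j, col in enumerate(zip(*block)) if sum(col) % 2]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    raise ValueError("error detected, but it is not a single isolated error")

data = [[0, 1, 1, 0, 0, 1, 1],
        [1, 0, 0, 1, 0, 0, 1],
        [0, 1, 0, 1, 0, 1, 1]]
block = add_block_parity(data)
block[1][3] ^= 1                       # inject one error
row, col = locate_single_error(block)
block[row][col] ^= 1                   # correct it by inverting the bit again
```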
The junior programmer exam (the early programmer exam) often included a question on combined horizontal and vertical parity. The general method is as follows. First find a row or column whose data is completely known and determine whether that row (or column) uses odd or even parity. Assume that rows and columns use the same kind of parity (the assumption can be verified once everything has been solved). Then repeatedly find a row or column that contains only one unknown and determine that unknown from the parity, until all unknowns have been found.
[Example] 2001 junior programmer exam question
The following matrix is built from 7-bit ASCII codes plus horizontal and vertical parity bits (the last column holds the horizontal parity bits, the last row the vertical parity bits):
Character    7-bit ASCII code                 HP
3            0    X1   X2   0    0    1    1      0
Y1           1    0    0    1    0    0    X3     1
+            X4   1    0    1    0    1    1      0
Y2           0    1    X5   X6   1    1    1      1
D            1    0    0    X7   1    0    X8     0
=            0    X9   1    1    1    X10  1      1
VP           0    0    1    1    1    X11  1      X12
The bits at X1 X2 X3 X4 are __ (36) __;
the bits at X5 X6 X7 X8 are __ (37) __;
the bits at X9 X10 X11 X12 are __ (38) __; Y1 and Y2 are __ (39) __ and __ (40) __ respectively.
[Solution]
Column 5 is completely known from the ASCII codes, which shows that the vertical parity is even. Then:
From column 1, X4 = 0; row 3 then shows that the horizontal parity is also even.
From row 2, X3 = 1; from column 7, X8 = 0; from column 8, X12 = 1;
from row 7, X11 = 1; from column 6, X10 = 0; from row 6, X9 = 1; from column 2, X1 = 1;
from row 1, X2 = 1; from column 3, X5 = 1; from row 4, X6 = 0;
from column 4 (or row 5), X7 = 0. Collecting the results:
(36) X1X2X3X4 = 1110
(37) X5X6X7X8 = 1000
(38) X9X10X11X12 = 1011
(39) The ASCII code of character Y1 is 1001001 = 49H, so Y1 is "I" (deduced from the ASCII code of "D", 1000100 = 44H)
(40) The ASCII code of character Y2 is 0110111 = 37H, so Y2 is "7" (deduced from the ASCII code of "3", 0110011 = 33H)
If you remember that the ASCII code of "0" is 0110000 = 30H and the ASCII code of "A" is 1000001 = 41H, the question is easier to solve.
Third, Hamming check
We have pointed out that the minimum distance required to correct a single error in an information word is 3. One way to achieve such correction is the Hamming code.
The Hamming code is a multiple (overlapping) parity check system. It encodes information in a logical form so that errors can be detected and corrected. Every codeword transmitted in the Hamming code consists of the original information plus additional parity check bits, and each parity bit is placed at a specific position in the transmitted codeword. When implemented properly, the system can locate a single erroneous bit whether it lies among the original information bits or among the additional check bits.
The steps for deriving and using the Hamming code for an m-bit codeword are as follows:
1. Determine the minimum number of check bits k and denote them P1, P2, ..., Pk; each check bit satisfies a different parity condition.
2. The original information and the k check bits together form a new codeword of m + k bits. Choose the values (0 or 1) of the k check bits so that the required parity conditions hold.
3. Perform the k required parity checks on the received information.
4. If all parity checks pass, the information is considered free of errors.
If one or more checks fail, the erroneous bit is uniquely determined by the results of these checks.
Number of check bits
A basic consideration in deriving a Hamming code is the minimum number of check bits required. Consider information m bits long: if k check bits are attached, the total length transmitted is m + k. In the receiver, k parity checks are performed, and the result of each check is either true or false. The results of the checks can be written as a k-bit binary number, which can distinguish at most 2^k different states. One of these states must mean that all parity checks passed, which is the condition for judging the information correct. The remaining (2^k - 1) states can then be used to identify the position of the error. This gives the relation: 2^k - 1 ≥ m + k
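The relation can be solved for the smallest k by simple search. A brief sketch follows; `min_check_bits` is a hypothetical helper name.

```python
def min_check_bits(m: int) -> int:
    """Smallest k with 2**k - 1 >= m + k for m information bits."""
    k = 1
    while 2 ** k - 1 < m + k:
        k += 1
    return k

for m in (4, 8, 11, 16):
    print(m, min_check_bits(m))   # 4 -> 3, 8 -> 4, 11 -> 4, 16 -> 5
```

For m = 4 this gives k = 3, the 7-bit case worked through in the rest of this section.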
Code format
In theory the check bits can be placed anywhere, but by convention they are placed at positions 1, 2, 4, 8, ....
Figure 5 shows the arrangement of information bits and check bits when m = 4, k = 3.
Code position         B1   B2   B3   B4   B5   B6   B7
Check bits            x    x         x
Information bits                x         x    x    x
Composite codeword    P1   P2   D1   P3   D2   D3   D4
Figure 5 Positions of the check bits and information bits in the Hamming code
Determination of check bits
The k check bits are determined by parity checks over selected bits of the (m + k)-bit composite codeword.
P1 is responsible for checking bits 1, 3, 5, 7, ... (P1, D1, D2, D4, ...) of the Hamming code (including P1 itself).
P2 is responsible for checking bits 2, 3, 6, 7, ... (P2, D1, D3, D4, ...) of the Hamming code (including P2 itself).
P3 is responsible for checking bits 4, 5, 6, 7, ... (P3, D2, D3, D4, ...) of the Hamming code (including P3 itself).
For the example with m = 4, k = 3 and even parity, three parity checks are needed. These checks (denoted A, B, C) are performed on the bit positions shown in Figure 6.
Parity check    Code position
                1    2    3    4    5    6    7
A               x         x         x         x
B                    x    x              x    x
C                              x    x    x    x
Figure 6 Parity check positions
From this we obtain three check equations, which also determine the check bits:
A = B1 ⊕ B3 ⊕ B5 ⊕ B7 = 0, which gives P1 = D1 ⊕ D2 ⊕ D4
B = B2 ⊕ B3 ⊕ B6 ⊕ B7 = 0, which gives P2 = D1 ⊕ D3 ⊕ D4
C = B4 ⊕ B5 ⊕ B6 ⊕ B7 = 0, which gives P3 = D2 ⊕ D3 ⊕ D4
If the four-bit information code is 1001, these three formulas give the values of the three check bits P1, P2 and P3, and hence the Hamming code. Figure 7 shows the complete Hamming encoding when the information code is 1001. The Hamming codes of all 16 information words (D1D2D3D4 = 0000 to 1111) are listed in Figure 8.
Code position           B1   B2   B3   B4   B5   B6   B7
Code type               P1   P2   D1   P3   D2   D3   D4
Information bits        -    -    1    -    0    0    1
Check bits              0    0    -    1    -    -    -
Encoded Hamming code    0    0    1    1    0    0    1
Figure 7 Hamming code of a four-bit information code
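D1D2D3D4    P1P2D1P3D2D3D4
0000        0000000
0001        1101001
0010        0101010
0011        1000011
0100        1001100
0101        0100101
0110        1100110
0111        0001111
1000        1110000
1001        0011001
1010        1011010
1011        0110011
1100        0111100
1101        1010101
1110        0010110
1111        1111111
Figure 8 Hamming codes of the 16 information words
The above is the processing done by the sender.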
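The sender's side can be sketched in a few lines of Python. The function name `hamming74_encode` is an illustrative choice; the bit order P1 P2 D1 P3 D2 D3 D4 and the even-parity equations are those given above.

```python
def hamming74_encode(data: str) -> str:
    """Encode D1D2D3D4 into the 7-bit codeword P1 P2 D1 P3 D2 D3 D4 (even parity)."""
    d1, d2, d3, d4 = (int(c) for c in data)
    p1 = d1 ^ d2 ^ d4      # check A covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4      # check B covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4      # check C covers positions 4, 5, 6, 7
    return "".join(str(b) for b in (p1, p2, d1, p3, d2, d3, d4))

print(hamming74_encode("1001"))   # 0011001, as in Figure 7

# All 16 codewords, in the order D1D2D3D4 = 0000 ... 1111.
all_codewords = [hamming74_encode(format(i, "04b")) for i in range(16)]
```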
At the receiver, the same parity checks are performed using the three check equations:
A = B1 ⊕ B3 ⊕ B5 ⊕ B7 = 0;
B = B2 ⊕ B3 ⊕ B6 ⊕ B7 = 0;
C = B4 ⊕ B5 ⊕ B6 ⊕ B7 = 0.
If all three check equations hold, that is, the right-hand side of each is 0, there is no error. If an equation does not hold, its right-hand side is not 0 and an error has occurred. The values on the right-hand sides of the three equations tell us which bit is wrong. For example, if bit 3 is inverted, then C = 0 (this equation does not contain B3) and A = B = 1 (both of these equations contain B3). These values form a binary number CBA, with A as the least significant bit, and the error position is given directly by CBA = 011, that is, bit 3. Similarly, if the three right-hand sides form 001, bit 1 is in error; if they form 100, bit 4 is in error.
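The receiver's procedure can be sketched as follows, assuming at most one bit is wrong. `hamming74_correct` is a hypothetical name; it returns the error position CBA (0 means no error) and the corrected codeword.

```python
def hamming74_correct(word: str):
    """Run checks A, B, C on a 7-bit word and correct a single-bit error."""
    b = [0] + [int(c) for c in word]          # b[1]..b[7] match B1..B7
    a_chk = b[1] ^ b[3] ^ b[5] ^ b[7]
    b_chk = b[2] ^ b[3] ^ b[6] ^ b[7]
    c_chk = b[4] ^ b[5] ^ b[6] ^ b[7]
    position = c_chk * 4 + b_chk * 2 + a_chk  # binary number CBA, A least significant
    if position:
        b[position] ^= 1                      # invert the erroneous bit
    return position, "".join(str(x) for x in b[1:])

print(hamming74_correct("0011001"))   # (0, '0011001'): no error
print(hamming74_correct("0001001"))   # (3, '0011001'): bit 3 was inverted
```

Note that two simultaneous errors would also give a nonzero CBA, and the "correction" would then be wrong, which is why a distance-3 code cannot correct one error and detect two at the same time.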
The code distance of the Hamming code is 3, so it can correct 1 error. The code distance of a parity code is 2, so it can only detect 1 error and cannot correct it (it does not know which bit is wrong). Without any check bits the code distance is 1: after any single bit error the result is still a legal code, so no error can be detected.
This is the classic statement: the Hamming code has code distance 3 and can detect 2 errors, or correct 1 error. The relation 2^k - 1 ≥ m + k must be satisfied.
However, "Computer Organization and Structure", edited by Wang Aiying of Tsinghua University (a book that has become a domestic authority), describes a Hamming code with code distance 4 that can detect 2 errors and correct 1 error. There the relation 2^(k-1) ≥ m + k must be satisfied.
Because Wang Aiying's book does not separate the two concepts carefully (especially for the Hamming code with code distance 3), the transition is very abrupt. Some books copied it without digesting it, so the concepts became confused. For the usual Hamming code with code distance 3, the statement should be "detects 2 errors, or corrects 1 error", not "detects 2 errors and corrects 1 error". A similar mistake has appeared in exam questions.
Fourth, cyclic redundancy check code
Cyclic redundancy check (CRC) codes are widely used in serial transfer (disks, communication). A CRC also appends several check bits to the information code to increase the code distance and the error detection and correction capability of the whole coding system.
The theory of CRC is very complicated. Most books only introduce how to compute the check code once the generator polynomial has been chosen. The error detection capability depends on the generator polynomial, and the conclusions given in the books can only be memorized.
The basic principle of the cyclic redundancy check (CRC) code is: r check bits are appended after the k information bits, so the whole code is n bits long; such a code is therefore also called an (n, k) code. For a given (n, k) code, it can be proved that there exists a polynomial G(x) whose highest power is n - k = r. The check code for the k information bits can be generated from G(x), and G(x) is called the generator polynomial of this CRC code.
The check code is generated as follows: let the information to be sent be represented by the polynomial C(x). Shift C(x) left by r bits, which can be written as C(x) * 2^r; this leaves r empty bits on the right of C(x), which is where the check code goes. The remainder obtained by dividing C(x) * 2^r by the generator polynomial G(x) is the check code.
Several basic concepts
1. Polynomials and binary numbers
Polynomials and binary numbers correspond directly: the highest power of x corresponds to the most significant bit of the binary number, and each lower bit corresponds to the next lower power of the polynomial. A power that is present corresponds to 1, and a power that is absent corresponds to 0. It follows that a polynomial whose highest power of x is r converts to a binary number with r + 1 bits.
The polynomials involved are the generator polynomial G(x) and the information polynomial C(x).
For example, the generator polynomial G(x) = x^4 + x^3 + x + 1 converts to the binary number 11011.
The information bits 1111 to be sent convert to the data polynomial C(x) = x^3 + x^2 + x + 1.
2. The generator polynomial is agreed on by the receiver and the sender. It is a binary number that stays unchanged throughout the transmission.
At the sender, the information polynomial is divided modulo 2 by the generator polynomial to produce the check code. At the receiver, the received encoded polynomial is divided modulo 2 by the generator polynomial to detect errors and determine the error position.
The generator polynomial should satisfy the following conditions:
A. The highest and lowest bits of the generator polynomial must be 1.
B. When any single bit of the transmitted information (CRC code) is in error, the remainder after division by the generator polynomial must not be 0.
C. When different bits are in error, the remainders must be different.
D. When the division is continued on the remainder, the remainders must cycle.
Expressing these requirements as mathematical relations is fairly complicated. However, commonly used generator polynomials can be looked up; some of them are shown in Figure 9:
N       K       Code distance d    G(x) polynomial                              G(x) binary
7       4       3                  x^3 + x^2 + 1                                1101
7       3       4                  x^4 + x^3 + x^2 + 1                          11101
7       3       4                  x^4 + x^2 + x + 1                            10111
15      11      3                  x^4 + x + 1                                  10011
15      7       5                  x^8 + x^7 + x^6 + x^4 + 1                    111010001
31      26      3                  x^5 + x^2 + 1                                100101
31      21      5                  x^10 + x^9 + x^8 + x^6 + x^5 + x^3 + 1       11101101001
63      57      3                  x^6 + x + 1                                  1000011
63      51      5                  x^12 + x^10 + x^5 + x^4 + x^2 + 1            1010000110101
1041    1024                       x^16 + x^15 + x^2 + 1                        11000000000000101
Figure 9 Commonly used generator polynomials
3. Modulo-2 division
Modulo-2 division is similar to arithmetic division, but the result for each bit is not affected by the other bits: subtraction does not borrow from higher bits, so it is really an XOR. After each subtraction, shift one place and do the next modulo-2 subtraction. The steps are as follows:
a. Modulo-2 subtract the divisor from the leading bits of the dividend, without borrowing.
b. Shift the divisor one bit to the right. If the highest bit of the remainder is 1, the quotient bit is 1 and the divisor is modulo-2 subtracted from the remainder. If the highest bit of the remainder is 0, the quotient bit is 0 and the divisor continues to shift right.
c. When the remainder has fewer bits than the divisor, it is the final remainder.
[Example] 1111000 divided by 1101:
          1011       quotient
       ---------
1101 ) 1111000       dividend
       1101          divisor
       ----
         1000
         1101
         ----
          1010
          1101
          ----
           111       remainder
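The long division above can be reproduced with a short routine. This is only a sketch: `mod2_divmod` is a hypothetical helper that works on bit strings and assumes the divisor starts with 1 and has at least two bits.

```python
def mod2_divmod(dividend: str, divisor: str):
    """Modulo-2 (XOR, no borrow) long division; returns (quotient, remainder) as bit strings."""
    d = int(divisor, 2)
    r = len(divisor) - 1                  # width of the remainder
    rem = 0
    quotient_bits = []
    for bit in dividend:
        rem = (rem << 1) | int(bit)       # bring down the next bit of the dividend
        if rem >> r:                      # highest bit is 1: quotient bit 1, subtract divisor
            rem ^= d
            quotient_bits.append("1")
        else:                             # highest bit is 0: quotient bit 0, just shift
            quotient_bits.append("0")
    quotient = "".join(quotient_bits).lstrip("0") or "0"
    return quotient, format(rem, f"0{r}b")

print(mod2_divmod("1111000", "1101"))     # ('1011', '111')
```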
CRC code generation steps
1. Convert the generator polynomial G(x), whose highest power of x is r, into the corresponding (r + 1)-bit binary number.
2. Shift the information code left by r bits, which corresponds to the information polynomial C(x) * 2^r.
3. Divide the shifted information code by the generator polynomial (binary number) modulo 2 to obtain the r-bit remainder.
4. Place the remainder in the bits left empty by the shift to obtain the complete CRC code.
[Example] Suppose the generator polynomial is G(x) = x^3 + x + 1 and the 4-bit original message is 1010. Find the encoded message.
Solution:
1. Convert the generator polynomial G(x) = x^3 + x + 1 into the corresponding binary divisor 1011.
2. The generator polynomial has 4 bits (r + 1), so shift the original message C(x) left by 3 (r) bits to get 1010000.
3. Divide the left-shifted message by the binary number of the generator polynomial, modulo 2:
          1001       quotient
       ---------
1011 ) 1010000
       1011          divisor
       ----
          1000
          1011
          ----
           011       remainder (check bits)
4. Encoded message (CRC code):
       1010000
     +     011
     ---------
       1010011
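The four generation steps can be sketched in Python as below. The names `mod2_remainder` and `crc_encode` are illustrative; the generator is passed as its binary string, for example "1011" for x^3 + x + 1.

```python
def mod2_remainder(bits: str, generator: str) -> str:
    """Remainder of dividing a bit string by the generator, modulo 2."""
    g = int(generator, 2)
    r = len(generator) - 1
    rem = 0
    for bit in bits:
        rem = (rem << 1) | int(bit)
        if rem >> r:                  # leading bit set: XOR away the generator
            rem ^= g
    return format(rem, f"0{r}b")

def crc_encode(message: str, generator: str) -> str:
    """Shift the message left by r bits, divide, and fill the empty bits with the remainder."""
    r = len(generator) - 1
    return message + mod2_remainder(message + "0" * r, generator)

codeword = crc_encode("1010", "1011")
print(codeword)                               # 1010011
print(mod2_remainder(codeword, "1011"))       # 000: a valid CRC code divides evenly
```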
CRC decoding and error correction
After receiving a CRC code, the receiver divides it modulo 2 by the generator polynomial. If the remainder is 0, the codeword is correct. If a bit is in error, the remainder is not 0, and errors in different bits give different remainders. It can be proved that the correspondence between the remainder and the erroneous bit depends only on the code system and the generator polynomial, and is independent of the codeword (the information bits). Figure 10 shows the error patterns for G(x) = 1011 and C(x) = 1010. Changing C(x) (the codeword) only changes the codewords in the table; it does not change the correspondence between the remainder and the erroneous bit.
              CRC codeword                          Remainder    Erroneous bit
              A7   A6   A5   A4   A3   A2   A1
Correct       1    0    1    0    0    1    1      000          none
In error      1    0    1    0    0    1    0      001          1
              1    0    1    0    0    0    1      010          2
              1    0    1    0    1    1    1      100          3
              1    0    1    1    0    1    1      011          4
              1    0    0    0    0    1    1      110          5
              1    1    1    0    0    1    1      111          6
              0    0    1    0    0    1    1      101          7
Figure 10 Error patterns of the (7, 4) CRC code (G(x) = 1011)
If one bit of a cyclic code is in error, dividing by G(x) modulo 2 gives a nonzero remainder. If we append a 0 to the remainder and continue the division, we find an interesting result: the remainders cycle through the sequence in Figure 10. For example, if bit 1 is in error, the remainder is 001; appending a 0 and dividing again gives 010, and then 100, 011, ..., repeating in a cycle. This is the origin of the name "cyclic code". It is a valuable property. If, while we keep dividing the remainder modulo 2, we also rotate the codeword being checked to the left, Figure 10 shows that when the remainder 101 appears, the erroneous bit has moved to position A7. It can then be corrected by an XOR gate as it is fed back into A1. In this way we do not need a decoding circuit that corrects each bit separately, as the Hamming check does. As the number of bits grows, the cyclic code check reduces hardware cost effectively, which is the main reason it is so widely used.
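Both properties can be checked numerically: the remainder depends only on the erroneous bit position, and repeatedly appending a 0 and dividing again walks through the remainders in order. This is a sketch with hypothetical helper names; the second codeword is derived from the information bits 1100, an arbitrary choice for comparison.

```python
def mod2_remainder(bits: str, generator: str) -> str:
    """Remainder of dividing a bit string by the generator, modulo 2."""
    g = int(generator, 2)
    r = len(generator) - 1
    rem = 0
    for bit in bits:
        rem = (rem << 1) | int(bit)
        if rem >> r:
            rem ^= g
    return format(rem, f"0{r}b")

def flip(codeword: str, position: int) -> str:
    """Invert bit A<position>, where A1 is the rightmost bit."""
    i = len(codeword) - position
    return codeword[:i] + ("0" if codeword[i] == "1" else "1") + codeword[i + 1:]

GENERATOR = "1011"
first = "1010011"                                        # CRC code of 1010
second = "1100" + mod2_remainder("1100000", GENERATOR)   # CRC code of 1100

error_table = {p: mod2_remainder(flip(first, p), GENERATOR) for p in range(1, 8)}
other_table = {p: mod2_remainder(flip(second, p), GENERATOR) for p in range(1, 8)}

# Keep dividing: append a 0 to the remainder and take the remainder again.
cycle = ["001"]
for _ in range(7):
    cycle.append(mod2_remainder(cycle[-1] + "0", GENERATOR))
print(cycle)   # ['001', '010', '100', '011', '110', '111', '101', '001']
```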
CRCs commonly used in communication and networks
In data communication and networks, k is usually quite large: a frame is built from a thousand or even several thousand data bits, and a CRC code is then used to generate r check bits. Such a code is used only to detect errors, not to correct them. Usually r = 16 is chosen; the standard 16-bit generator polynomials are CRC-16 = x^16 + x^15 + x^2 + 1 and CRC-CCITT = x^16 + x^12 + x^5 + 1.
In general, a CRC code generated by a suitably chosen generator polynomial with r check bits detects all double-bit errors, all errors in an odd number of bits, all burst errors of length less than or equal to r, a fraction (1 - 2^-(r-1)) of the burst errors of length r + 1, and a fraction (1 - 2^-r) of the burst errors longer than r + 1. For example, with r = 16 as above, it detects all burst errors of length less than or equal to 16, 99.997% of the bursts of length 17, and 99.998% of the bursts longer than 17. So the error detection capability of the CRC code is very strong. Here a burst error means a run of consecutive bits in error, and the burst length is the number of bits from the first erroneous bit to the last (the bits in between are not necessarily all wrong).
[Example 1] A cyclic redundancy check (CRC) code uses the generator polynomial G(x) = x^3 + x^2 + 1. The redundant bits generated with it are appended after the information bits to form the CRC code. If the information bits sent are 1111 and 1100, their CRC codes are _A_ and _B_ respectively. Because of some fault, the receiver receives CRC codes that can be judged to be in error according to a certain rule, for example the codewords _C_, _D_ and _E_. (1998 exam, question 11)
Choices:
A: (1) 1111100  (2) 1111101  (3) 1111110  (4) 1111111
B: (1) 1100100  (2) 1100101  (3) 1100110  (4) 1100111
C ~ E: (1) 0000000  (2) 0001100  (3) 0010111  (4) 0011010  (5) 1000110  (6) 1001111  (7) 1010001  (8) 1011000
Solution:
A: G(x) = 1101, C(x) = 1111; C(x) * 2^3 ÷ G(x) = 1111000 ÷ 1101 = 1011, remainder 111
The resulting CRC code is (4) 1111111
B: G(x) = 1101, C(x) = 1100; C(x) * 2^3 ÷ G(x) = 1100000 ÷ 1101 = 1001, remainder 101
The resulting CRC code is (2) 1100101
C ~ E:
Divide each of (1) to (8) by G(x) = 1101 modulo 2:
(1) 0000000 ÷ 1101, remainder 000    (2) 0001100 ÷ 1101, remainder 001
(3) 0010111 ÷ 1101, remainder 000    (4) 0011010 ÷ 1101, remainder 000
(5) 1000110 ÷ 1101, remainder 000    (6) 1001111 ÷ 1101, remainder 100
(7) 1010001 ÷ 1101, remainder 000    (8) 1011000 ÷ 1101, remainder 100
So the answers for _C_, _D_ and _E_ are (2), (6) and (8).
[Example 2] One kind of check code commonly used in computers is the CRC, namely the _A_ code. The _B_ operation is used in the encoding process. Assuming the generator polynomial is G(x) = x^4 + x^3 + x + 1 and the original message is 11001010101, the encoded message is _C_. Of the statements about the CRC code, _D_ is correct.
A code often used in radio communication specifies that the codeword length is 7 bits and that every codeword always contains exactly 3 "1"s. The coding efficiency of this code is _E_.
Choices:
A: (1) horizontal and vertical parity  (2) cyclic summation  (3) cyclic redundancy  (4) positive ratio
B: (1) modulo-2 division  (2) fixed-point binary division  (3) binary-coded decimal division  (4) cyclic shift method
C: (1) 1100101010111  (2) 110010101010011
(3) 110010101011100  (4) 110010101010101
D: (1) can correct one error  (2) can detect all even-bit errors
(3) can detect all burst errors shorter than the check bit length  (4) can detect all burst errors whose length is less than or equal to the check bit length
E: (1) 3/7  (2) 4/7  (3) log2 3 / log2 7  (4) (log2 35) / 7
Solution: From the earlier discussion of CRCs we obtain:
A: (3) cyclic redundancy. B: (1) modulo-2 division.
C: G(x) = 11011, C(x) = 11001010101; C(x) * 2^4 ÷ G(x) = 110010101010000 ÷ 11011 gives the remainder 0011
The resulting CRC code is (2) 110010101010011
D: From the earlier discussion of CRCs commonly used in communication and networks: (4) can detect all burst errors whose length is less than or equal to the check bit length.
E: The constant-ratio code, also called the constant-weight code, is a generalization of parity. In a constant-ratio code the odd or even property is preserved, and in addition the total number of 1s in every codeword is fixed. A constant-ratio code may need more than one additional check bit, but it provides more error detection capability than a single parity bit.
The so-called 3-out-of-7 constant-ratio code means that the whole codeword is 7 bits long and the number of 1 bits is fixed at 3. Of all 128 7-bit codes (0000000 ~ 1111111), only those containing exactly three 1s are legal codewords. The number of legal codewords follows from the combination formula: C(7,3) = 7! / (3! * (7-3)!) = 7 * 6 * 5 / (1 * 2 * 3) = 35
Coding efficiency = bits needed to represent the legal codewords / total bits in a codeword = (log2 35) / 7, which is choice (4).
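The count of legal codewords and the resulting efficiency can be checked with a few lines of Python; this is just the arithmetic above, with illustrative variable names.

```python
from itertools import product
from math import comb, log2

# Enumerate all 7-bit words and keep those with exactly three 1s.
legal = ["".join(bits) for bits in product("01", repeat=7) if bits.count("1") == 3]

print(len(legal), comb(7, 3))   # 35 35
efficiency = log2(len(legal)) / 7
print(round(efficiency, 4))     # 0.7328
```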