1.3 Representation and storage of information in the computer
The basic function of a computer is to operate on and process data. There are two types of data: numerical data, such as 3.1416 and -2.71828, and non-numerical data (information), such as A, B, +, and =. Whatever its type, data is represented in the computer in binary form. The computer holds only binary values: all symbols are represented by binary codes, and the signs of positive and negative numbers are also represented by binary codes. The highest bit of a value is the sign bit, with "0" indicating a positive number and "1" indicating a negative number. The representation of a number (together with its sign) in the computer is called a machine number. Numerical processing uses binary arithmetic and non-numerical processing uses binary encoding, which has the advantages of simple operation, convenient circuit implementation, and low cost.
1.3.1 Positional number systems
A positional number system consists of a set of digit symbols and two basic factors, the base and the positional weight:
● Digits: the set of symbols used to represent values in a number system, such as 1, 2, 3, A, B.
● Base: the number of digit symbols used in a number system, denoted R. Such a system is called base R, and its carry rule is "carry 1 on every R". For example, the decimal base is 10, and a carry occurs on every 10.
● Weight: a digit represents different values in different positions. In a given number system, the value a digit represents equals the digit multiplied by a fixed constant determined by its position; this constant is called the "positional weight". For example, the weight of the units position in decimal is 1, and the weight of the hundreds position is 100.
First, decimal
A decimal number is written with 10 different digits: 0, 1, ..., 8, 9. Because it has 10 digits, its base is 10. A digit represents a different value in each position; for example, the 4 in 3468.795 represents 4 × 10^2 = 400. Here 10^n is called the positional weight, or "weight" for short, and a decimal number can be expanded as a polynomial in these weights. For example:
3468.795 = 3 × 10^3 + 4 × 10^2 + 6 × 10^1 + 8 × 10^0 + 7 × 10^{-1} + 9 × 10^{-2} + 5 × 10^{-3}
The carry rule for decimal numbers is: carry 1 on every 10.
Second, binary
Data in the computer is stored in binary form. Binary numbers are written with the two digits 0 and 1. The binary base is 2, the weights are 2^n, and the carry rule is: carry 1 on every 2.
A binary number can likewise be expanded as a polynomial in its weights. For example:
(10110.101)_2 = 1 × 2^4 + 0 × 2^3 + 1 × 2^2 + 1 × 2^1 + 0 × 2^0 + 1 × 2^{-1} + 0 × 2^{-2} + 1 × 2^{-3}
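The weighted expansions above can be checked mechanically. The following Python sketch is illustrative only: the names `weighted_terms` and `value_of` are assumptions introduced here, not part of the text. It lists each digit with the exponent of its positional weight and sums the terms with exact fractions, so no rounding error enters.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"

def weighted_terms(text, base):
    """Return (digit_value, exponent) pairs for a number such as '10110.101'."""
    integer, _, fraction = text.upper().partition(".")
    terms = []
    # Integer digits carry weights base^(n-1) down to base^0, left to right.
    for i, ch in enumerate(integer):
        terms.append((DIGITS.index(ch), len(integer) - 1 - i))
    # Fraction digits carry weights base^-1, base^-2, ...
    for i, ch in enumerate(fraction, start=1):
        terms.append((DIGITS.index(ch), -i))
    return terms

def value_of(text, base):
    """Sum digit * base**exponent exactly (hypothetical helper)."""
    return sum(d * Fraction(base) ** e for d, e in weighted_terms(text, base))

print(weighted_terms("10110.101", 2))
print(float(value_of("10110.101", 2)))                    # 22.625
print(value_of("3468.795", 10) == Fraction("3468.795"))   # True
```

The sketch does not check that every digit is smaller than the base; a production version would.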
Third, octal and hexadecimal
Octal numbers are written with the digits 0, 1, ..., 6, 7. The octal base is 8, the weights are 8^n, and the carry rule is: carry 1 on every 8.
Hexadecimal numbers are written with the digits 0, 1, ..., 9, A, B, C, D, E, F. The hexadecimal base is 16, the weights are 16^n, and the carry rule is: carry 1 on every 16.
Here the symbol A corresponds to decimal 10, B to 11, ..., and F to 15.
When writing numbers, the following three formats are available:
First format: 11101101(2), 331(8), 35.81(10), FA5(16)
Second format: (10110.011)_2, (755)_8, (139)_10, (AD6)_16
Third format: 10101.001B, 761O, 3762D, 2CE6H
The letters B, O, D, and H stand for binary, octal, decimal, and hexadecimal, respectively.
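Programming languages use similar conventions for marking the base of a number. As a loose analogy, Python writes the base as a prefix (`0b`, `0o`, `0x`) rather than a suffix, and its built-in `int(text, base)` parses a digit string in a stated base. The helper `parse_suffixed` below is a hypothetical function written for this illustration; it reads integers in the third (letter suffix) format.

```python
# The same value written in four bases, using Python's literal prefixes.
values = [0b11101101, 0o355, 237, 0xED]
print(values)                    # [237, 237, 237, 237]

# int(text, base) parses a digit string in a stated base,
# much like the "(digits)base" notation.
print(int("755", 8))             # 493
print(int("AD6", 16))            # 2774

def parse_suffixed(text):
    """Parse an integer written with a trailing B, O, D, or H (hypothetical helper)."""
    bases = {"B": 2, "O": 8, "D": 10, "H": 16}
    return int(text[:-1], bases[text[-1].upper()])

print(parse_suffixed("2CE6H"))   # 11494
```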
1.3.2 Conversion between number systems
First, converting binary, octal, and hexadecimal numbers to decimal
A number in any of these systems is converted to decimal by expanding it according to its positional weights and summing the terms.
Example 1.1 Convert the binary number (1011.101)_2 to the equivalent decimal number.
(1011.101)_2 = 1 × 2^3 + 0 × 2^2 + 1 × 2^1 + 1 × 2^0 + 1 × 2^{-1} + 0 × 2^{-2} + 1 × 2^{-3}
= 8 + 0 + 2 + 1 + 1/2 + 0 + 1/8
= (11.625)_10
Octal and hexadecimal numbers are converted to decimal in the same way, by positional expansion.
Example 1.2 Convert (2576)_8 and (3D.B)_16 to decimal numbers.
(2576)_8 = 2 × 8^3 + 5 × 8^2 + 7 × 8^1 + 6 × 8^0 = (1406)_10
(3D.B)_16 = 3 × 16^1 + 13 × 16^0 + 11 × 16^{-1} = (61.6875)_10
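Examples 1.1 and 1.2 can be reproduced with a few lines of Python. This is a minimal sketch, not a library routine: `to_decimal` is a name chosen here, and it relies on the observation that the fraction digits, read as an integer, only need to be divided by the base raised to their count.

```python
from fractions import Fraction

def to_decimal(text, base):
    """Convert a digit string such as '3D.B' in the given base to an exact Fraction."""
    integer, _, fraction = text.partition(".")
    value = Fraction(int(integer, base)) if integer else Fraction(0)
    if fraction:
        # The fraction digits, read as an integer, are scaled by base^-len.
        value += Fraction(int(fraction, base), base ** len(fraction))
    return value

for text, base in [("1011.101", 2), ("2576", 8), ("3D.B", 16)]:
    print(text, "in base", base, "=", float(to_decimal(text, base)))
```

The three printed values are 11.625, 1406.0, and 61.6875, matching the worked examples.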
Second, converting decimal numbers to binary
The integer part and the fractional part of a decimal number are converted by different procedures, and the two results are then combined.
1. Converting a decimal integer to binary (divide by 2 and take the remainder)
Method: divide repeatedly by 2. Each remainder is one digit of the binary integer, obtained from the lowest digit to the highest, and the process stops when the quotient is 0.
2. Converting a pure decimal fraction to binary (multiply by 2 and take the integer part)
Method: multiply repeatedly by 2. The integer part of each product is one digit of the binary fraction, obtained from the highest digit to the lowest, and only the fractional part is carried into the next multiplication.
Example 1.3 Convert the decimal number 69.8125 to a binary number.
Figure 1.4 Converting a decimal number to binary
Converting the integer part 69, as shown in Figure 1.4, gives: (69)_10 = (1000101)_2
Converting the fraction 0.8125, as shown in Figure 1.4, gives: (0.8125)_10 = (0.1101)_2
Therefore, 69.8125D = 1000101.1101B
Decimal numbers can be converted to octal and hexadecimal by the same method, dividing and multiplying by 8 or 16 instead of 2.
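The two procedures above (repeated division for the integer part, repeated multiplication for the fraction) can be sketched in Python as follows. The function names and the `max_digits` cutoff are assumptions made for this illustration; the cutoff is needed because many decimal fractions, such as 0.1, never terminate in binary.

```python
from fractions import Fraction

DIGITS = "0123456789ABCDEF"

def integer_to_base(n, base=2):
    """Divide repeatedly by the base; remainders are the digits, lowest first."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))

def fraction_to_base(frac, base=2, max_digits=16):
    """Multiply repeatedly by the base; integer parts are the digits, highest first."""
    digits = []
    while frac != 0 and len(digits) < max_digits:
        frac *= base
        whole = int(frac)            # integer part of the product
        digits.append(DIGITS[whole])
        frac -= whole                # keep only the fractional part
    return "".join(digits)

def decimal_to_base(text, base=2):
    """Convert a decimal string such as '69.8125' to the given base."""
    integer, _, fraction = text.partition(".")
    result = integer_to_base(int(integer), base)
    frac_digits = fraction_to_base(Fraction("0." + fraction), base) if fraction else ""
    return result + ("." + frac_digits if frac_digits else "")

print(decimal_to_base("69.8125"))    # 1000101.1101
print(decimal_to_base("673", 8))     # 1241
```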
3. Conversion between binary and octal numbers
(1) Converting binary to octal
To convert a binary number to octal, group its digits in threes starting from the binary point, moving left through the integer part and right through the fractional part, then convert each 3-bit group to one octal digit and write the digits in order. A group with fewer than 3 bits is padded with 0s to 3 bits (leading 0s in the integer part, trailing 0s in the fractional part), which does not change the value.
Example 1.4 Convert the binary number (11110010.1110011)_2 to octal.
Binary, in 3-bit groups: 011 110 010 . 111 001 100
Octal digits:             3   6   2  .  7   1   4
(11110010.1110011)_2 = (362.714)_8
(2) Converting octal to binary
To convert an octal number to binary, write each octal digit as the corresponding 3-bit binary number and arrange the groups in order.
Example 1.5 Convert the octal number (2376.14)_8 to binary.
Octal digits:        2   3   7   6  .  1   4
3-bit binary groups: 010 011 111 110 . 001 100
(2376.14)_8 = (10011111110.0011)_2
4. Conversion between binary and hexadecimal numbers
Conversion between binary and hexadecimal follows the same method as conversion between binary and octal, except that each group of 4 binary digits corresponds to 1 hexadecimal digit. To convert hexadecimal to binary, write each hexadecimal digit as the corresponding 4-bit binary number and arrange the groups in order.
Example 1.6 Convert the binary number (110101011101001.011)_2 to hexadecimal.
Binary, in 4-bit groups: 0110 1010 1110 1001 . 0110
Hexadecimal digits:      6    A    E    9    . 6
(110101011101001.011)_2 = (6AE9.6)_16
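The grouping method used for both octal (groups of 3 bits) and hexadecimal (groups of 4 bits) can be written once with the group size as a parameter. The Python sketch below is one possible formulation; the function names `regroup_from_binary` and `expand_to_binary` are invented for this illustration.

```python
def regroup_from_binary(text, group):
    """Convert a binary string to base 2**group (group=3 for octal, 4 for hexadecimal)."""
    integer, _, fraction = text.partition(".")
    # Pad with zeros so both parts split evenly into groups; the value is unchanged.
    integer = integer.zfill((len(integer) + group - 1) // group * group)
    fraction = fraction.ljust((len(fraction) + group - 1) // group * group, "0")

    def digits(bits):
        return "".join(format(int(bits[i:i + group], 2), "X")
                       for i in range(0, len(bits), group))

    return (digits(integer) or "0") + ("." + digits(fraction) if fraction else "")

def expand_to_binary(text, group):
    """Write each digit of a base 2**group number as `group` bits, then trim padding."""
    integer, _, fraction = text.partition(".")
    width = "0{}b".format(group)
    int_bits = "".join(format(int(d, 16), width) for d in integer).lstrip("0") or "0"
    frac_bits = "".join(format(int(d, 16), width) for d in fraction).rstrip("0")
    return int_bits + ("." + frac_bits if frac_bits else "")

print(regroup_from_binary("11110010.1110011", 3))      # 362.714
print(expand_to_binary("2376.14", 3))                  # 10011111110.0011
print(regroup_from_binary("110101011101001.011", 4))   # 6AE9.6
```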
As these examples show, conversion between binary and octal, and between binary and hexadecimal, is very direct. Therefore, to convert a decimal number to binary, one can first convert it to octal or hexadecimal and then quickly convert the result to binary.
Similarly, to convert a decimal number to octal or hexadecimal, one can first convert it to binary and then convert the result to octal or hexadecimal. Table 1.1 compares the common number systems.
For example, to convert the decimal number 673 to binary, first convert it to octal (by dividing by 8 and taking remainders) to obtain 1241, then write each octal digit as 3 binary digits to obtain 1010100001B. For a further conversion to hexadecimal, grouping the binary digits in fours quickly gives 2A1H.
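The 673 example can be confirmed with Python's built-in `format`, which renders an integer in binary (`"b"`), octal (`"o"`), or hexadecimal (`"X"`). The short check below also expands the octal digits into 3-bit groups to show that the shortcut gives the same binary string.

```python
n = 673
octal = format(n, "o")       # '1241'
binary = format(n, "b")      # '1010100001'
hexa = format(n, "X")        # '2A1'

# Each octal digit expands to 3 bits; stripping leading zeros gives the binary form.
from_octal = "".join(format(int(d), "03b") for d in octal).lstrip("0")
print(octal, binary, hexa, from_octal == binary)
```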
Table 1.1 Comparison of common number systems
Decimal   Binary   Octal   Hexadecimal
0         0        0       0
1         1        1       1
2         10       2       2
3         11       3       3
4         100      4       4
5         101      5       5
6         110      6       6
7         111      7       7
8         1000     10      8
9         1001     11      9
10        1010     12      A
11        1011     13      B
12        1100     14      C
13        1101     15      D
14        1110     16      E
15        1111     17      F
16        10000    20      10
...       ...      ...     ...
1.3.3 Operations on binary numbers
In the computer, binary numbers can undergo arithmetic operations and logical operations.
First, arithmetic operations
Addition: 0 + 0 = 0   1 + 0 = 0 + 1 = 1   1 + 1 = 10
Subtraction: 0 - 0 = 0   10 - 1 = 1   1 - 0 = 1   1 - 1 = 0
Multiplication: 0 × 0 = 0   0 × 1 = 1 × 0 = 0   1 × 1 = 1
Division: 0 ÷ 1 = 0   1 ÷ 1 = 1
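The single-bit addition rules extend to multi-bit numbers by working column by column from the right and carrying into the next column, just as in decimal. The following Python sketch (the function name `add_binary` is an assumption of this example) adds two binary strings that way.

```python
def add_binary(a, b):
    """Add two binary strings column by column using the single-bit rules."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry, result = 0, []
    for x, y in zip(reversed(a), reversed(b)):
        total = int(x) + int(y) + carry   # 0, 1, 2, or 3
        result.append(str(total % 2))     # digit written in this column
        carry = total // 2                # 1 + 1 = 10: write 0, carry 1
    if carry:
        result.append("1")
    return "".join(reversed(result))

print(add_binary("1", "1"))        # 10
print(add_binary("1011", "110"))   # 10001
```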
Second, logical operations
1. OR: "∨", "+"
0 ∨ 0 = 0   0 ∨ 1 = 1   1 ∨ 0 = 1   1 ∨ 1 = 1
In the OR operation, the result is 1 when at least one of the two logical values is 1, and 0 otherwise.
Example 1.7 To determine whether score X is below 60 or score Y is above 95, write: (X < 60) ∨ (Y > 95). If X = 70 and Y = 88, neither X < 60 nor Y > 95 is satisfied, both expressions evaluate to 0, and the result of the "∨" operation is 0.
If X is less than 60, the expression X < 60 is satisfied and evaluates to 1, so the result of "∨" is 1 regardless of Y.
Similarly, as long as Y is greater than 95, the result of "∨" is 1 regardless of the value of X.
If X = 50 and Y = 98, then X < 60 is satisfied (1) and Y > 95 is also satisfied (1), so the result of "∨" is 1.
2. AND: "∧", "·"
0 ∧ 0 = 0   0 ∧ 1 = 0   1 ∧ 0 = 0   1 ∧ 1 = 1
In the AND operation, the result is 1 only when both logical values are 1, and 0 otherwise.
Example 1.8 The quality parameter of a qualified product must lie between 205 and 380. Whether a product's quality parameter X is qualified can be expressed as: (X > 205) ∧ (X < 380)
When the value of X is outside this interval, at least one of the conditions X > 205 and X < 380 is not satisfied, so by the "∧" rules the result is 0 and the product is unqualified.
When the value of X is inside this interval, the conditions X > 205 and X < 380 are both satisfied (both 1), so the result of "∧" is 1 and the product is qualified.
3. NOT
In the NOT operation, the logical value of each bit is inverted.
Rule: NOT 0 = 1   NOT 1 = 0
Example 1.9
Example 1.10 If 1 is used to denote male, then NOT 1, that is 0, denotes female.
4. XOR (exclusive OR): "⊕"
0 ⊕ 0 = 0   0 ⊕ 1 = 1   1 ⊕ 0 = 1   1 ⊕ 1 = 0
In the XOR operation, the result is 1 when the two logical values differ, and 0 otherwise.
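The four logical operations map directly onto Python's bitwise operators `|`, `&`, and `^` when applied to the integers 0 and 1. The sketch below prints the truth tables and restates Examples 1.7 and 1.8 as small functions; the function names are hypothetical and chosen only for this illustration.

```python
# Truth tables for OR, AND, and XOR on single bits, plus NOT.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "OR", a | b, "AND", a & b, "XOR", a ^ b)
print("NOT 0 =", 1 - 0, " NOT 1 =", 1 - 1)

def or_example(x, y):
    """Example 1.7: (X < 60) OR (Y > 95)."""
    return int(x < 60) | int(y > 95)

def and_example(x):
    """Example 1.8: (X > 205) AND (X < 380)."""
    return int(x > 205) & int(x < 380)

def invert_bits(bits):
    """NOT applied to each bit of a binary string."""
    return "".join("1" if b == "0" else "0" for b in bits)

print(or_example(70, 88), or_example(50, 98))   # 0 1
print(and_example(300), and_example(400))       # 1 0
print(invert_bits("10110"))                     # 01001
```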
1.3.4 Representation of non-numerical information
First, ASCII code
ASCII stands for American Standard Code for Information Interchange. An ASCII code occupies one byte and comes in two forms, 7-bit and 8-bit: the 7-bit code is called standard ASCII and the 8-bit code is called extended ASCII. Seven binary digits give 128 different combinations, representing 128 different characters. Of these, 95 characters can be displayed, including uppercase and lowercase English letters, digits, arithmetic symbols, and punctuation. The other 33 characters cannot be displayed; they are control codes, with code values 0 to 31 and 127. For example, carriage return (CR) has code 13. Table 1.2 is the ASCII character code table.
Table 1.2 ASCII character code table
b3b2b1b0 \ b6b5b4   000   001   010   011   100   101   110   111
0000                NUL   DLE   SP    0     @     P     `     p
0001                SOH   DC1   !     1     A     Q     a     q
0010                STX   DC2   "     2     B     R     b     r
0011                ETX   DC3   #     3     C     S     c     s
0100                EOT   DC4   $     4     D     T     d     t
0101                ENQ   NAK   %     5     E     U     e     u
0110                ACK   SYN   &     6     F     V     f     v
0111                BEL   ETB   '     7     G     W     g     w
1000                BS    CAN   (     8     H     X     h     x
1001                HT    EM    )     9     I     Y     i     y
1010                LF    SUB   *     :     J     Z     j     z
1011                VT    ESC   +     ;     K     [     k     {
1100                FF    FS    ,     ...
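A character's code can be read off the table by joining its column bits and row bits. The Python sketch below assumes, as the table layout suggests, that columns are indexed by the high three bits b6b5b4 and rows by the low four bits b3b2b1b0; `ascii_code` is a helper name invented here. It also counts the displayable and control characters quoted in the text.

```python
def ascii_code(column_bits, row_bits):
    """Combine the table's column bits (b6b5b4) and row bits (b3b2b1b0) into a code."""
    return (int(column_bits, 2) << 4) | int(row_bits, 2)

print(ascii_code("100", "0001"), chr(ascii_code("100", "0001")))   # 65 A
print(ord("\r"))                                                   # 13, carriage return (CR)

# Codes 32..126 are displayable; codes 0..31 and 127 are control codes.
printable = [c for c in range(128) if 32 <= c <= 126]
control = [c for c in range(128) if c < 32 or c == 127]
print(len(printable), len(control))                                # 95 33
```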
The BCD (binary-coded decimal) code represents each decimal digit with a 4-bit binary number. For example, converting the BCD code 1000 0010 0110 1001 group by group, 4 bits at a time, gives the decimal number 8269. Each bit of a 4-bit BCD group carries a weight: from left to right, high to low, the weights are 8, 4, 2, 1, so this binary-decimal code is a weighted code. The smallest 1-digit BCD code is 0000 and the largest is 1001.
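The digit-by-digit nature of BCD can be illustrated with a short Python sketch. The function names `bcd_encode` and `bcd_decode` are assumptions of this example; the decoder rejects groups above 1001, since those are not valid BCD digits.

```python
def bcd_encode(number):
    """Encode a non-negative integer as space-separated 4-bit BCD groups."""
    return " ".join(format(int(d), "04b") for d in str(number))

def bcd_decode(code):
    """Decode space-separated 4-bit BCD groups back to an integer."""
    digits = []
    for group in code.split():
        value = int(group, 2)          # weights 8, 4, 2, 1
        if value > 9:
            raise ValueError("not a valid BCD group: " + group)
        digits.append(str(value))
    return int("".join(digits))

print(bcd_decode("1000 0010 0110 1001"))   # 8269
print(bcd_encode(8269))                    # 1000 0010 0110 1001
print(format(8269, "b"))                   # pure binary, a different bit pattern
```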
1.3.5 Fundamentals of Chinese character information processing
For a computer to process Chinese character information, the input, storage, output, and code conversion of Chinese characters must all be solved.
The basic process by which a computer handles Chinese characters is shown in Figure 1.5. The Chinese character input code entered from the keyboard is converted by a code conversion program into the Chinese character internal code, which is used for storage and processing; the conversion looks up the input code in a code table. On output, a glyph retrieval program finds the glyph code for the character in the Chinese character font library, and the character is output on the display or printer according to that glyph code.
Figure 1.5 Chinese character processing
First, Chinese character encoding
As mentioned earlier, the computer uses the internationally adopted ASCII code to encode letters and symbols. Standard ASCII uses a 7-bit binary code; a character is stored in one byte whose highest bit is 0, so 128 characters can be represented. Chinese characters likewise need to be encoded.
The computer exchanges information with the user through a character set that includes Chinese characters. For the computer to process a Chinese character, it must first be converted into a code form the computer accepts; when processing is finished, the internal code must be converted back into the character's shape so that the user can read it.
1. GB2312-80
At the end of the 1970s, China recognized that a unified Chinese character code was very important to the development of Chinese information processing by computer. To keep pace with the development of computer information processing technology, the national standard "Chinese Character Coded Character Set for Information Interchange: Basic Set" was issued in 1980, with standard number GB2312-80.
GB2312-80 contains 6763 Chinese characters and 682 letters and symbols, 7445 in total. The Chinese characters are divided into two levels according to how commonly they are used. Level 1 contains 3755 characters, arranged in pinyin order, which account for 99.9% of the characters in modern texts; level 2 contains 3008 characters, arranged by radical. Level 1 and level 2 together account for more than 99.99% of cumulative usage.
The national standard code is based on the 94 displayable ASCII characters and encodes each Chinese character in two bytes, i.e. two consecutive bytes represent one Chinese character code. To distinguish it from ASCII, the highest bit of each byte is set to 1. The national standard code and ASCII belong to the same system, and the national standard code can be regarded as an extended ASCII code. The Chinese character coding currently used in China follows this standard. GB2312-80 arranges all national-standard Chinese characters and symbols in a 94 × 94 matrix. Each row of the matrix is called a "zone" and each column a "position". There are therefore 94 zones (zone numbers 01 to 94), each with 94 positions (position numbers 01 to 94).
The zone number and position number of a character, simply concatenated, form its "national standard zone-position code". Of the two consecutive bytes, the high byte is the zone number and the low byte is the position number. The distribution of Chinese characters and symbols in the 94 × 94 matrix is shown in Figure 1.6; the zone-position code ranges from 0101 to 9494. For example, position 33 of zone 01 is the symbol "×", so "×" can be entered with zone-position code 0133; position 29 of zone 41 is the Chinese character "山" (mountain), so it can be entered with zone-position code 4129.
Figure 1.6 Distribution of Chinese character codes by zone and position
2. Chinese character internal code
The code used inside a computer system to represent Chinese or Western-language information is called the machine internal code, or internal code for short. ASCII is the internal code for Western text and uses one byte. The Chinese character internal code uses two consecutive bytes, and the highest bit of each byte is 1.
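Because the highest bit distinguishes the two kinds of code, a program can walk through a stream of mixed text one byte at a time and decide whether each byte stands alone or begins a two-byte code. The Python sketch below illustrates the idea under that assumption only; `split_mixed` is a hypothetical name, the bytes C9 BD are an arbitrary sample of a two-byte code, and real decoders also validate the second byte.

```python
def split_mixed(data):
    """Split bytes into 1-byte ASCII codes and 2-byte Chinese internal codes."""
    units, i = [], 0
    while i < len(data):
        if data[i] & 0x80:            # highest bit 1: first byte of a 2-byte code
            units.append(data[i:i + 2])
            i += 2
        else:                         # highest bit 0: a single ASCII byte
            units.append(data[i:i + 1])
            i += 1
    return units

sample = b"A" + bytes([0xC9, 0xBD]) + b"B"
print([u.hex().upper() for u in split_mixed(sample)])   # ['41', 'C9BD', '42']
```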
The internal code range of GB2312-80 is A1A1H to FEFEH. The relationship between the Chinese character internal code and the zone-position code is:
Internal code high byte = zone-position code high byte + A0H
Internal code low byte = zone-position code low byte + A0H
For example, the zone-position code of "×" is 0133, so:
High byte: (01)_10 + (A0)_16 = (01)_16 + (A0)_16 = (A1)_16
Low byte: (33)_10 + (A0)_16 = (21)_16 + (A0)_16 = (C1)_16
The internal code of "×" is therefore A1C1.
As another example, the zone-position code of "山" (mountain) is 4129, so:
High byte: (41)_10 + (A0)_16 = (29)_16 + (A0)_16 = (C9)_16
Low byte: (29)_10 + (A0)_16 = (1D)_16 + (A0)_16 = (BD)_16
The internal code of "山" is therefore C9BD.
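The two worked examples follow the same arithmetic, which can be sketched in Python as below. The function names are assumptions made for this illustration. The final line cross-checks the result against Python's built-in `gb2312` codec, on the assumption that the codec implements the same standard.

```python
def zone_position_to_internal(code):
    """Convert a 4-digit zone-position code such as '4129' to internal-code bytes."""
    zone, position = int(code[:2]), int(code[2:])
    if not (1 <= zone <= 94 and 1 <= position <= 94):
        raise ValueError("zone and position must each be in 01..94")
    return bytes([zone + 0xA0, position + 0xA0])   # add A0H to each byte

def internal_to_zone_position(data):
    """Convert two internal-code bytes back to the 4-digit zone-position code."""
    return "{:02d}{:02d}".format(data[0] - 0xA0, data[1] - 0xA0)

print(zone_position_to_internal("0133").hex().upper())          # A1C1
print(zone_position_to_internal("4129").hex().upper())          # C9BD
print(internal_to_zone_position(bytes([0xC9, 0xBD])))           # 4129
print(zone_position_to_internal("4129") == "山".encode("gb2312"))   # True
```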
The internal code input method built into Windows 2000 supports zone-position code, GBK internal code, and Unicode input. The user can also enter an internal code while in zone-position input mode; that is, in zone-position mode, entering either 0133 or A1C1 produces the symbol "×".
GBK is an extension of the Chinese character internal code specification. Its purpose is to relieve bottlenecks in Chinese character information exchange, such as rarely used characters and the conversion between simplified and traditional characters across code systems. It is fully compatible with the GB2312-80 internal code system and is a step toward the international unified double-byte character set standard ISO 10646.1. Users can enter GBK Chinese characters with the Quanpin (full pinyin) input method: after switching to Quanpin input, simply type the pinyin in lowercase letters. For example, the character "俶" cannot be entered with the "Standard" or "Wubi" (five-stroke) input methods, but it can be found in Quanpin after paging through a few pages of candidates. Note that such characters cannot be displayed in every font, but they can be displayed in fonts such as Song and Hei (black).
Second, Chinese character input code
Chinese character input generally follows one of two paths: in the first, the computer recognizes Chinese characters automatically, which requires the computer to simulate human intelligence; in the second, a person recognizes the character and types its corresponding code into the computer by hand.
There are three main methods of automatic recognition:
First, handwriting recognition, in which characters are written on a tablet with a special stylus; second, speech recognition, in which characters are entered by voice; third, scanning recognition, in which characters printed or written on paper are scanned into the computer and software converts the scanned information into the corresponding internal codes. The following mainly introduces keyboard input.
A standard computer keyboard has only a few dozen keys, while there are at least several thousand Chinese characters, so entering Chinese characters from the keyboard requires that they be encoded. Since the 1980s, hundreds of Chinese character input codes have been developed, such as zone-position code, Quanpin (full pinyin), Wubi (five-stroke), Microsoft Pinyin, and Intelligent ABC; they are all external codes.
By encoding principle, Chinese character input codes fall into 4 main categories: sequence codes (with no duplicate codes), phonetic codes, shape codes, and phonetic-shape codes, which combine the sound and shape of Chinese characters.
A sequence code numbers all the Chinese characters in GB2312-80 in a fixed order, as in the zone-position code, national standard code, and telegraph code; it has no duplicate codes. A phonetic code encodes characters by their Hanyu Pinyin; common examples are Intelligent ABC, Microsoft Pinyin, Quanpin, Jianpin (abbreviated pinyin), and Shuangpin (double pinyin). A shape code encodes characters by their shape (such as radicals and strokes), as in Wubi (five-stroke) and five-stroke input. A phonetic-shape code is generated from both the pinyin and the shape of a character, as in the Natural Code.
Third, Chinese character glyph codes
Chinese character information is stored in the computer as internal codes, but it must be converted into glyph codes on output for the characters to be readable. Therefore each Chinese character must have a corresponding glyph pattern (glyph for short) stored in the computer; the collection of glyphs forms a glyph library, or font library. When a character is output, its glyph is looked up in the font library according to its internal code, and the character is then output according to that glyph.
There are two methods of constructing Chinese character glyphs: the vector method and the dot-matrix method.
The vector method breaks a Chinese character into strokes and approximates each stroke with straight line segments (vectors), so that each glyph becomes a series of vectors.
Figure 1.7 A Chinese character composed of a dot matrix
The dot-matrix method is also known as the glyph bitmap method. Each Chinese character is stored on the recording medium as a dot matrix: a cell covered by a stroke is bit "1" and a blank cell is "0". For example, the character "Hang" can be drawn on the 16 × 16 grid shown in Figure 1.7, and its glyph is then recorded row by row as a string of bits beginning 0001000000000000 .... Each row has 16 bits, and 16 rows together form the glyph code of one Chinese character, i.e. 16 × 16 = 256 binary bits, equal to 32 bytes.
A Chinese character glyph can also use a 24 × 24 dot matrix, which takes 24 × 24 ÷ 8 = 72 bytes. The larger the dot matrix, the more bytes each character occupies and the larger the font library, but the higher the glyph resolution and the better the character looks. Most Chinese character information processing systems keep the font library on disk; such a library is called a "soft font library". It is loaded wholly or partly into main memory, and special software converts a character's internal code into the address of its glyph bitmap, from which the glyph is retrieved.
In contrast to the "soft font library", a Chinese character library fixed in an EPROM or mask-ROM chip is generally called a "hard font library". Printers are generally fitted with integrated circuit chips holding a fixed Chinese character library to increase the speed of Chinese character output.
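The storage figures for dot-matrix glyphs (32 bytes for 16 × 16, 72 bytes for 24 × 24) follow from one bit per dot and eight bits per byte. The Python sketch below computes them and packs a bitmap into bytes; the function names are assumptions, and the 16 × 16 pattern is a made-up cross shape, not the glyph of any real character.

```python
def glyph_bytes(size):
    """Bytes needed for one size x size dot-matrix glyph (1 bit per dot)."""
    return size * size // 8

def pack_rows(rows):
    """Pack rows of '0'/'1' characters into bytes, 8 dots per byte."""
    bits = "".join(rows)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

# A made-up 16 x 16 pattern (a plain cross), used only to exercise the packing.
vertical = "0" * 7 + "11" + "0" * 7
horizontal = "1" * 16
rows = [vertical] * 7 + [horizontal] * 2 + [vertical] * 7

packed = pack_rows(rows)
print(glyph_bytes(16), glyph_bytes(24), len(packed))   # 32 72 32

# Render the bitmap back as text, one row per line.
for row in rows:
    print(row.replace("1", "#").replace("0", "."))
```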