When I applied for a free QQ number a few days ago, I suddenly found that the verification code on the application form had been changed to Chinese. That was a real surprise, and the crowd over on Mop has been talking a lot about Tencent's use of Chinese verification codes. ^_^ I have to admire Tencent for using Chinese verification codes to stop the automatic QQ registration machines that are currently rampant on the network. Thinking it over carefully, generating a random Chinese verification code in a program is not very difficult. Below I introduce the principle of generating random Chinese characters in C#.
1. The principle of Chinese character encoding

How do we generate Chinese characters at random? Where do the characters come from? Is there a table in a back-end database that stores all the characters we need, from which the program picks a few at random and combines them? Storing all the characters in a database is one way to do it, but there are so many Chinese characters; how would we build such a table? In fact, no back-end database is needed. The program alone is enough.

To know how to generate Chinese characters, we first have to understand how they are encoded. In 1980, to give every Chinese character a nationally unified code, China promulgated its first Chinese character encoding standard: GB2312-80, "Chinese Character Coded Character Set for Information Interchange", GB2312 for short. This character set is the foundation on which Chinese information processing technology in China developed, and it is the unified standard for all Simplified Chinese character systems in China.

Later the national standard GB18030-2000, "Extension of the Basic Set of the Chinese Character Coded Character Set for Information Interchange" (GB18030 for short), was published. Anyone whose programming involves encoding and localization should be familiar with GB18030. It is the most important Chinese character encoding standard after GB2312-1980 and GB13000-1993, and one of the basic standards that computer systems in China must follow from now on.

On Simplified Chinese Windows, the default code page that .NET programs see is a Simplified Chinese one from this GB family (code page 936, which .NET reports under the name gb2312). For a Chinese verification code, however, the GB2312 character set is all we need. Beyond the characters everyone knows, the larger sets contain a great many characters that most of us do not recognize at all.

If a verification code contained many characters we do not recognize and asked us to type them in, that would be bad news for anyone using a pinyin input method. Wubi users could still type them from the shape of the character, heh! So not even all the characters in GB2312 should be used. Chinese characters can be expressed by a zone-position code (区位码); see:
Chinese character zone-position code table: http://navicy2005.home4u.china.com/resource/gb2312tbl.htm
Chinese character zone-position code value table: http://navicy2005.home4u.china.com/resource/gb2312tbm.htm
In fact the two tables show the same thing. One gives the zone and position in hexadecimal, the other gives them as decimal numbers. For example, the hexadecimal code of "好" is BA C3. The first two digits are the zone and the last two are the position: BA is zone 26, and "好" is the 35th character in that zone, which is position C3, so its numeric code is 2635. This is the principle of the GB2312 Chinese character zone-position code. From the zone-position code table we can see that the zones up to zone 15, that is, up to AF, contain no Chinese characters, only a small number of symbols. Chinese characters start at zone 16, B0, which is why the Chinese characters of the GB2312 character set begin at zone 16.
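The relationship between the two tables can be sketched in a few lines of C#. This is only an illustration: the class and method names (ZoneCode, FromBytes, ToBytes) are made up for this example, and the 0xA0 offset is inferred from the pairs quoted above (AF is zone 15, B0 is zone 16, BA C3 is 2635).

```csharp
using System;

public static class ZoneCode
{
    // Assumed offset between a GB2312 byte and its zone/position number
    // (inferred from the article's examples: 0xB0 <-> zone 16).
    private const int Offset = 0xA0;

    // Two GB2312 bytes -> 4-digit decimal zone-position code.
    public static int FromBytes(byte high, byte low)
    {
        int zone = high - Offset;     // 0xBA -> 26
        int position = low - Offset;  // 0xC3 -> 35
        return zone * 100 + position;
    }

    // 4-digit decimal zone-position code -> two GB2312 bytes.
    public static byte[] ToBytes(int zoneCode)
    {
        int zone = zoneCode / 100;
        int position = zoneCode % 100;
        return new byte[] { (byte)(zone + Offset), (byte)(position + Offset) };
    }

    public static void Main()
    {
        Console.WriteLine(FromBytes(0xBA, 0xC3));        // prints 2635
        byte[] b = ToBytes(2635);
        Console.WriteLine("{0:X2} {1:X2}", b[0], b[1]);  // prints BA C3
    }
}
```

The sketch does pure arithmetic and never touches an encoding table, so it does not check whether a given zone and position actually holds a character.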
2. How .NET programs handle Chinese character encoding

In .NET, the System.Text namespace can handle the encodings of all languages. It contains numerous encoding classes that you can use to manipulate and convert text. The Encoding class is the one to focus on for Chinese character encoding. Looking up the Encoding class in the .NET documentation, we find that every text-encoding operation works on byte arrays. Two methods are particularly useful:

Encoding.GetBytes(): encodes all or part of the specified string or character array into a byte array.
Encoding.GetString(): decodes the specified byte array into a string.

That's right: with these two methods we can encode a Chinese character into a byte array, and, knowing the GB2312 byte-array encoding of a character, decode the byte array back into that character. For example, encoding the character "好" into a byte array with
    Encoding gb = System.Text.Encoding.GetEncoding("GB2312");
    byte[] bytes = gb.GetBytes("好");

gives a byte array of length 2. Using

    string lowCode = System.Convert.ToString(bytes[0], 16);   // take element 1 as two hexadecimal digits
    string hightCode = System.Convert.ToString(bytes[1], 16); // take element 2 as two hexadecimal digits
After that, we find that the contents of the byte array bytes, written in hexadecimal, are {ba, c3}, which is exactly the hexadecimal zone-position code of "好" (see the zone-position code table). So we can randomly generate a byte array of length 2 holding hexadecimal values and decode it with the GetString() method to get a Chinese character.

For a Chinese verification code, though, some ranges have to be excluded. The zones up to zone 15 (AF) contain no Chinese characters, only a small number of symbols, and Chinese characters start at zone 16 (B0). The characters after zone D7 are rare, complicated ones that are very hard to recognize, so they are excluded as well. The 1st hexadecimal digit of a randomly generated code is therefore B, C or D, and if the 1st digit is D, the 2nd digit cannot be a hexadecimal digit beyond 7. Looking at the zone-position code table again, the first and last positions of each zone are empty and hold no character. So if the 3rd digit of the random code is A, the 4th digit cannot be 0, and if the 3rd digit is F, the 4th digit cannot be F.

Now that the principle is clear, the procedure for randomly generating Chinese characters falls out naturally. The following is C# console code that generates 4 random Chinese characters:
3. Program code:
    using System;
    using System.Text;

    namespace ConsoleApplication
    {
        class ChineseCode
        {
            public static void Main()
            {
                // Get the GB2312 code page (table).
                // On .NET Core / .NET 5+, GB2312 must first be enabled with
                // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).
                Encoding gb = Encoding.GetEncoding("GB2312");

                // Call the function to generate the codes of 4 random Chinese characters
                object[] bytes = CreateRegionCode(4);

                // Decode each character from its byte array
                string str1 = gb.GetString((byte[])Convert.ChangeType(bytes[0], typeof(byte[])));
                string str2 = gb.GetString((byte[])Convert.ChangeType(bytes[1], typeof(byte[])));
                string str3 = gb.GetString((byte[])Convert.ChangeType(bytes[2], typeof(byte[])));
                string str4 = gb.GetString((byte[])Convert.ChangeType(bytes[3], typeof(byte[])));

                // Output to the console
                Console.WriteLine(str1 + str2 + str3 + str4);
            }

            /*
               This function randomly creates, within the Chinese character encoding
               range, hexadecimal byte arrays of two elements each. Each byte array
               represents one Chinese character, and the byte arrays are stored in
               an object array.
               Parameter: strlength, the number of Chinese characters to generate
            */
            public static object[] CreateRegionCode(int strlength)
            {
                // String array holding the hexadecimal digits of a character code
                string[] rBase = new string[16] { "0", "1", "2", "3", "4", "5", "6", "7",
                                                  "8", "9", "a", "b", "c", "d", "e", "f" };

                Random rnd = new Random();

                // Object array that will hold the results
                object[] bytes = new object[strlength];

                /*
                   Each pass of the loop produces one two-element byte array and puts
                   it in the object array. Each Chinese character has a zone-position
                   code of four hexadecimal digits: digits 1 and 2 form the first
                   element of the byte array, digits 3 and 4 form the second element.
                */
                for (int i = 0; i < strlength; i++)
                {
                    // Zone-position code, digit 1: b, c or d
                    int r1 = rnd.Next(11, 14);
                    string str_r1 = rBase[r1].Trim();

                    // Zone-position code, digit 2
                    // Replace the random number generator's seed to avoid duplicate values
                    rnd = new Random(r1 * unchecked((int)DateTime.Now.Ticks) + i);
                    int r2;
                    if (r1 == 13)
                    {
                        r2 = rnd.Next(0, 7);   // upper bound is exclusive: d0..d6
                    }
                    else
                    {
                        r2 = rnd.Next(0, 16);
                    }
                    string str_r2 = rBase[r2].Trim();

                    // Zone-position code, digit 3: a..f
                    rnd = new Random(r2 * unchecked((int)DateTime.Now.Ticks) + i);
                    int r3 = rnd.Next(10, 16);
                    string str_r3 = rBase[r3].Trim();

                    // Zone-position code, digit 4
                    rnd = new Random(r3 * unchecked((int)DateTime.Now.Ticks) + i);
                    int r4;
                    if (r3 == 10)
                    {
                        r4 = rnd.Next(1, 16);  // a0 is not a valid position
                    }
                    else if (r3 == 15)
                    {
                        r4 = rnd.Next(0, 15);  // ff is not a valid position
                    }
                    else
                    {
                        r4 = rnd.Next(0, 16);
                    }
                    string str_r4 = rBase[r4].Trim();

                    // Two byte variables holding the generated random character code
                    byte byte1 = Convert.ToByte(str_r1 + str_r2, 16);
                    byte byte2 = Convert.ToByte(str_r3 + str_r4, 16);

                    // Store the two bytes in a byte array
                    byte[] str_r = new byte[] { byte1, byte2 };

                    // Put the byte array of this character into the object array
                    bytes.SetValue(str_r, i);
                }

                return bytes;
            }
        }
    }
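For comparison, the same ranges can be expressed directly as byte values instead of hexadecimal digit strings. The following is only a sketch of an alternative, not the program above: the names CompactHanzi, NextHanziBytes and NextCode are hypothetical, it uses one shared Random instead of re-seeding for every digit, and it assumes a runtime where CodePagesEncodingProvider is available (.NET Core / .NET 5+). On the classic .NET Framework that registration line would be removed.

```csharp
using System;
using System.Text;

public static class CompactHanzi
{
    // One shared generator; no re-seeding between digits.
    private static readonly Random Rnd = new Random();

    // Returns the two GB2312 bytes of one random character.
    // High byte B0..D6 and low byte A1..FE match the ranges the
    // program above produces (Next's upper bound is exclusive).
    public static byte[] NextHanziBytes()
    {
        byte high = (byte)Rnd.Next(0xB0, 0xD7);
        byte low = (byte)Rnd.Next(0xA1, 0xFF);
        return new byte[] { high, low };
    }

    // Builds a verification string of `length` characters.
    public static string NextCode(Encoding gb, int length)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++)
        {
            sb.Append(gb.GetString(NextHanziBytes()));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Assumption: needed on .NET Core / .NET 5+ to enable GB2312.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding gb = Encoding.GetEncoding("GB2312");
        Console.WriteLine(NextCode(gb, 4));
    }
}
```

One reason to try a single generator: in the original loop the new seed is the previous digit multiplied by the tick count, so whenever that digit is 0 the seed collapses to the loop index alone. Drawing every value from one Random avoids that case, at the cost of departing from the digit-by-digit construction described in the article.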