Unicode vs. ANSI
Author: dongxiao
Published: 2001/02/05
Summary:
This article describes how to convert strings between the Unicode and ANSI formats in Visual Basic.
Text:
Unicode VS ANSI
The 32-bit versions of Visual Basic handle strings as Unicode; that is, strings are stored inside VB in Unicode format. What is Unicode? Simply put, every character is represented by 2 bytes, and every written symbol, whether a Chinese character or a Latin letter, counts as one "character". Therefore Len("大A") returns 2 and Len("ABC") returns 3, because "大" ("big") and "A" are each one character.

For some kinds of Chinese text processing, such as plain-text data files, this is a disaster. You have to locate each field by byte position, and in such files (ANSI format, meaning the system code page) a Chinese character occupies 2 bytes while an ASCII character occupies 1. Unicode upsets all of that. For example, Len("Good Morning") returns 12, while Len applied to the six-character Chinese sentence meaning "Today's weather is very good" returns 6, although that sentence occupies 12 bytes in an ANSI file.

For beginners, just managing to write a program in VB is already an achievement, so running straight into a wall on Chinese text processing is quite a blow. But don't be afraid: once you know a few more functions, the problem of Chinese processing can be solved. Which functions? The most important one is StrConv. Its syntax is:

    StrConv(string, conversion)

The conversion values used here are:

    vbUnicode        converts an ANSI string to Unicode
    vbFromUnicode    converts a Unicode string to ANSI

After converting a string to ANSI, every string function applied to it must be the byte version, whose name ends in B: LeftB, RightB, MidB, ChrB, InStrB, LenB, InputB, and so on. When you have finished processing, convert the result back to Unicode so that you can use the ordinary string functions again.

Is that clear? If not, look at the examples below.

[●] Simple usage example
After working through the basic example below, you should have a clearer picture of how VB handles strings.
    Private Sub Command1_Click()
        Dim sUnicode As String
        Dim sAnsi As String

        ' Unicode operations
        ' The test string is Chinese: a three-character name, an ID, a date,
        ' the address "No. 100, Zhongshan Road, Hangzhou", and a phone number.
        sUnicode = "王小明,A123456789,651023,杭州市中山路100号,(02)2345678"
        Debug.Print Len(sUnicode)              ' returns 44
        Debug.Print Mid$(sUnicode, 5, 10)      ' returns A123456789
        Debug.Print InStr(sUnicode, "杭州")    ' returns 23

        ' Convert the Unicode string to ANSI
        sAnsi = StrConv(sUnicode, vbFromUnicode)

        ' ANSI operations
        Debug.Print LenB(sAnsi)                ' returns 54
        ' Returns ?????, because the result was not converted back to Unicode
        Debug.Print MidB$(sAnsi, 8, 10)
        ' Returns A123456789; note that converting back to Unicode is required
        Debug.Print StrConv(MidB$(sAnsi, 8, 10), vbUnicode)
        ' Returns 26, the byte position. Don't forget to convert "杭州" to ANSI,
        ' otherwise it will not be found.
        Debug.Print InStrB(sAnsi, StrConv("杭州", vbFromUnicode))
    End Sub

[●] Reading a text file
Among VB tips there is a quick way to read a whole file:

    Private Sub Command1_Click()
        Dim sFile As String
        Open "c:\filename.txt" For Input As #1
        sFile = Input$(LOF(1), #1)
        Close #1
    End Sub

Unfortunately, if the file contains Chinese text, this code fails with the error "Input past end of file". LOF returns the number of bytes in the file, whereas Input reads by character count. Because the file contains Chinese, it holds fewer characters than bytes, so the read runs past the end of the file. To solve this problem, use the StrConv and InputB functions together:

    Private Sub Command1_Click()
        Dim sFile As String
        Open "c:\filename.txt" For Input As #1
        sFile = StrConv(InputB$(LOF(1), #1), vbUnicode)
        Close #1
    End Sub

The corrected code first reads the file with InputB. Because the data InputB returns is in ANSI format, it must then be converted to Unicode with StrConv.

[●] Random-access data files
Many text files consist of fields at fixed positions, such as the following data format:

    Wang Xiaomin 650110 Hangzhou Zhongshan Road No. 100 (02) 1234567
    Zhang Daxie 660824 Guangdong, Dajia Town, Yuhang City No. 23 (03) 9876543
    ...
How do you handle this type of file? You need to use a Type (user-defined type) together with Byte arrays.
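One way to apply the Type-plus-Byte-array idea is sketched below. Several things in it are assumptions that do not come from the article: the field widths (8, 6, 30 and 12 bytes), the type name CustomerRecord and its member names are hypothetical, each line is assumed to end with CR/LF, and fields are assumed to be padded with spaces. Because every member is a Byte array, Get fills the record byte for byte, and each field is then converted to Unicode with StrConv.

```vb
' Hypothetical record layout: the field widths are assumptions, not taken
' from the sample data. Adjust them to match the real file.
Private Type CustomerRecord
    UserName(0 To 7) As Byte     ' 8 bytes, room for 4 Chinese characters
    Birthday(0 To 5) As Byte     ' 6 bytes, for example 650110
    Address(0 To 29) As Byte     ' 30 bytes
    Phone(0 To 11) As Byte       ' 12 bytes
    LineEnd(0 To 1) As Byte      ' 2 bytes, the CR/LF assumed to end each line
End Type

Private Sub Command1_Click()
    Dim rec As CustomerRecord
    Dim nFile As Integer
    Dim nCount As Long
    Dim i As Long

    nFile = FreeFile
    Open "c:\filename.txt" For Binary Access Read As #nFile

    ' Every member is a Byte array, so Len(rec) is the record size in bytes.
    nCount = LOF(nFile) \ Len(rec)

    For i = 1 To nCount
        ' Read the next Len(rec) bytes straight into the record.
        Get #nFile, , rec

        ' Each field holds ANSI bytes; convert it to Unicode before use.
        Debug.Print RTrim$(StrConv(rec.UserName, vbUnicode)) & " | " & _
                    StrConv(rec.Birthday, vbUnicode) & " | " & _
                    RTrim$(StrConv(rec.Address, vbUnicode)) & " | " & _
                    RTrim$(StrConv(rec.Phone, vbUnicode))
    Next i

    Close #nFile
End Sub
```

Byte arrays are used here instead of String fields so that the record is measured in bytes, which is how the file itself is laid out.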