Determining Whether a Character Is Chinese

xiaoxiao2021-03-06  120

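Roughly speaking, you can assume that a character that occupies 1 byte is English and one that occupies 2 bytes is Chinese. The rule is rough because characters of many other languages (Japanese, Russian, and so on) also occupy 2 bytes. In practice, however, English and Chinese are usually the only two we encounter, so it is still a workable approach. Note that the byte count depends on the encoding: the 2-byte rule holds for GBK/GB2312, while a Chinese character occupies 3 bytes in UTF-8, so the code names the charset explicitly instead of relying on the platform default. It is very simple:

    char ch1 = 'a';
    String s1 = Character.toString(ch1);
    // Encode as GBK so that a Chinese character yields exactly 2 bytes.
    byte[] b1 = s1.getBytes(java.nio.charset.Charset.forName("GBK"));
    if (b1.length == 2)
        System.out.println("Chinese character");
    else
        System.out.println("Not a Chinese character");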
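To make the dependence on the encoding concrete, here is a minimal sketch of the byte-length rule. The class name ByteLengthCheck and the helpers byteLength and looksChinese are hypothetical names chosen for illustration, and the sketch assumes the JVM provides the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteLengthCheck {
    // Assumption: the running JVM ships the GBK charset.
    static final Charset GBK = Charset.forName("GBK");

    // Number of bytes one character occupies in the given charset.
    static int byteLength(char ch, Charset cs) {
        return String.valueOf(ch).getBytes(cs).length;
    }

    // The rough rule: 2 bytes in GBK is treated as "Chinese".
    static boolean looksChinese(char ch) {
        return byteLength(ch, GBK) == 2;
    }

    public static void main(String[] args) {
        System.out.println(looksChinese('a'));       // false
        System.out.println(looksChinese('\u4e2d'));  // true  (the character "zhong", 中)
        System.out.println(looksChinese('\u042f'));  // true  (Cyrillic Я, a false positive)
        System.out.println(byteLength('\u4e2d', StandardCharsets.UTF_8)); // 3
    }
}
```

The Cyrillic letter shows why the rule is only rough: it also occupies 2 bytes in GBK, so it is reported as Chinese.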

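There is a second method based on the same idea: check whether the first byte is greater than 127. Because byte is signed in Java (its range is -128 to 127), the value must be masked with 0xFF first; otherwise the comparison can never be true:

    char ch1 = 'a';
    String s1 = Character.toString(ch1);
    byte[] b1 = s1.getBytes(java.nio.charset.Charset.forName("GBK"));
    // Mask to get the unsigned value of the first byte.
    if ((b1[0] & 0xFF) > 127)
        System.out.println("Chinese character");
    else
        System.out.println("Not a Chinese character");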
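The signed-byte detail is easy to miss, so here is a small sketch that compares the unmasked and the masked comparison. FirstByteCheck, naiveCheck, and maskedCheck are hypothetical names, and the GBK charset is again assumed to be available.

```java
import java.nio.charset.Charset;

class FirstByteCheck {
    // Assumption: the running JVM ships the GBK charset.
    static final Charset GBK = Charset.forName("GBK");

    // Unmasked comparison: a Java byte never exceeds 127, so this is always false.
    static boolean naiveCheck(char ch) {
        byte[] b = String.valueOf(ch).getBytes(GBK);
        return b[0] > 127;
    }

    // Masked comparison: treats the first byte as an unsigned value 0..255.
    static boolean maskedCheck(char ch) {
        byte[] b = String.valueOf(ch).getBytes(GBK);
        return (b[0] & 0xFF) > 127;
    }

    public static void main(String[] args) {
        System.out.println(naiveCheck('\u4e2d'));   // false, although the lead byte is 0xD6
        System.out.println(maskedCheck('\u4e2d'));  // true
        System.out.println(maskedCheck('a'));       // false
    }
}
```

An equivalent test is `b[0] < 0`, since every byte with the high bit set is negative in Java.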

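There is also a third method, which is more direct and needs no byte array: cast the char to byte and back, and see whether the value changes. Strictly speaking, this detects non-ASCII characters rather than Chinese ones:

    char ch1 = 'a';
    // The round trip only preserves the value for characters that fit in 7 bits.
    if ((char) (byte) ch1 != ch1)
        System.out.println("Chinese character");
    else
        System.out.println("Not a Chinese character");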
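A short sketch of the cast round trip may help show what it actually measures. CastCheck and changesOnRoundTrip are hypothetical names used only for this illustration.

```java
class CastCheck {
    // True when narrowing to byte and widening back to char alters the value.
    static boolean changesOnRoundTrip(char ch) {
        return (char) (byte) ch != ch;
    }

    public static void main(String[] args) {
        System.out.println(changesOnRoundTrip('a'));       // false: ASCII survives the round trip
        System.out.println(changesOnRoundTrip('\u4e2d'));  // true:  0x4E2D becomes 0x002D
        System.out.println(changesOnRoundTrip('\u00e9'));  // true:  Latin é becomes 0xFFE9
        System.out.println(changesOnRoundTrip('\uffe5'));  // false: 0xFFE5 sign-extends back to itself
    }
}
```

The last two lines show the limits of the trick: an accented Latin letter is reported as Chinese, and characters in the range U+FF80 to U+FFFF slip through because the sign extension of the low byte reproduces the original value.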

Of course, some special situations call for a more accurate test, and then we have to consult a Chinese character code table. The internal code range of GB2312 is B0A1-FEFE (the assigned Chinese characters of GB2312 itself end at F7FE; the rows above that are unassigned, so the wider bound does no harm). The test is therefore implemented by changing the if of the first method. Masking with 0xFF converts each signed byte to its unsigned value before the two bytes are combined:

    if (b1.length == 2
            && (((b1[0] & 0xFF) << 8) | (b1[1] & 0xFF)) >= 0xB0A1
            && (((b1[0] & 0xFF) << 8) | (b1[1] & 0xFF)) <= 0xFEFE)
        System.out.println("Chinese character");
    else
        System.out.println("Not a Chinese character");
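The range test can be packaged as a helper, sketched below under the same assumptions as before: the GBK charset is available, and the bounds 0xB0A1 and 0xFEFE are the ones quoted in the text. Gb2312RangeCheck and isInHanziRange are hypothetical names.

```java
import java.nio.charset.Charset;

class Gb2312RangeCheck {
    // Assumption: the running JVM ships the GBK charset.
    static final Charset GBK = Charset.forName("GBK");

    // True when the character encodes to two bytes whose combined
    // unsigned value lies in the range 0xB0A1..0xFEFE.
    static boolean isInHanziRange(char ch) {
        byte[] b = String.valueOf(ch).getBytes(GBK);
        if (b.length != 2) {
            return false;
        }
        int code = ((b[0] & 0xFF) << 8) | (b[1] & 0xFF);
        return code >= 0xB0A1 && code <= 0xFEFE;
    }

    public static void main(String[] args) {
        System.out.println(isInHanziRange('\u4e2d'));  // true:  encodes to 0xD6D0
        System.out.println(isInHanziRange('a'));       // false: single byte
        System.out.println(isInHanziRange('\u042f'));  // false: Cyrillic Я is 0xA7C1
        System.out.println(isInHanziRange('\uff0c'));  // false: fullwidth comma is 0xA3AC
    }
}
```

Unlike the plain byte-length rule, this version rejects two-byte characters such as Cyrillic letters and fullwidth punctuation, because their codes fall below 0xB0A1.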

In general, the first method is sufficient. There is no need to agonize over so-called "perfection". Remember: nothing in this world is perfect.

When reposting, please credit the original address: https://www.9cbs.com/read-124276.html