Problems Matching Chinese Text in Perl

xiaoxiao2021-03-06  14

How to split a Chinese sentence at the full stop
Posted on: BBS Shuimu Tsinghua Station (Mon Mar 7 19:40:39 2005)

Example (translated; in the Chinese text each sentence ends with the full-width full stop 。):
There is another thing. At 5:30, you will meet Mr. Stewart.

I want to extract:
Subsentence 1: There is another thing.
Subsentence 2: At 5:30, you will meet Mr. Stewart.

A sentence may contain several full stops, in which case it should be split into several subsentences. My idea is:

1. Strip the punctuation mark at the end of the string:
    $t_c_str =~ s/[))\;,.?!\"\',。?!“”‘’…\n]$//;    # remove the trailing punctuation
2. Check whether a full stop remains inside the sentence:
    if ($t_c_str =~ /。/)
3. Split on it:
    @subsen_c = split(/。/, $t_c_str);

There is a bug in this. Could everyone please help take a look?

ANS

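Save the script as UTF-8 in an editor that supports Unicode and add "use utf8"; Chinese characters can then be treated just like English characters. If you don't want to convert the file, you have to use the Encode module and decode the GBK bytes into characters yourself:

    use Encode;
    # Sample text: the example above in translation, with 。 as the full stop.
    $a = decode("GBK", "There is another thing。At 5:30, you will meet Mr. Stewart。");
    $dot = decode("GBK", "。");
    # Split right after each full stop, keeping it attached to its sentence.
    @f = split(/(?<=$dot)\b/, $a);
    foreach (@f) {
        print encode("gbk", $_), "\n";    # encode back to GBK for output
    }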
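For comparison, the first option (a UTF-8 source file plus "use utf8") might look like the sketch below. It assumes the script itself is saved as UTF-8 and that the output should also be UTF-8. The helper name split_sentences is made up for illustration and does not come from the thread.

```perl
use strict;
use warnings;
use utf8;                               # string literals below are UTF-8 characters
binmode(STDOUT, ':encoding(UTF-8)');    # encode characters again on output

# Hypothetical helper: split after every Chinese full stop,
# keeping the full stop attached to the sentence it ends.
sub split_sentences {
    my ($text) = @_;
    return grep { length } split /(?<=。)/, $text;
}

# Sample text: the thread's example in translation, with 。 as the full stop.
my $text = "There is another thing。At 5:30, you will meet Mr. Stewart。";
print "$_\n" for split_sentences($text);
```

Because the text is handled as characters rather than GBK bytes, 。 is a single character in the pattern, so the match cannot land in the middle of a multi-byte character.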

When reposting, please cite the original URL: https://www.9cbs.com/read-49608.html
