The difficulties of text mining: words and word frequencies

I went looking for a Java program that counts word frequencies, and found that it was of no use at all. For English, splitting text into words and counting them is simple, but for Chinese it is really difficult. English puts a space between one word and the next and Chinese does not, so the program treats a whole run of Chinese text as a single word, which is meaningless. If you want to extract Chinese words you have to do more work, and it gets a lot more complicated. I don't know how to do it (yet), but a lot of work has already been done on it.

Imagining what things would be like without Baidu makes it clear how important Baidu really is for Chinese. Then again, the computer was not invented by the Chinese. The computer is built on Western logic, which forces our Chinese culture to make a great effort to adapt to it... Maybe our language really is not well suited to symbol processing. Still, we should try to build an interface between the computer and Chinese, a good interface...
That is really going to take a lot of time, and the result may not be good...
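
To make the whitespace problem concrete, here is a minimal sketch of the kind of counter I mean. It is not the program I actually found; the class name `WordFrequency`, the method `count`, and both sample sentences are made up for illustration. It assumes words are whatever sits between spaces, which is exactly the assumption that breaks for Chinese.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only, not the program mentioned in the post.
final class WordFrequency {

    // Counts whitespace-separated tokens. Tokens are lowercased and stripped of
    // leading and trailing ASCII punctuation before counting.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> freq = new LinkedHashMap<>();
        for (String token : text.trim().split("\\s+")) {
            String word = token
                    .replaceAll("^\\p{Punct}+|\\p{Punct}+$", "")
                    .toLowerCase(Locale.ROOT);
            if (!word.isEmpty()) {
                freq.merge(word, 1, Integer::sum); // add 1, or start at 1
            }
        }
        return freq;
    }

    public static void main(String[] args) {
        // English: the spaces give us the words for free.
        System.out.println(count("the cat saw the dog"));
        // Chinese: no spaces, so the whole sentence comes back as one "word".
        System.out.println(count("我喜欢自然语言处理"));
    }
}
```

The English sentence is split into four distinct words with "the" counted twice, while the Chinese sentence comes back as a single entry with a count of 1. Getting real Chinese words out would need a segmentation step before counting, which this sketch deliberately leaves out.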