Some thoughts on doing speech recognition

xiaoxiao, 2021-03-05

At the beginning, speech recognition felt very mysterious to me. A year has passed since then, and now I ask myself what I have actually learned in that year. The upper-layer programming I started with used the Speech SDK, a software development kit for speech. It encapsulates the whole speech recognition algorithm and exposes only interface functions for secondary development.

Let me introduce the Speech SDK. IBM's ViaVoice system and Microsoft's Speech SDK both provide a secondary development platform for speech recognition and synthesis, and they can recognize multiple languages such as English, Chinese, and Japanese. You can use them to embed speech recognition and synthesis in the software you develop. Several companies, IBM and Microsoft among them, provide such platforms, but only Microsoft's is free. The accuracy of Microsoft's recognition system is not very high for continuous speech, but in command-and-control mode it meets the requirements of voice control applications.

Microsoft Speech SDK 5.1 fully supports the development of Chinese speech applications. It provides the components for the speech recognition and synthesis engines, the application-layer interfaces, detailed technical material, and help documentation. It is built on the COM standard: the underlying layer is packaged as COM components and is completely independent of the application layer, so the complex speech technology is hidden from the application designer. This shows the advantage of COM well, because all the speech-related work is done by COM components. Speech recognition is managed by the recognition engine (Recognition Engine), and speech synthesis is handled by the synthesis engine (Synthesis Engine). Programmers only need to focus on their own applications and call the Speech Application Programming Interface (SAPI) to implement speech functions.
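
To make command-and-control mode a little more concrete: in that mode the recognizer only has to choose among a small, fixed set of phrases, which is why it can work well even when continuous recognition does not. The sketch below is only an illustration, not code from the SDK. It builds an XML text grammar in the style used by SAPI 5 (the element names GRAMMAR, RULE, L, and P are given as I understand that format, so check them against the SDK help before relying on them). The function name, rule name, and phrases are made up for the example, and loading the grammar into a recognizer is not shown.

```python
import xml.etree.ElementTree as ET


def build_command_grammar(rule_name, commands, langid="409"):
    """Return an XML text grammar with one top-level rule that accepts
    exactly one phrase from `commands`.

    Assumptions (not taken from the original post): the element names
    follow the SAPI 5 XML grammar style, and `langid` is a hexadecimal
    language ID such as "409" (US English) or "804" (Simplified Chinese).
    """
    grammar = ET.Element("GRAMMAR", {"LANGID": langid})
    rule = ET.SubElement(
        grammar, "RULE", {"NAME": rule_name, "TOPLEVEL": "ACTIVE"}
    )
    # <L> holds a list of alternatives; each <P> is one allowed phrase.
    choices = ET.SubElement(rule, "L")
    for phrase in commands:
        ET.SubElement(choices, "P").text = phrase
    return ET.tostring(grammar, encoding="unicode")


# Hypothetical command set for a small voice-controlled program.
xml_text = build_command_grammar(
    "DeviceControl", ["open the file", "close the window", "play music"]
)
print(xml_text)
```

The point of the sketch is the size of the search space: with three allowed phrases the engine only has to decide which one was spoken, instead of transcribing arbitrary speech.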
I have written a few small programs with the Speech SDK. In my experience Microsoft's recognition works acceptably, but what I recognized was mostly individual words; I have not tried sentence recognition. I have also read in other articles that IBM's ViaVoice recognizes better than Microsoft's engine. Developing speech recognition with an SDK is really just programming: it comes down to programming skill and has little to do with speech recognition itself. So working this way cannot really be counted as doing speech recognition!

When reprinting, please cite the original address: https://www.9cbs.com/read-34047.html
