Developing Chinese Voice Applications with IBM WebSphere Voice Toolkit

xiaoxiao2021-03-06 29

Contents

Preface
Software setup before development
Creating a Voice project
Creating a VoiceXML file
Creating a spoken grammar file
Overall test
About the author

Liu Xuezhe (acmeliu@sohu.com), Technical Director, Shuiya Net E-Commerce Development Services Co., Ltd.
November 2002

Preface

In this tutorial we will use IBM's WebSphere Voice Toolkit to build a Chinese voice application. Before you begin, you must install IBM's WebSphere Voice Server SDK and the WebSphere Voice Toolkit named in the title of this tutorial. The Chinese language test components used in development testing are provided with the WebSphere Voice Server SDK and WebSphere Voice Server. You can download them from: www-4.ibm.com/software/speech/enterprise/ep_11.html

Before you start, I hope you already have a general understanding of the elements of VoiceXML, so that you can follow the markup that appears in this tutorial's code. If you have read about VoiceXML but have no hands-on feel for how it works, I believe the program in this tutorial will clear up many of your questions about developing Chinese voice applications.

We will create a voice fast food shop application through which a customer can order fast food by voice. You may ask: today's automated telephone response systems already provide this kind of service, so why develop a VoiceXML voice application? The most obvious difference is that a traditional telephone response system usually takes its commands as DTMF (touch-tone) input, whereas a VoiceXML-based voice application can accept spoken commands from any caller through speaker-independent speech recognition. A voice application built with VoiceXML is therefore a qualitative leap in user interface over traditional voice applications. You will appreciate this more fully in the test at the end of the tutorial.

Software setup before development

Before using WebSphere Voice Toolkit for the first time, we need to do a few things.

First, run Audio Setup - Simplified Chinese in the WebSphere Voice Server SDK. This program checks whether your microphone sensitivity and the noise level of your surroundings meet the minimum requirements of the development environment. If you pass this test, your spoken commands can be recognized correctly during development testing. In fact, the speaker-independent speech recognition used by VoiceXML applications places fairly low demands on environmental noise compared with voice input software.

Figure 1, voice development environment test

Second, make sure the voice application you develop can produce Chinese TTS speech in the test environment. Start the Voice Toolkit and select Window => Preferences. In the preferences window that opens, under Voice Toolkit, set Voice Language to Simplified Chinese. If you develop applications in other languages with the same software, change this setting to the corresponding language so that the system can output TTS speech.

Figure 2, Changing pronunciation language

Creating a voice project

Now we formally begin creating a Chinese voice application. The first step is to create a voice project. In this project we can create the following kinds of files:

VoiceXML: in this file we create the voice application's input and output and the related event handling.
Grammar: these files define grammars for speech recognition, for example JSGF Grammar File and SRCL Grammar File.
Pronunciation: these files define how words are pronounced, for example Pronunciation Pool file and Pronunciation Exception Dictionary.

Voice Toolkit provides a corresponding editor and test environment for each of these file types.

1. Select File => New => Voice Project.
2. In the project window, enter the name of the project in Project Name; in this tutorial the name is Chinesevoice.
3. Click Finish to complete the creation.

Figure 3. Create a Voice Toolkit project

Creating a VoiceXML file

In this tutorial we will create one VoiceXML file, in which we implement the application's voice input and output and basic event handling. First, let's create the file:

1. Select File => New => VoiceXML File.
2. In the New VoiceXML File window, enter the name of the file in VoiceXML File Name; in this tutorial the name is Welcome.vxml.
3. Select Finish to complete the creation.

Figure 4, create a voiceXML file

Now you can write your voice application in the VoiceXML editor.

Figure 5, VoiceXML file editor

Let's first take a look at the code of the VoiceXML file in the voice fast food shop application we created. The prompts it contains are, in order:

Sorry, I can't hear anything.

I didn't catch that, please say it again.

Welcome to the Voice Service Fast Food Shop.

Would you like Chinese fast food or Western fast food?

We have steamed buns, dumplings, tofu pudding, fried cakes, and noodles. What would you like to eat?

We have hamburgers, ...

Your order has been confirmed. You are welcome to visit again.

Event handling elements

In the VoiceXML file we first add the event handling elements at the beginning of the code. They carry these two prompts:

Sorry, I can't hear anything.

I didn't catch that, please say it again.
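A minimal sketch of document-level handlers for these two cases, assuming the standard VoiceXML noinput and nomatch events. The layout and the reprompt elements are illustrative and are not the author's exact listing; in the real application the prompt text would be in Simplified Chinese.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <!-- Played when the caller stays silent past the timeout -->
  <noinput>
    Sorry, I can't hear anything.
    <reprompt/>
  </noinput>

  <!-- Played when speech was heard but matched no active grammar -->
  <nomatch>
    I didn't catch that, please say it again.
    <reprompt/>
  </nomatch>

  <form id="welcome">
    <!-- form items (blocks and fields) go here -->
  </form>
</vxml>
```

Placing the handlers directly under vxml means they apply to every form item in the document unless an item defines its own.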

Their role is this: if the user gives no command for a long time, the voice service program plays a prompt to remind the user. In this program the system says "Sorry, I can't hear anything." Likewise, when the user's speech cannot be matched, the system says "I didn't catch that, please say it again."

Grammar

The grammar element in VoiceXML specifies the spoken input that may be recognized. If the input matches the grammar, a string value is returned. The grammar can be user-defined, either loaded from an external file or written inline, or it can be a built-in grammar. The built-in grammar types include boolean, date, digits, currency, number, and phone. The value of the src attribute of the grammar element is the URL of a spoken grammar file. We developed food.jsgf as the external spoken grammar file called by this voice program; how to create this JSGF file is described later. The type attribute of the grammar element specifies the format of the grammar.

Dialog

In this VoiceXML file we define a single form, and all form items of this voice application occur inside it.

......
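As a hedged sketch of how the form and its external grammar fit together: the file name food.jsgf and the form id welcome come from this tutorial, while the MIME type application/x-jsgf and the placement of the grammar at form level are assumptions.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <form id="welcome">
    <!-- External JSGF grammar; src is a URL relative to this document.
         The type value is an assumed MIME type for JSGF. -->
    <grammar src="food.jsgf" type="application/x-jsgf"/>

    <block>Welcome to the Voice Service Fast Food Shop.</block>

    <!-- fields and further blocks follow -->
  </form>
</vxml>
```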

The id attribute of the form element names the form. In this VoiceXML file we set it to welcome. The id can be used when goto, submit, and other elements transfer control to this form from another dialog.

Form items

In this VoiceXML file we use these blocks:

Welcome to the Voice Service Fast Food Shop.

.......

Your order has been confirmed. You are welcome to visit again.
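A rough sketch of how these two blocks might be written. The block name exit follows the tutorial text; everything else, including where the fields sit, is an assumption for illustration.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <form id="welcome">
    <!-- Unnamed block: spoken once when the form is entered -->
    <block>
      Welcome to the Voice Service Fast Food Shop.
    </block>

    <!-- The ordering fields would sit here and finish with
         <goto nextitem="exit"/> to jump to the block below -->

    <!-- Named block: other form items can reach it by name -->
    <block name="exit">
      Your order has been confirmed. You are welcome to visit again.
    </block>
  </form>
</vxml>
```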

The block element is a form item that wraps executable content such as variable assignments, prompts, and declarations. In our VoiceXML file we use the name attribute on a block and give it the value exit, so that goto elements elsewhere in the code can jump to that block. We also use another important form item in this VoiceXML file: the field element, which handles the user's input.

........
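A minimal sketch of a field that collects the Chinese dish. The field name chinesefood comes from the tutorial. The inline grammar is a stand-in for the external food.jsgf so that the fragment is self-contained, and the filled handler with its confirmation prompt is purely illustrative.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <form id="welcome">
    <!-- The recognized input is stored in the variable chinesefood -->
    <field name="chinesefood">
      <prompt>
        We have steamed buns, dumplings, tofu pudding, fried cakes,
        and noodles. What would you like to eat?
      </prompt>

      <!-- Inline stand-in for food.jsgf (assumption) -->
      <grammar type="application/x-jsgf">
        steamed buns | dumplings | tofu pudding | fried cakes | noodles
      </grammar>

      <!-- Runs once the field variable has been filled -->
      <filled>
        <prompt>You ordered <value expr="chinesefood"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```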

The field element receives speech recognition input or DTMF input from the user. The voice application we create in this VoiceXML file receives spoken commands. For example, when the system asks whether the user wants Chinese fast food or Western fast food, the user answers with the corresponding voice command. The name attribute of the field element is the variable that stores the user's input. The name we set in the code above is chinesefood, which matches the rule of the same name in the JSGF grammar we write later in the tutorial.

Conditional logic and transfer

In this VoiceXML file we use the two conditional logic elements if and elseif together with the goto transfer element.
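The following is a sketch under stated assumptions, not the author's listing. It shows if, elseif, and goto with nextitem routing the caller from a foodkind field to one of two follow-up fields. The field names mirror the tags in the tutorial's grammar. The inline grammars, the compared strings 'Western' and 'Chinese', and the final exit element are illustrative choices.

```xml
<?xml version="1.0"?>
<vxml version="1.0">
  <form id="welcome">
    <field name="foodkind">
      <prompt>Would you like Chinese fast food or Western fast food?</prompt>
      <grammar type="application/x-jsgf"> Chinese | Western </grammar>
      <filled>
        <!-- cond is an ECMAScript expression evaluated to true or false -->
        <if cond="foodkind == 'Western'">
          <goto nextitem="westfood"/>
        <elseif cond="foodkind == 'Chinese'"/>
          <goto nextitem="chinesefood"/>
        </if>
      </filled>
    </field>

    <field name="chinesefood">
      <prompt>We have steamed buns and dumplings. What would you like?</prompt>
      <grammar type="application/x-jsgf"> steamed buns | dumplings </grammar>
      <filled><goto nextitem="exit"/></filled>
    </field>

    <field name="westfood">
      <prompt>We have hamburgers and French fries. What would you like?</prompt>
      <grammar type="application/x-jsgf"> hamburger | French fries </grammar>
      <filled><goto nextitem="exit"/></filled>
    </field>

    <block name="exit">
      Your order has been confirmed. You are welcome to visit again.
      <!-- End the session so the unvisited field is not prompted afterwards -->
      <exit/>
    </block>
  </form>
</vxml>
```

The exit element at the end matters in this sketch: without it, the form would go on to visit whichever food field was skipped, because its variable is still unfilled.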

The if and elseif elements provide conditional logic. Remember that elseif and else always appear inside an if element, and that one if element can contain several elseif elements. The cond attribute is an expression that evaluates to true or false. If the cond of the if element evaluates to false, the cond expression of the next elseif is evaluated. In the VoiceXML voice application we created, the condition is true when the user answers with the "Western" voice command, and the program then calls the goto transfer element. The nextitem attribute is needed because it is what goto uses to navigate from one form item to another within the same form.

Creating a spoken grammar file

We are now ready to create a JSGF spoken grammar file. JSGF stands for Java Speech Grammar Format, a platform-independent grammar format for speech recognition.

1. Select File => New => JSGF Grammar File.
2. In the New JSGF Grammar File window, enter the name of the file in the JSGF grammar file name field; in this tutorial the name is food.jsgf.
3. Select Finish to complete the creation.

Figure 6. Create a JSGF grammar file

Now you can write your spoken grammar files through the JSGF grammar editor.

Figure 7, JSGF grammar file editor

Let's first take a look at the code of the JSGF file in the voice fast food shop application.

#JSGF V1.0;

grammar food;

public = {this.foodkind = $}

| {this.chinesefood = $}

| {this.westfood = $};

= Chinese | Western

= Hamburger | French fries |

The structure of a JSGF file is relatively simple: a file header followed by the grammar body. The | symbol that appears in the code separates the alternatives the recognizer may choose between, as in: Hamburger | French fries

Once the spoken grammar file is written, you can test it with the Grammar Test Tool provided by Voice Toolkit. The tool supports testing with both speech and text input. Testing the JSGF grammar file by voice shows in advance how well the voice application recognizes the spoken words and their pronunciation, which reduces the difficulty of testing the voice application later.

Figure 8, Syntax Test Tool

Overall test

We have now completed all the code for our voice fast food shop, so let's formally test the whole VoiceXML voice application. First, make the Welcome.vxml editor window active, then select Run => Run in Audio Mode. This starts the voice test mode for VoiceXML. (Voice Toolkit provides four test modes for VoiceXML, including Run in Audio Mode, Run in Text Mode, and Debug in Text Mode. I recommend choosing a Debug mode when you test a VoiceXML application, because it makes it easy to trace the application's execution and adjust the code promptly.)

Dialogue with the voice fast food shop application:

SHOP: "Welcome to the Voice Service Fast Food Shop."

SHOP: "Would you like Chinese fast food or Western fast food?"

User: "Chinese"

SHOP: "We have steamed buns, dumplings, tofu pudding, fried cakes, and noodles. What would you like to eat?"

User: "Steamed buns"

SHOP: "Your order has been confirmed. You are welcome to visit again."

The dialogue above comes from the test run. If your own test of the VoiceXML voice application produces a similar dialogue and you hear "Your order has been confirmed. You are welcome to visit again," then congratulations: you have successfully developed this Chinese voice application based on VoiceXML technology.
