Make Java talk! In this article, Tony Loton shows how to add speech to your Java 1.3 applications and applets with a simple speech engine implemented in fewer than 150 lines of Java code, with no special hardware and no native calls. He also provides a small zip file containing everything you need to make your Java applications talk, whether for fun or for more serious uses. If you are new to the Java Sound API, this article also serves as a good introduction. (1,800 words)
Author: Tony Loton Translator: Cocia Lin
Why would you want your programs to talk? For a start, it is fun, and it suits entertainment programs such as games. There are also more serious applications. I am thinking not only of people for whom a visual interface is a natural disadvantage, but also of situations in which you cannot, or should not, take your eyes off what you are doing.
Recently I have been working with some technologies for taking HTML and XML information from the Web [see "Access the World's Biggest Database with Web DataBase Connectivity"]. It occurred to me that I could combine that work with this idea to build a talking Web browser. Such a browser would let you listen to information from your favorite websites, news headlines for example, just as you listen to the radio while you are out walking or driving. Of course, with current technology you would have to carry your laptop and your mobile phone around with you, but that impractical scenario may well change in the future with the arrival of Java-enabled smart phones such as the Nokia 9210 (also known as the 9290).
Perhaps more useful would be an email reader, made possible by the JavaMail API. Such a program would check your inbox periodically and attract your attention with a voice announcing, "You have new mail. Would you like me to read it to you?" In a similar vein, consider a talking reminder connected to your diary program: the computer calls out, "Don't forget your meeting with the boss in 10 minutes!"
Assuming these ideas appeal to you, or you have better ones of your own, let's move on. I will start by showing how to put the supplied zip file to work, so that if the implementation details seem like too much effort, you can simply get up and running and skip them.
Test-drive the speech engine

To use the speech engine, you need to add JW-0817-javatalk.zip to your CLASSPATH and run the com.lotontech.speech.Talker class, either from the command line or from within a Java program.
To run it from the command line, type:

java com.lotontech.speech.Talker "h|e|l|oo"

To run it from a Java program, simply include two lines of code:

com.lotontech.speech.Talker talker = new com.lotontech.speech.Talker();
talker.sayPhoneWord("h|e|l|oo");
At this point you are probably wondering about the format of the "h|e|l|oo" string that you supply on the command line or pass to the sayPhoneWord(...) method. Let me explain.
The speech engine works by joining together short sound samples that represent the smallest units of human speech, in this case English. These sound samples are called allophones, and each is labeled with a one-, two-, or three-letter identifier. Some identifiers are obvious and some are not, as you can see from the phonetic representation of the word "hello":

h - sounds as you would expect
e - sounds as you would expect
l - sounds as you would expect, but notice that I have reduced the double "l" to a single one
oo - the sound at the end of "hello", not the sound in "bot" and not the sound in "too"

Here is a list of the available allophones:
a - as in cat
b - as in cab
c - as in cat
d - as in dot
e - as in bet
f - as in frog
g - as in frog
h - as in hog
i - as in pig
j - as in jig
k - as in keg
l - as in leg
m - as in met
n - as in begin
o - as in not
p - as in pot
r - as in rot
s - as in sat
t - as in sat
u - as in put
v - as in have
w - as in wet
y - as in yet
z - as in zoo
aa - as in fake
ay - as in hay
ee - as in bee
ii - as in high
oo - as in go
bb - variation of b
dd - variation of d
ggg - variation of g
hh - variation of h
ll - variation of l
nn - variation of n
rr - variation of r
tt - variation of t
yy - variation of y
ar - as in car
aer - as in care
ch - as in which
ck - as in check
ear - as in beer
er - as in later
err - as in longer (longer sound)
ng - as in feeding
or - as in law
ou - as in zoo
ouu - as in zoo (longer sound)
ow - as in cow
oy - as in boy
sh - as in shut
th - as in thing
dth - as in this
uh - variation of u
wh - as in where
zh - as in Asian

When people speak, the pitch of the voice rises and falls over the course of a sentence. This intonation makes speech sound more natural and more expressive, and it lets a listener tell a question from a statement. If you have ever heard Stephen Hawking's synthetic voice, you will understand what I mean. Consider these two sentences:
It is fake - f|aa|k
Is it fake? - f|AA|k

As you may have guessed, the way to raise the intonation is to use capital letters. You will need to experiment a little; my tip is to concentrate on the vowel sounds.
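To make the string format concrete, here is a minimal sketch that splits a phonetic word on "|" and reports which allophones are written in capitals. It is not code from the supplied zip file: the class name PhoneWordInspector and its two methods are hypothetical, and the sketch says nothing about how the Talker class itself acts on capitals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.StringTokenizer;

// Hypothetical helper for illustration; not part of the article's zip file.
final class PhoneWordInspector {

    // Splits a phonetic word such as "f|AA|k" into its allophone identifiers.
    static List<String> split(String phoneWord) {
        List<String> allophones = new ArrayList<String>();
        StringTokenizer st = new StringTokenizer(phoneWord, "|", false);
        while (st.hasMoreTokens()) {
            allophones.add(st.nextToken());
        }
        return allophones;
    }

    // True when the identifier is written entirely in capital letters,
    // which the article uses to mark raised intonation.
    static boolean isRaised(String allophone) {
        return !allophone.isEmpty()
                && allophone.equals(allophone.toUpperCase(Locale.ROOT))
                && !allophone.equals(allophone.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        for (String allophone : split("f|AA|k")) {
            System.out.println(allophone + (isRaised(allophone) ? " (raised)" : ""));
        }
    }
}
```

Running main prints f, AA (raised), and k on separate lines, which is a quick way to check a phonetic string before handing it to the speech engine.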
That is all you need to know to use the software, but if you are interested in what is going on under the hood, read on.
Implement the speech engine

The speech engine needs just one class, with four methods. It uses the Java Sound API included in J2SE 1.3. I will not provide a comprehensive tutorial on the Java Sound API; instead, you will learn by example. You will find that there is not much to it, and the comments in the code tell you what you need to know.
Here is the basic definition of the Talker class:
package com.lotontech.speech;

import javax.sound.sampled.*;
import java.io.*;
import java.util.*;
import java.net.*;

public class Talker {
    // The output line that all sound samples are written to
    private SourceDataLine line = null;
}
If you run Talker from the command line, the main(...) method below serves as the entry point. It takes the first command-line argument, if there is one, and passes it to the sayPhoneWord(...) method:

/** This method speaks a phonetic word specified on the command line. */
public static void main(String args[]) {
    Talker player = new Talker();
    if (args.length > 0) player.sayPhoneWord(args[0]);
    System.exit(0);
}
The sayPhoneWord(...) method is called by main(...) above, or directly from your Java application or applet. It looks more complicated than it is. Essentially, it steps through the word's allophones, which are separated by "|" symbols in the input text, and plays them one by one through a sound output channel. To make the result sound more natural, the end of each sound sample is merged with the beginning of the next one.
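The merge step can be tried in isolation before reading the full method. The following is a minimal sketch rather than the article's code: the class SampleMerger and its method names are hypothetical, the overlap length is passed in as a parameter, and the merge is assumed to be a simple average of the overlapping bytes.

```java
import java.util.Arrays;

// Hypothetical helper illustrating the overlap-and-average idea.
final class SampleMerger {

    // Averages the last `overlap` bytes of `previous` with the first `overlap`
    // bytes of `next`, writing the result into `previous`. Returns the number
    // of bytes merged (0 when either array is shorter than `overlap`).
    static int mergeTail(byte[] previous, byte[] next, int overlap) {
        if (previous.length < overlap || next.length < overlap) {
            return 0;
        }
        int start = previous.length - overlap;
        for (int i = 0; i < overlap; i++) {
            previous[start + i] = (byte) ((previous[start + i] + next[i]) / 2);
        }
        return overlap;
    }

    // Returns a copy of `next` without the leading bytes that were already merged.
    static byte[] dropMerged(byte[] next, int merged) {
        byte[] rest = new byte[next.length - merged];
        System.arraycopy(next, merged, rest, 0, rest.length);
        return rest;
    }

    public static void main(String[] args) {
        byte[] previous = {10, 20, 30, 40};
        byte[] next = {50, 60, 70, 80};
        int merged = mergeTail(previous, next, 2);
        byte[] rest = dropMerged(next, merged);
        System.out.println(Arrays.toString(previous)); // [10, 20, 40, 50]
        System.out.println(Arrays.toString(rest));     // [70, 80]
    }
}
```

Averaging raw bytes is only an approximation when each sample occupies two bytes, so treat this as an illustration of the overlap idea rather than a finished cross-fade. The article's own sayPhoneWord(...) listing follows.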
/** This method speaks the given phonetic word. */
public void sayPhoneWord(String word) {
    // -- Set up a byte array for the previous sound --
    byte[] previousSound = null;

    // -- Split the input string into separate allophones --
    StringTokenizer st = new StringTokenizer(word, "|", false);
    while (st.hasMoreTokens()) {
        // -- Construct a file name for the allophone --
        String thisPhoneFile = st.nextToken();
        thisPhoneFile = "/allophones/" + thisPhoneFile + ".au";

        // -- Get the data from the file --
        byte[] thisSound = getSound(thisPhoneFile);

        if (previousSound != null) {
            // -- Merge the previous allophone with this one, if we can --
            int mergeCount = 0;
            if (previousSound.length >= 500 && thisSound.length >= 500)
                mergeCount = 500;
            for (int i = 0; i < mergeCount; i++) {
                // -- Average the overlapping bytes --
                int at = previousSound.length - mergeCount + i;
                previousSound[at] = (byte) ((previousSound[at] + thisSound[i]) / 2);
            }

            // -- Play the previous allophone --
            playSound(previousSound);

            // -- Set the truncated current allophone as the previous one --
            byte[] newSound = new byte[thisSound.length - mergeCount];
            for (int ii = 0; ii < newSound.length; ii++)
                newSound[ii] = thisSound[ii + mergeCount];
            previousSound = newSound;
        } else {
            previousSound = thisSound;
        }
    }

    // -- Play the final sound and drain the sound channel --
    playSound(previousSound);
    drain();
}

At the end of sayPhoneWord(...), you can see that it calls playSound(...) to output an individual sound sample and calls drain(...) to flush the sound channel. Here is the code for playSound(...):

/** This method plays a sound sample. */
private void playSound(byte[] data) {
    // Nothing to play if no word was given or the output line never opened
    if (line != null && data != null && data.length > 0)
        line.write(data, 0, data.length);
}

And here is drain(...):

/** This method flushes the sound channel. */
private void drain() {
    if (line != null) line.drain();
    try { Thread.sleep(100); } catch (Exception e) {}
}

Now, if you look back at the sayPhoneWord(...) method, you will see that there is one method I have not yet covered: getSound(...).

getSound(...) reads a prerecorded sound sample, as byte data, from an AU file. When I say file, I mean a resource held within the supplied zip file. I stress the distinction because you get hold of a JAR resource with the getResource(...) method, which works differently from the way you get hold of an ordinary file.

For a step-by-step account of reading the data, converting the sound format, instantiating a sound output line (why it is called a SourceDataLine, I do not know), and assembling the byte data, I refer you to the comments in the following code:

/** This method reads the file for a single allophone and constructs a byte array. */
private byte[] getSound(String fileName) {
    try {
        URL url = Talker.class.getResource(fileName);
        AudioInputStream stream = AudioSystem.getAudioInputStream(url);
        AudioFormat format = stream.getFormat();

        // -- Convert an ALAW/ULAW sound to PCM for playback --
        if ((format.getEncoding() == AudioFormat.Encoding.ULAW)
                || (format.getEncoding() == AudioFormat.Encoding.ALAW)) {
            AudioFormat tmpFormat = new AudioFormat(
                AudioFormat.Encoding.PCM_SIGNED,
                format.getSampleRate(), format.getSampleSizeInBits() * 2,
                format.getChannels(), format.getFrameSize() * 2,
                format.getFrameRate(), true);
            stream = AudioSystem.getAudioInputStream(tmpFormat, stream);
            format = tmpFormat;
        }

        DataLine.Info info = new DataLine.Info(
            Clip.class, format,
            ((int) stream.getFrameLength() * format.getFrameSize()));

        if (line == null) {
            // -- Output line not instantiated yet --
            // -- Can we find a suitable kind of line? --
            DataLine.Info outInfo = new DataLine.Info(SourceDataLine.class, format);
            if (!AudioSystem.isLineSupported(outInfo)) {
                System.out.println("Line matching " + outInfo + " not supported.");
                throw new Exception("Line matching " + outInfo + " not supported.");
            }

            // -- Open the source data line (the output line) --
            line = (SourceDataLine) AudioSystem.getLine(outInfo);
            line.open(format, 50000);
            line.start();
        }

        // -- Some size calculations --
        int frameSizeInBytes = format.getFrameSize();
        int bufferLengthInFrames = line.getBufferSize() / 8;
        int bufferLengthInBytes = bufferLengthInFrames * frameSizeInBytes;

        byte[] data = new byte[bufferLengthInBytes];

        // -- Read the data bytes and count them --
        int numBytesRead = 0;
        if ((numBytesRead = stream.read(data)) != -1) {
            int numBytesRemaining = numBytesRead;
        }

        // -- Truncate the byte array to the correct size --
        byte[] newData = new byte[numBytesRead];
        for (int i = 0; i < numBytesRead; i++)
            newData[i] = data[i];

        return newData;
    } catch (Exception e) {
        return new byte[0];
    }
}

So, that is it: a speech synthesizer in about 150 lines of code, including comments. But it is not quite over.

Text-to-speech conversion

Specifying words phonetically may seem rather tedious, so if you intend to build one of the example applications I suggested in the introduction, you will want to supply ordinary text as the input to be spoken.
After looking into this issue, I have provided an experimental text-to-speech conversion class in the zip file. When you run it, the output gives you an insight into what it is doing.

You can run the text-to-speech converter from the command line like this:

java com.lotontech.speech.Converter "hello there"

The output looks something like this:

hello -> h|e|l|oo
there -> dth|aer

Or run it like this:

java com.lotontech.speech.Converter "I like to read JavaWorld"

to see (and hear) this:

i -> ii
like -> l|ii|k
to -> t|uu
read -> r|ee|a|d
java -> j|a|v|a
world -> w|err|l|d

If you are wondering how it works, I can tell you that my approach is quite simple: a set of text replacement rules applied in a certain order. Here are some example rules that you might like to apply mentally, in order, to the words "ant", "want", "wanted", "unwanted", and "unique":

Replace "*unique*" with "|y|ou|n|ee|k|"
Replace "*want*" with "|w|o|n|t|"
Replace "*a*" with "|a|"
Replace "*e*" with "|e|"
Replace "*d*" with "|d|"
Replace "*n*" with "|n|"
Replace "*u*" with "|u|"
Replace "*t*" with "|t|"

For "unwanted", the sequence would be:

unwanted
un[|w|o|n|t|]ed (rule 2)
[|u|][|n|][|w|o|n|t|][|e|][|d|] (rules 4, 5, 6, 7)
u|n|w|o|n|t|e|d (with surplus characters removed)

You should see that words containing the letters "want" are spoken differently from words that contain only the letters "ant". You should also see that the special-case rule for the complete word "unique" takes precedence over the other rules, so that the word is spoken as y|ou... rather than u|n....

A ghost in the machine

This article has provided a simple, ready-to-use speech engine that runs with Java 1.3. If you study the code, you will pick up some useful hints on playing audio with the Java Sound API. To make the engine really useful, you will need to think about the text-to-speech conversion technique, which I have presented here only as an idea. For it to work well, you will have to come up with a large number of text replacement rules and apply them in a well-chosen priority order. I hope you have more stamina than I have.
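As a starting point for such a rule set, the ordered-replacement idea can be sketched in a few lines. This is a minimal sketch and not the Converter class from the zip file: the class name RuleSketch, the Segment helper, and the choice to protect already-converted text from later rules are all assumptions made for illustration, and only the eight example rules listed earlier are included.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of ordered text-replacement rules.
final class RuleSketch {

    // Rules in priority order: {letters to match, allophones to emit}.
    private static final String[][] RULES = {
        {"unique", "y|ou|n|ee|k"},
        {"want", "w|o|n|t"},
        {"a", "a"},
        {"e", "e"},
        {"d", "d"},
        {"n", "n"},
        {"u", "u"},
        {"t", "t"},
    };

    // A piece of the word: either raw letters or finished allophones.
    private static final class Segment {
        final String text;
        final boolean done;

        Segment(String text, boolean done) {
            this.text = text;
            this.done = done;
        }
    }

    // Applies every rule in order and joins the converted pieces with "|".
    static String convert(String word) {
        List<Segment> segments = new ArrayList<Segment>();
        segments.add(new Segment(word.toLowerCase(Locale.ROOT), false));
        for (String[] rule : RULES) {
            segments = apply(segments, rule[0], rule[1]);
        }
        StringBuilder out = new StringBuilder();
        for (Segment s : segments) {
            if (!s.done) continue; // letters that no rule covers are dropped
            if (out.length() > 0) out.append('|');
            out.append(s.text);
        }
        return out.toString();
    }

    // Replaces each occurrence of `match` inside the raw segments only,
    // so that earlier, higher-priority replacements are never rewritten.
    private static List<Segment> apply(List<Segment> in, String match, String allophones) {
        List<Segment> out = new ArrayList<Segment>();
        for (Segment s : in) {
            if (s.done) {
                out.add(s);
                continue;
            }
            int from = 0;
            int at;
            while ((at = s.text.indexOf(match, from)) >= 0) {
                if (at > from) out.add(new Segment(s.text.substring(from, at), false));
                out.add(new Segment(allophones, true));
                from = at + match.length();
            }
            if (from < s.text.length()) out.add(new Segment(s.text.substring(from), false));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(convert("unwanted"));
        System.out.println(convert("unique"));
        System.out.println(convert("ant"));
    }
}
```

With these rules, convert("unwanted") yields u|n|w|o|n|t|e|d, matching the worked sequence, and convert("unique") yields y|ou|n|ee|k. Letters that no rule covers are silently dropped in this sketch, so a fuller rule set would need to account for every letter.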
Finally, you may remember the Nokia 9210 that I mentioned earlier. I have one, it supports Java, and I have tried to make it speak with Java. I have also tried to make an applet talk in a browser, using a version of Java that predates Java 2. The technique presented here relies on the J2SE 1.3 Java Sound API, so it is not suitable for those applications. A different approach is required, based on the simpler Java AudioClip interface. It is not as easy as you might think, but I am working on it.

About the author

Tony Loton works through his company, Lotontech Limited, to provide software solutions, consultancy, training, and technical writing services. The writing bug seems to have bitten him this year, and he has written for John Wiley & Sons and Wrox Press.

About the translator

Cocia Lin (cocia@163.com) is a programmer with a bachelor's degree who now specializes in Java-related technologies and is just starting out in the computer field.