Audio processing under Windows can be roughly divided into two parts: audio input/output, and ACM compression processing.
Under normal circumstances, an API such as sndPlaySound, or the MCI, can be called to play a WAV file, but obviously that is not enough for what we need to do here: the audio stream must be processed directly. For this, Windows provides another series of APIs, the set whose names begin with waveIn and waveOut.
Let's talk about audio input first. The commonly used APIs are waveInOpen (open an audio input device), waveInPrepareHeader (prepare the header of an input buffer), waveInAddBuffer (add an input data buffer), waveInStart (start recording), waveInClose, and a few others. In addition, a callback function or thread must be specified in waveInOpen; it is invoked after a data buffer has been filled with recorded data, so that the data can be processed.
First of all, you have to decide which callback mode you need: after the audio data of a certain time slice has been recorded, Windows notifies you through this callback so the data can be processed. The modes generally used are FUNCTION, THREAD and EVENT, of which FUNCTION and THREAD are the more convenient and simple. FUNCTION mode means that Windows calls a function of yours, while in THREAD mode Windows notifies the thread you specify by posting messages to it. All of this is specified in waveInOpen, whose prototype is:
MMRESULT waveInOpen(LPHWAVEIN phwi, UINT uDeviceID, LPWAVEFORMATEX pwfx, DWORD dwCallback, DWORD dwCallbackInstance, DWORD fdwOpen);
Here phwi is the address where the returned handle is stored; uDeviceID is the ID of the audio device to open, generally given as WAVE_MAPPER; dwCallback is the address of the callback function, or the ID of the callback thread; fdwOpen specifies the callback mode; and dwCallbackInstance is a user parameter passed through to the callback function or thread. As for pwfx, it is the key: it specifies the audio format with which the audio input device is opened. It points to a WAVEFORMATEX structure:
typedef struct {
    WORD  wFormatTag;
    WORD  nChannels;
    DWORD nSamplesPerSec;
    DWORD nAvgBytesPerSec;
    WORD  nBlockAlign;
    WORD  wBitsPerSample;
    WORD  cbSize;
} WAVEFORMATEX;
If "Audio Compression" was selected when Win9x was installed on the machine, some compressed audio formats can be specified in wFormatTag, such as G.723.1, DSP Group TrueSpeech, and the like. Generally, however, the WAVE_FORMAT_PCM format, i.e. uncompressed audio, is chosen; compression can then be done after recording is complete by calling the ACM, which is discussed separately below.
nChannels is the number of channels, 1 or 2. nSamplesPerSec is the number of samples per second; 8000, 11025, 22050 and 44100 are the standard values (I have not tried other, non-standard values). nAvgBytesPerSec is the average number of bytes per second. In PCM mode it equals nChannels * nSamplesPerSec * wBitsPerSample / 8, but for compressed audio formats it is only an approximate figure, because many compression methods work in units of time; G.723.1, for example, compresses in 30 ms units, so nAvgBytesPerSec is just a rough number, and calculations in a program should not depend on it. This point is very important for the compressed audio output and ACM audio compression discussed below. nBlockAlign is a rather special value indicating the minimum processing unit of the audio data. For uncompressed PCM it is wBitsPerSample * nChannels / 8; for compressed formats it indicates the minimum unit of compression/decompression processing, e.g. the data size of 30 ms for G.723.1 (20 or 24 bytes). wBitsPerSample is the number of bits per sample value, 8 or 16. cbSize indicates how many extra bytes follow the standard WAVEFORMATEX header: many non-PCM audio formats have format parameters of their own definition, which are appended after the standard WAVEFORMATEX, and their size is specified by cbSize. For the PCM format it is 0, or simply ignored.

Once these parameters have been specified, you can open the audio input device. The next thing to do is to prepare a few buffers for the recording; usually several buffers are prepared and cycled through in the callback. You must also think about where the recorded audio data goes, for example a temporary file, in which case you have to prepare the file handle. Each buffer's header must be prepared with waveInPrepareHeader. This API is fairly simple, and if the buffers are used cyclically, one waveInPrepareHeader call per buffer is enough.
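To make this concrete, here is a minimal sketch in C (error handling mostly omitted) of opening the default input device for 8 kHz, 16-bit, mono PCM and preparing a ring of 8 buffers. The names NUM_BUF, g_hWaveIn, g_hdr, g_buf and waveInProc are invented for this illustration, and the 125 ms buffer length is an arbitrary choice; recent SDKs declare the callback parameters of waveInOpen as DWORD_PTR, which the cast below follows.

#include <windows.h>
#include <mmsystem.h>     /* link with winmm.lib */
#include <stdlib.h>

#define NUM_BUF 8
#define BUF_MS  125       /* arbitrary buffer duration for this sketch */

static HWAVEIN g_hWaveIn;
static WAVEHDR g_hdr[NUM_BUF];
static char   *g_buf[NUM_BUF];

void CALLBACK waveInProc(HWAVEIN, UINT, DWORD_PTR, DWORD_PTR, DWORD_PTR);

/* Open the wave mapper for 8 kHz 16-bit mono PCM and prepare the buffers. */
BOOL OpenRecording(void)
{
    WAVEFORMATEX wfx;
    DWORD cb;
    int i;

    wfx.wFormatTag      = WAVE_FORMAT_PCM;
    wfx.nChannels       = 1;
    wfx.nSamplesPerSec  = 8000;
    wfx.wBitsPerSample  = 16;
    wfx.nBlockAlign     = wfx.nChannels * wfx.wBitsPerSample / 8;
    wfx.nAvgBytesPerSec = wfx.nSamplesPerSec * wfx.nBlockAlign;
    wfx.cbSize          = 0;   /* no extra format bytes for PCM */

    if (waveInOpen(&g_hWaveIn, WAVE_MAPPER, &wfx,
                   (DWORD_PTR)waveInProc, 0, CALLBACK_FUNCTION) != MMSYSERR_NOERROR)
        return FALSE;

    cb = wfx.nAvgBytesPerSec * BUF_MS / 1000;     /* bytes per buffer */
    for (i = 0; i < NUM_BUF; i++) {
        g_buf[i] = (char *)malloc(cb);
        ZeroMemory(&g_hdr[i], sizeof(WAVEHDR));
        g_hdr[i].lpData         = g_buf[i];
        g_hdr[i].dwBufferLength = cb;
        waveInPrepareHeader(g_hWaveIn, &g_hdr[i], sizeof(WAVEHDR));
    }
    return TRUE;
}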
After everything is ready, you can call waveInAddBuffer and waveInStart to start recording. Once waveInStart has been called, recording begins, and even if all the submitted buffers have been filled and you have not added new ones, recording does not stop; the speech data in between is simply lost. When a buffer submitted with waveInAddBuffer has been filled, Windows invokes the callback you specified in waveInOpen; there you take out the recorded voice data and, if you want recording to continue, add the next buffer. Since this processing involves some delay, and audio is very sensitive to it, several buffers must be prepared in advance. For example, define 8 buffers in total and, to be safe, make sure that at least 3 buffers are always waiting to be filled: when recording starts, add the first 4 buffers, and then in the callback, when buffer n completes, submit buffer (n+4)%8 with waveInAddBuffer. At that moment buffers (n+1)%8, (n+2)%8 and (n+3)%8 are still queued, which basically guarantees that the recorded audio has no gaps.
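Continuing the names from the sketch above, starting and sustaining the recording might look as follows; ProcessAudio and g_recording are placeholders for your own data handling and stop flag. Note that MSDN cautions against calling most waveform functions from inside a CALLBACK_FUNCTION callback; a CALLBACK_THREAD that handles MM_WIM_DATA messages does the same job without that restriction.

extern void ProcessAudio(char *data, DWORD cb);   /* placeholder: your handling */
extern volatile BOOL g_recording;                 /* placeholder: stop flag */

/* To start: queue the first 4 of the 8 buffers, then begin recording. */
void StartRecording(void)
{
    int i;
    for (i = 0; i < 4; i++)
        waveInAddBuffer(g_hWaveIn, &g_hdr[i], sizeof(WAVEHDR));
    waveInStart(g_hWaveIn);
    /* even if all queued buffers fill up, recording keeps running;
       the sound in between is simply lost */
}

void CALLBACK waveInProc(HWAVEIN hwi, UINT uMsg, DWORD_PTR dwInstance,
                         DWORD_PTR dwParam1, DWORD_PTR dwParam2)
{
    static int n = 0;                 /* buffers complete in the order added */
    WAVEHDR *ph = (WAVEHDR *)dwParam1;

    if (uMsg != WIM_DATA)             /* ignore WIM_OPEN / WIM_CLOSE */
        return;

    ProcessAudio(ph->lpData, ph->dwBytesRecorded);

    /* re-submit buffer (n+4)%8, so (n+1)%8 .. (n+3)%8 stay queued */
    if (g_recording)
        waveInAddBuffer(hwi, &g_hdr[(n + 4) % NUM_BUF], sizeof(WAVEHDR));
    n = (n + 1) % NUM_BUF;
}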
When you want to end recording, it is best to call waveInReset before waveInClose; this flushes the buffers that are still waiting to be filled, so in the callback you must check the message type parameter (the flushed buffers arrive as WIM_DATA too, followed by WIM_CLOSE when the device closes). The audio output side is relatively simple. The corresponding APIs are waveOutOpen, waveOutPrepareHeader, waveOutWrite and waveOutClose. If you want to output audio in a compressed format directly, pay attention to the audio format parameters specified in waveOutOpen: you must know the specific parameters of that kind of format and their meaning. You can, however, obtain the exact parameters of the audio format you need through the ACM (Audio Compression Manager) and use them directly in waveOutOpen. waveOutPrepareHeader is required just as on the input side, and waveOutWrite submits a filled output buffer. To avoid interruptions, here too you should make sure that enough buffers are queued at any given moment.
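For the output side, here is a minimal synchronous playback sketch, assuming pwfx describes the data in the buffer; a real program would keep several output buffers queued with waveOutWrite, just as on the input side, instead of busy-waiting as here.

/* Play one already-filled buffer and wait for it to finish. */
void PlayOnce(const WAVEFORMATEX *pwfx, char *data, DWORD cb)
{
    HWAVEOUT hwo;
    WAVEHDR  hdr;

    if (waveOutOpen(&hwo, WAVE_MAPPER, pwfx, 0, 0, CALLBACK_NULL) != MMSYSERR_NOERROR)
        return;

    ZeroMemory(&hdr, sizeof(hdr));
    hdr.lpData         = data;
    hdr.dwBufferLength = cb;
    waveOutPrepareHeader(hwo, &hdr, sizeof(WAVEHDR));
    waveOutWrite(hwo, &hdr, sizeof(WAVEHDR));

    while (!(hdr.dwFlags & WHDR_DONE))    /* crude wait; real code uses a callback */
        Sleep(10);

    waveOutUnprepareHeader(hwo, &hdr, sizeof(WAVEHDR));
    waveOutClose(hwo);
}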
If, when Win98 was installed, "Audio Compression" was selected among the accessories, then the ACM on the machine can be used. ACM is the Audio Compression Manager: Win98 packages some common audio compression algorithms for applications to call. All audio compression drivers on the machine, and the audio formats they provide, can be obtained through the ACM. Note, however, that although it seems every ACM format can be called on to compress, most of the compression drivers in the ACM are aimed at the voice band; if they are used to compress wider-band audio such as music, the result is very poor.
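As a sketch of what such a compression call might look like with the ACM stream API (the target format in pwfxDst would come from enumerating formats, e.g. with acmFormatEnum or acmFormatSuggest; error handling is largely omitted):

#include <msacm.h>        /* link with msacm32.lib in addition to winmm.lib */

/* Convert one buffer of PCM data to the compressed format described by
   pwfxDst; returns the number of bytes produced, or 0 on failure. */
DWORD CompressBuffer(WAVEFORMATEX *pwfxSrc, WAVEFORMATEX *pwfxDst,
                     BYTE *src, DWORD cbSrc, BYTE *dst, DWORD cbDstMax)
{
    HACMSTREAM      has;
    ACMSTREAMHEADER ash;
    DWORD           cbDst = 0;

    /* NULL driver handle: let the ACM pick any driver that can convert */
    if (acmStreamOpen(&has, NULL, pwfxSrc, pwfxDst, NULL,
                      0, 0, ACM_STREAMOPENF_NONREALTIME) != MMSYSERR_NOERROR)
        return 0;

    /* ask how many destination bytes cbSrc source bytes can produce */
    acmStreamSize(has, cbSrc, &cbDst, ACM_STREAMSIZEF_SOURCE);
    if (cbDst > cbDstMax) { acmStreamClose(has, 0); return 0; }

    ZeroMemory(&ash, sizeof(ash));
    ash.cbStruct    = sizeof(ash);
    ash.pbSrc       = src;
    ash.cbSrcLength = cbSrc;
    ash.pbDst       = dst;
    ash.cbDstLength = cbDst;

    acmStreamPrepareHeader(has, &ash, 0);
    acmStreamConvert(has, &ash, 0);        /* synchronous conversion */
    acmStreamUnprepareHeader(has, &ash, 0);
    acmStreamClose(has, 0);

    return ash.cbDstLengthUsed;            /* bytes actually written to dst */
}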