An Application Example of the WAV File Format
Multimedia technology has developed rapidly in recent years, and better-quality sound cards now offer 16-bit stereo playback and recording at 44 kHz. Such cards reproduce the original sound well, and their synthesized sound quality is also very good. Some sound cards additionally carry a digital signal processor; a programmable DSP has considerable computing power and can be used to compress sound data and to produce certain special effects. The sound data in WAV files recorded with such a card is good enough to meet the requirements of voice feature recognition.

1.1 RIFF files and the WAV file format

In a Windows environment, most multimedia files are stored in a common structure called the "Resource Interchange File Format", abbreviated RIFF. The WAV file for sound and the AVI file for video, among others, are derived from this structure. RIFF can be viewed as a tree whose basic unit is the chunk, which corresponds to a node of the tree. Each chunk consists of an "identification code", a "data size", and the "data". The identification code is made up of four ASCII characters. The data size gives the length of the data in bytes and itself occupies four bytes, so the total length of a chunk is the data size plus 8.

In general a chunk may not contain other chunks, but there are two exceptions: chunks whose identification code is "RIFF" or "LIST". For these two kinds, RIFF sets aside the first four bytes of the original "data"; these four bytes are called the "format discrimination code". RIFF also specifies that a file may contain only one chunk with "RIFF" as its identification code. Any file that follows this structure is called a RIFF file. The structure provides a systematic way of classifying data.
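The chunk layout described above can be sketched in C as follows. This is only a minimal illustration, not Microsoft's own functions: the helper names read_le32 and riff_find_chunk are hypothetical, and the code assumes that the whole file is already in memory, that sizes are stored little-endian, and that chunk data of odd length is followed by one pad byte (RIFF conventions that the text above does not spell out).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value (assumed byte order of RIFF sizes). */
static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: search the single "RIFF" chunk held in buf for a
 * sub-chunk with the given four-character identification code.
 * On success, returns the offset of the sub-chunk's data and stores the
 * data size in *size. Returns -1 if buf is not a RIFF file or the
 * sub-chunk is not found.
 */
static long riff_find_chunk(const unsigned char *buf, size_t len,
                            const char *id, uint32_t *size)
{
    size_t pos, end;
    uint32_t riff_size;

    assert(buf != NULL && id != NULL && size != NULL);
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0)
        return -1;

    /* Total chunk length = data size + 8 (identification code + size). */
    riff_size = read_le32(buf + 4);
    end = len;                          /* tolerate a truncated file */
    if (riff_size <= len - 8)
        end = (size_t)riff_size + 8;

    /* Sub-chunks start after the 4-byte format discrimination code. */
    pos = 12;
    while (pos + 8 <= end) {
        uint32_t n = read_le32(buf + pos + 4);
        if (n > end - pos - 8)
            return -1;                  /* size runs past the parent chunk */
        if (memcmp(buf + pos, id, 4) == 0) {
            *size = n;
            return (long)(pos + 8);
        }
        pos += 8 + (size_t)n + (n & 1); /* skip data plus optional pad byte */
    }
    return -1;
}
```

Calling riff_find_chunk(buf, len, "data", &n) on a loaded file would return the offset of the data that follows the identification code and data size, or -1 if no such chunk exists. The sketch does not descend into "LIST" chunks; it only walks the first level below "RIFF".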
Compared with the MS-DOS file system, the "RIFF" chunk is like the root directory of a hard disk and its format discrimination code is like the drive letter (C: or D:); a "LIST" chunk is a subdirectory, and the other chunks are ordinary files. Microsoft provides functions for processing RIFF files. Each multimedia file format under Windows then works like a rule that says which directories a disk must have and which files may be placed in each directory.

WAV is short for waveform. The structure of a sound file is shown in Figure 1; the format discrimination code of its "RIFF" chunk is "WAVE". The file consists of two chunks, with the identification codes "fmt " (note that the last character is a space) and "data". The "fmt " chunk contains a PCMWAVEFORMAT data structure, which is defined as follows:

    typedef struct waveformat_tag {
        WORD  wFormatTag;
        WORD  nChannels;
        DWORD nSamplesPerSec;
        DWORD nAvgBytesPerSec;
        WORD  nBlockAlign;
    } WAVEFORMAT;

    typedef struct pcmwaveformat_tag {
        WAVEFORMAT wf;
        WORD       wBitsPerSample;
    } PCMWAVEFORMAT;

The fields have the following meanings:

wFormatTag: the format code of the recorded sound, such as WAVE_FORMAT_PCM or WAVE_FORMAT_ADPCM.
nChannels: the number of channels recorded.
nSamplesPerSec: the number of samples per second.
nAvgBytesPerSec: the amount of data per second, in bytes.
nBlockAlign: the alignment unit of a block.
wBitsPerSample: the number of bits needed for each sample.

The "data" chunk contains the actual sound data. Windows currently provides only the WAVE_FORMAT_PCM data format, that is, pulse code modulation (PCM). For this format, Windows defines how the data is stored in the "data" chunk; Figure 2 lists four combinations of channel count and bits per sample, together with the position of each byte.

    "RIFF"
    xxxx
    "WAVE"
    "fmt "
    sizeof(PCMWAVEFORMAT)
    struct of PCMWAVEFORMAT
    "data"
    xxxx
    wave form data

Figure 1 WAV file structure

    nChannels = 1, wBitsPerSample = 8:    channel 0 | channel 0 | channel 0 | channel 0
    nChannels = 2, wBitsPerSample = 8:    channel 0 (left) | channel 1 (right) | channel 0 (left) | channel 1 (right)
    nChannels = 1, wBitsPerSample = 16:   channel 0 (low) | channel 0 (high) | channel 0 (low) | channel 0 (high)
    nChannels = 2, wBitsPerSample = 16:   channel 0 (left, low) | channel 0 (left, high) | channel 1 (right, low) | channel 1 (right, high)

Figure 2 Arrangement of sample data in a PCM file

In Figure 2, the first row is 8-bit mono, the second row is 8-bit stereo, the third row is 16-bit mono, and the fourth row is 16-bit stereo. An 8-bit sample expresses the volume with 8 bits, and a 16-bit sample expresses it with 16 bits. In theory, 8 bits can represent 0 to 255 and 16 bits can represent 0 to 65535, but Windows defines 16-bit samples to range from -32768 to 32767. Note also that the value 0 does not necessarily represent silence. Silence is the middle value of the range: 128 for 8-bit samples and 0 for 16-bit samples.
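The field meanings, byte layout, and silence values above can be tied together in a short C sketch. Everything named here is illustrative rather than part of the Windows headers: PcmFormat mirrors the PCMWAVEFORMAT fields with fixed-width types, FORMAT_PCM stands in for WAVE_FORMAT_PCM (value 1), and the three helper functions are hypothetical. The sketch also assumes the usual PCM relations, namely that nBlockAlign is the channel count times the bytes per sample and that nAvgBytesPerSec is the sample rate times nBlockAlign.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FORMAT_PCM 1   /* stands in for WAVE_FORMAT_PCM */

/* Illustrative mirror of the PCMWAVEFORMAT fields (WORD = 16 bits,
   DWORD = 32 bits). Not the Windows type itself. */
typedef struct {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
} PcmFormat;

/* Fill in a PCM format, deriving the block size and data rate. */
static void pcm_format_init(PcmFormat *f, uint16_t channels,
                            uint32_t rate, uint16_t bits)
{
    assert(f != NULL && (bits == 8 || bits == 16));
    f->wFormatTag = FORMAT_PCM;
    f->nChannels = channels;
    f->nSamplesPerSec = rate;
    f->wBitsPerSample = bits;
    f->nBlockAlign = (uint16_t)(channels * (bits / 8));
    f->nAvgBytesPerSec = rate * f->nBlockAlign;
}

/* Write silence: 128 for 8-bit samples, 0 for 16-bit samples. */
static void pcm_fill_silence(unsigned char *buf, size_t bytes,
                             const PcmFormat *f)
{
    assert(buf != NULL && f != NULL);
    memset(buf, f->wBitsPerSample == 8 ? 0x80 : 0x00, bytes);
}

/* Return one sample centered on 0, so that silence is 0 in both formats.
   frame counts blocks of nBlockAlign bytes; ch is the channel index. */
static int pcm_sample(const unsigned char *data, const PcmFormat *f,
                      size_t frame, unsigned ch)
{
    const unsigned char *p = data + frame * f->nBlockAlign
                                  + ch * (f->wBitsPerSample / 8u);
    int v;

    if (f->wBitsPerSample == 8)
        return (int)p[0] - 128;          /* unsigned, midpoint 128 */
    v = p[0] | (p[1] << 8);              /* low byte first, then high */
    return v >= 32768 ? v - 65536 : v;   /* two's complement to signed */
}
```

With 16-bit stereo at 44100 samples per second, pcm_format_init yields a block of 4 bytes and a data rate of 176400 bytes per second. Converting both sample widths to a range centered on 0 means later processing does not need to know which silence value the file used.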
Therefore, a program that writes silent data must pay special attention to whether the sound format is 8-bit or 16-bit and fill in the appropriate value.

1.2 A specific application of WAV files

A WAV file contains high-rate samples of the original sound, stored in the WAVE_FORMAT_PCM pulse code format. In a Visual C program, once the WaveHDR file header has been read, what follows is the high-rate sample data of the original sound, which we can process in several ways.

1.2.1 Waveform display

We can display the waveform of the original sound as amplitude against time, which is the simplest and most direct form of processing. In the time domain we can observe whether the signal waveform is continuous or whether there is a jump in the middle.

1.2.2 Spectrum display

We can display the spectrum of the original sound as amplitude against frequency. After the original signal is passed through an FFT, we can obtain the band in which the signal's energy is concentrated, its distribution characteristics, its spectral symmetry coefficient, and so on.
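For the time-domain processing in 1.2.1, the following C sketch shows one possible way to prepare samples for display and to look for a jump. It is an assumption-laden illustration, not the article's program: the function names envelope and largest_jump are hypothetical, the samples are assumed to have been converted already to signed integers centered on 0, and drawing itself is left to the caller. The spectrum display of 1.2.2 is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Reduce n samples to `cols` (minimum, maximum) pairs, one per screen
 * column, so that a long recording can be drawn as vertical lines.
 * Requires n >= cols so that every column receives at least one sample.
 */
static void envelope(const int *s, size_t n, size_t cols, int *lo, int *hi)
{
    size_t c, i;

    assert(s != NULL && lo != NULL && hi != NULL);
    assert(cols > 0 && n >= cols);
    for (c = 0; c < cols; c++) {
        size_t first = c * n / cols;
        size_t last = (c + 1) * n / cols;   /* one past the column's end */
        lo[c] = hi[c] = s[first];
        for (i = first + 1; i < last; i++) {
            if (s[i] < lo[c]) lo[c] = s[i];
            if (s[i] > hi[c]) hi[c] = s[i];
        }
    }
}

/*
 * Return the largest absolute difference between adjacent samples and
 * store in *where the index of the later sample of that pair.
 * A value far above the typical step suggests a discontinuity.
 */
static int largest_jump(const int *s, size_t n, size_t *where)
{
    int best = 0;
    size_t i;

    assert(s != NULL && where != NULL);
    *where = 0;
    for (i = 1; i < n; i++) {
        int d = abs(s[i] - s[i - 1]);
        if (d > best) {
            best = d;
            *where = i;
        }
    }
    return best;
}
```

A caller would draw one vertical line from lo[c] to hi[c] for each column c, and could compare the result of largest_jump with a threshold of its own choosing to flag a suspected jump. What counts as a jump depends on the signal, so no threshold is fixed here.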