H.263 introduction


H.263 video coding

Introduction to the ITU-T H.263 video compression standard: concepts, characteristics and implementation

1. Introduction

The H.263 standard is published by the International Telecommunication Union (ITU). It provides video compression (coding) for video conferencing and video telephony applications. In this guide we introduce the concepts and characteristics of H.263 and describe some implementation options.

2. Applications

Video conferencing and video telephony have a wide range of applications, including:

desktop and room-based conferencing systems; video communication over the Internet or telephone lines; electronic surveillance and monitoring; remote working; medicine (remote consultation and diagnosis)

In each of these applications, video information (perhaps together with audio information) is transmitted over a telecommunications link: a network, telephone line, ISDN channel or broadcast channel. Raw video requires a very high bandwidth (many bytes per second), so these applications need the video to be compressed or encoded before transmission to reduce the bandwidth required.

3. Video coding

Frames of video information are captured from the source and encoded by a video encoder. The compressed stream is transmitted over the network or telecommunications link and decoded by a video decoder. The decoded frames can then be displayed.

4. The H.263 system

There are many video coding standards, each designed for a particular class of application: for example, JPEG is designed for still images, MPEG-2 for digital television signals, and H.261 for ISDN video conferencing. H.263 is designed specifically for video coding at low bit rates (typically 20-30 kbit/s and above).

The H.263 standard specifies the requirements for a video encoder and decoder. It does not describe the encoder and decoder themselves: instead, it specifies the format and content of the encoded stream. A typical encoder and decoder are described below. We skip many details of H.263, such as the syntax and the optional coding modes.

4.1 H.263 Encoder

Motion estimation and compensation

The first step in reducing the bandwidth is to subtract the previously transmitted frame from the current frame, so that only the difference, or residual, is encoded and transmitted. Content that has not changed between frames is therefore not encoded at all. A higher compression ratio is achieved by estimating the motion of content between frames and compensating for that motion. The motion estimation module compares each 16x16-pixel block (macroblock) of the current frame with areas of the previous frame and tries to find a matching area. The matching area is subtracted from the current macroblock by the motion compensation module. If the motion estimation and compensation process is effective, the residual macroblock should contain very little information.
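The block-matching idea can be sketched in Python (a minimal illustration, not the H.263 algorithm itself: the function names, the +/-7 search radius and the toy frames are our own choices):

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two blocks."""
    return np.abs(a.astype(int) - b.astype(int)).sum()

def full_search(ref, cur, top, left, radius=7, block=16):
    """Find the motion vector for one macroblock.

    Exhaustively compares the current 16x16 macroblock against every
    candidate position within +/-radius pixels in the reference frame
    and returns the (dy, dx) offset with the smallest SAD.
    """
    target = cur[top:top + block, left:left + block]
    best = (0, 0)
    best_cost = sad(ref[top:top + block, left:left + block], target)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue
            cost = sad(ref[y:y + block, x:x + block], target)
            if cost < best_cost:
                best_cost, best = cost, (dy, dx)
    return best, best_cost

# Toy frames: a bright square in the reference moves 2 pixels down and
# 3 pixels right in the current frame.
ref = np.zeros((48, 48), dtype=np.uint8)
ref[10:20, 10:20] = 200
cur = np.zeros((48, 48), dtype=np.uint8)
cur[12:22, 13:23] = 200

mv, cost = full_search(ref, cur, top=8, left=8)
```

With an exact match in range, the residual after subtracting the matched area is all zeros, which is what makes the subsequent transform and entropy coding stages so effective.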

Discrete cosine transform (DCT)

The DCT transforms a block of pixel values (or residual values) into a set of spatial frequency coefficients, analogous to the way the fast Fourier transform (FFT) converts a signal from the time domain to the frequency domain. The DCT operates on a two-dimensional block of pixels (rather than a one-dimensional signal) and is particularly good at compacting the energy in the block into a small number of coefficients. This means that only a few DCT coefficients are needed to recreate a recognisable copy of the original block of pixels.
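The transform can be illustrated with a minimal orthonormal 8x8 DCT in Python (a sketch only; real codecs use fast factorizations rather than plain matrix products):

```python
import numpy as np

N = 8
# Orthonormal DCT-II basis matrix: row k holds the k-th cosine basis vector.
k = np.arange(N).reshape(-1, 1)
n = np.arange(N).reshape(1, -1)
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
C[0, :] = np.sqrt(1.0 / N)

def dct2(block):
    """Forward 2-D DCT of an 8x8 block (separable: rows, then columns)."""
    return C @ block @ C.T

def idct2(coeffs):
    """Inverse 2-D DCT; C is orthonormal, so its transpose is its inverse."""
    return C.T @ coeffs @ C

# A smooth gradient block: its energy compacts into a few low-order
# coefficients, dominated by the DC term.
block = np.add.outer(np.arange(8), np.arange(8)).astype(float)
coeffs = dct2(block)
rebuilt = idct2(coeffs)
```

For this smooth block the DC coefficient alone carries most of the block's energy, which is exactly the compaction property the text describes.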

Quantization

For a typical block of pixels, most of the coefficients produced by the DCT are close to zero. The quantizer module reduces the precision of each coefficient so that the near-zero coefficients become exactly zero and only a few significant non-zero coefficients remain. In practice, this is done by dividing each coefficient by an integer scale factor and truncating the result. It is important to realise that the quantizer "throws away" information.
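A minimal sketch of the divide-and-truncate idea in Python (the step size and coefficient values are illustrative, not taken from the H.263 quantizer definition):

```python
import numpy as np

def quantize(coeffs, step):
    """Divide each DCT coefficient by the step size and truncate toward
    zero; near-zero coefficients become exactly 0 and cost almost
    nothing to encode."""
    return np.fix(coeffs / step).astype(int)

def rescale(levels, step):
    """Inverse quantization: multiply the integer levels back up.
    The fraction discarded by quantize() is lost for good."""
    return levels * step

# A typical run of DCT output: one large coefficient, then a rapid decay.
coeffs = np.array([-310.0, 45.2, -12.7, 6.1, -2.4, 1.1, 0.4, -0.2])
levels = quantize(coeffs, step=8)
approx = rescale(levels, step=8)
```

Note that rescaling only recovers an approximation of the original coefficients; the difference is the quantization error that makes the whole scheme lossy.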

Entropy coding

An entropy encoder (such as a Huffman encoder) represents frequently occurring values with short binary codes and infrequent values with longer binary codes. Entropy coding in H.263 is based on this technique and is used to compress the quantized DCT coefficients. The result is a sequence of variable-length binary codes. These codes are combined with synchronization and control information (such as the motion vectors needed to reconstruct the motion-compensated reference areas) to form the encoded H.263 bit stream.
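A generic Huffman coder illustrates the principle (this is not the H.263 variable-length code table, which is fixed by the standard; here the code is derived from the symbol statistics):

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a Huffman code: frequent symbols get short codewords,
    rare symbols get long ones. Returns {symbol: bitstring}."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreaker, {symbol: code-so-far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Prefix '0' onto one subtree's codes and '1' onto the other's.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, count, merged))
        count += 1
    return heap[0][2]

# Quantized coefficients are dominated by zeros, so 0 gets the shortest code.
levels = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, -1, 0, 0, 2]
code = huffman_code(levels)
bits = "".join(code[s] for s in levels)
```

For these 16 symbols the Huffman stream needs 21 bits, against 32 bits for a fixed 2-bit code, because the all-zero runs collapse to single bits.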

Frame store

The current frame must be stored so that it can be used as a reference when the next frame is encoded. Instead of simply storing the current frame, the encoder rescales the quantized coefficients, applies the inverse DCT, and adds the motion-compensated reference areas to reconstruct the frame that is placed in the frame store. This ensures that the contents of the frame store at the encoder are identical to the contents of the frame store at the decoder. When the next frame is encoded, the motion estimator uses the contents of the frame store to determine the best matching area for motion compensation.

4.2 H.263 Decoder

Entropy decoding

The variable-length codes that make up the H.263 stream are decoded in order to extract the coefficient values and the motion vector information.

Rescaling

This is the reverse of the quantization process: the coefficients are multiplied by the same scale factor used by the quantizer at the encoder. However, because the quantizer discarded the fractional part (and the small coefficients), the rescaled coefficients are no longer identical to the original coefficients.

Inverse DCT

The IDCT reverses the DCT process. It reconstructs a block of sample values: these correspond to the difference values produced by motion compensation at the encoder.

Motion compensation

The difference values are added to a reconstructed area from the previous frame. The motion vector information is used to select the correct area (the same reference area used by the encoder). The result is a reconstruction of the original frame: note that it will not be identical to the original, because the quantization process is lossy, so the image quality is poorer than that of the original frame. The reconstructed frame is placed in a frame store and is used for motion compensation of the next received frame.

5. Implementation

5.1 Real-time video communication

To develop a video encoder and decoder that work effectively under real-time conditions, a number of problems must be addressed, including:

Rate control

A practical communication channel is limited in the number of bits it can carry per second. In many cases (such as POTS and ISDN) the bit rate is fixed.

A basic H.263 encoder produces a varying number of bits for each encoded frame. If the motion estimation/compensation process works well, few non-zero coefficients remain to be encoded. However, if motion estimation works poorly (for example, when the video scene contains complex motion), there are many non-zero coefficients to encode and the bit count rises.

To map this variable bit rate onto a CBR (constant bit rate) channel, the encoder must perform rate control. The encoder measures its output bit rate: if it is too high, it increases the quantizer scale factor, which raises the compression ratio (fewer bits) but lowers the picture quality at the decoder. If the output rate falls too low, the encoder reduces the quantizer scale factor, giving less compression (more bits) and better picture quality at the decoder.
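The feedback loop can be sketched as follows (a toy controller with made-up thresholds, not the H.263 test-model rate control; the 1-31 range matches H.263's QUANT parameter):

```python
def rate_control(measured_bits, target_bps, frame_rate, q_init=8):
    """Nudge the quantizer step after each frame: too many bits ->
    coarser quantization (fewer bits, worse picture); too few bits ->
    finer quantization (more bits, better picture)."""
    budget = target_bps / frame_rate  # bits available per frame
    q = q_init
    qs = []
    for bits in measured_bits:
        if bits > 1.2 * budget:
            q = min(31, q + 2)        # over budget: quantize harder
        elif bits < 0.8 * budget:
            q = max(1, q - 1)         # under budget: spend the slack
        qs.append(q)
    return qs

# A 64 kbit/s channel at 10 frames/s gives a 6400-bit budget per frame.
qs = rate_control([5000, 9000, 12000, 6500, 3000], 64000, 10)
```

The quantizer rises through the complex frames (9000 and 12000 bits) and relaxes again once the output drops back under budget.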

Synchronization

The encoder and decoder must remain synchronized, especially when the video signal is accompanied by an audio signal. The H.263 stream contains a number of "headers" or markers: these are special codes that indicate to the decoder the position of the current frame. If the decoder loses synchronization, it scans forward to the next marker to resynchronize and resume decoding. Note that loss of synchronization can cause serious degradation of decoded quality, so great care is needed when designing a video coding system for a transmission environment full of noise.
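The resynchronization scan can be illustrated as follows (a sketch using a string of '0'/'1' characters for clarity; H.263 picture and GOB start codes begin with sixteen zero bits followed by a one, a pattern the variable-length code tables are designed to avoid in coded data):

```python
def find_start_codes(bits):
    """Return the positions of all start-code prefixes in a bitstring.

    A decoder that has lost synchronization scans forward for the next
    such marker and resumes decoding from there.
    """
    marker = "0" * 16 + "1"
    positions = []
    i = bits.find(marker)
    while i != -1:
        positions.append(i)
        i = bits.find(marker, i + 1)
    return positions

# Some coded data, a start code, more data, another start code.
stream = "1011" + "0" * 16 + "1" + "110101" + "0" * 16 + "1" + "01"
positions = find_start_codes(stream)
```

Everything between the point where sync was lost and the next marker is discarded, which is why a burst of channel errors typically costs at least part of a frame.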

Audio and multiplexing

The H.263 standard describes only video coding. In most practical systems, audio data must also be compressed, transmitted, and synchronized with the video signal. Synchronization, multiplexing, and protocol issues are addressed by "umbrella" standards such as H.320 (ISDN-based video conferencing), H.324 (POTS-based video telephony), and H.323 (LAN- or IP-based video conferencing). H.263 provides the video coding method for these standards. Audio coding is supported by a number of standards, including G.723.1. In addition, related standards cover multiplexing (H.223) and signalling (H.245).

5.2 Software implementation

Functions such as motion estimation, variable-length encoding/decoding, and the DCT require considerable processing power. However, recent processor developments have made it possible to handle H.263 video in real time on a Pentium-class processor.

A software implementation must be highly optimized to achieve useful video quality (for example, more than 10 frames per second at 352x288 pixels per frame). This involves a range of techniques, such as using fast algorithms for the computationally intensive functions, minimizing data movement and copy operations, and unrolling loops. In some cases, assembly code can speed things up further (for example, by using the Intel MMX instruction set).

5.3 Hardware implementation

For high-resolution video, or when a sufficiently powerful processor is not available, a hardware implementation is the solution. A typical codec handles the computationally intensive parts (such as motion estimation/compensation, the DCT, the quantizer, and entropy coding) in dedicated logic, with a control module to sequence events and manage the encoding and decoding parameters. A programmable controller is preferable, because many coding parameters (such as the rate control algorithm) may need to be modified or tuned to suit different environments. Recently, intellectual property (IP) core implementations of H.263 have appeared. A logic core is a VHDL or Verilog design that can be combined with other functional blocks to become part of an ASIC or FPGA.

6. References

1. ITU-T Recommendation H.263, "Video Coding for Low Bit Rate Communication"

2. Riley and Richardson, "Digital Video Communications", Artech House, 1997 (available from http://www.artech-house.com)

3. http://www.4i2i.com/ - H.263 software and hardware implementations

7. H.263 software and hardware implementations

For practical H.263 hardware and software implementations, see: http://www.4i2i.com/products.htm

