Performance and technical characteristics of the next-generation video compression standard

xiaoxiao2021-03-06  57

Keywords: H.264, gain, coding

With economic development and technological progress, market demand for high-performance video services keeps growing. The existing video compression standards can no longer meet these requirements, so new video compression technology has a broad market. H.264/AVC is a new-generation video compression standard for low-bit-rate transmission, developed jointly by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG). In March 2003 the Joint Video Team (JVT) formed by the two groups announced the final draft of the standard, known as ITU-T Recommendation H.264 or ISO/IEC MPEG-4 Part 10, Advanced Video Coding. Based on the content of the standard and on simulation results, this article analyzes its main technical features, examines the basic algorithms, and gives performance comparisons for the relevant parts.

I. Main technical characteristics of H.264

The codec framework of H.264 does not differ significantly from that of earlier standards such as H.261, H.263 and MPEG-1/2/4. It is still based on hybrid coding: motion vectors represent the moving content of each frame of the sequence, previously decoded frames are used for motion estimation and compensation (or intra prediction is used instead), and the resulting residual is transformed, quantized and entropy coded. The performance gain of the new standard therefore comes from improved techniques in each of these stages and from the application of new algorithms.

The new standard puts considerable work into improving the error resilience of image transmission and redefines how the picture is partitioned. During encoding, each picture is divided into several slices; each slice can be decoded independently and is not affected by the other slices.
A slice consists of macroblocks, the most basic structure; each macroblock contains one 16 × 16 luma block and two 8 × 8 chroma blocks. To further improve robustness, the system is split into a video coding layer (VCL) and a network abstraction layer (NAL). The video coding layer describes the video content carried by the data to be transmitted. The network abstraction layer takes the different applications into account, such as video transport, transport over H.32X systems, or communication over RTP/UDP/IP.

The H.264 standard defines three profiles: Baseline, Main and X, each representing a set of algorithms and techniques for a different class of applications. The Baseline profile mainly contains low-complexity, low-delay features. It targets interactive applications, takes error resilience in harsh environments into account, and its content is for the most part included in the higher profiles. The Main profile targets applications that need higher coding efficiency, such as video broadcasting. The X profile is designed mainly for streaming applications; it includes all the error-resilience techniques as well as techniques for flexible access to and switching between bitstreams.

1. Baseline

A Baseline decoder handles only I slices and P slices. For inter prediction, the new standard predicts the motion content of the picture more accurately than earlier standards by allowing a macroblock to be partitioned into 16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8 and 4 × 4 sub-blocks. Motion estimation is accurate to quarter-sample positions, which are interpolated with a 6-tap filter. Motion vectors are predicted from adjacent blocks, and only the difference from the prediction is coded. H.264 also supports multiple reference frames: up to 15 reference frames can be used in motion estimation. Using multiple reference frames greatly improves the error resilience of image transmission and suppresses the spatial and temporal propagation of errors.
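The sub-sample interpolation mentioned above can be illustrated with a small one-dimensional sketch. It assumes the well-known H.264 luma half-sample taps (1, -5, 20, 20, -5, 1) with rounding and a divide by 32, and forms a quarter-sample value by averaging an integer sample with the neighbouring half-sample. The function names and the edge handling (replicating the border sample) are choices made for this illustration only; a real codec works on two-dimensional pictures.

```python
# 1-D sketch of H.264-style sub-sample interpolation (illustrative only).
TAPS = (1, -5, 20, 20, -5, 1)  # luma half-sample filter taps, sum = 32


def clip8(value):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, value))


def half_pel(row, i):
    """Half-sample value between row[i] and row[i + 1].

    Samples outside the row are replaced by the nearest border sample
    (an assumption of this sketch).
    """
    acc = 0
    for k, tap in enumerate(TAPS):
        j = min(max(i - 2 + k, 0), len(row) - 1)
        acc += tap * row[j]
    return clip8((acc + 16) >> 5)  # round, then divide by 32


def quarter_pel(row, i):
    """Quarter-sample value between row[i] and the half-sample to its right."""
    return (row[i] + half_pel(row, i) + 1) >> 1


if __name__ == "__main__":
    ramp = [0, 10, 20, 30, 40, 50, 60, 70]
    print(half_pel(ramp, 3), quarter_pel(ramp, 3))  # 35 33
```

On a linear ramp the half-sample lands midway between its neighbours (35 between 30 and 40), which is what a motion search at fractional positions relies on.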

For all slice types, H.264 supports two kinds of intra coding: a 4 × 4 mode and a 16 × 16 mode. In 4 × 4 mode, each 4 × 4 luma block has eight directional prediction modes plus a DC prediction mode; in 16 × 16 mode, there are four intra prediction modes for each 16 × 16 luma block. The 8 × 8 chroma samples of a macroblock use almost the same prediction modes as the 16 × 16 luma mode. To keep slices independently decodable, intra prediction is not allowed to cross slice boundaries.

For transform and quantization, H.264 differs from earlier standards in the transform it applies to the prediction residual: it uses a simple integer transform. This transform has almost the same properties as the DCT but offers several advantages: its core computation uses only additions, subtractions and shifts, and it avoids loss of precision. The transform coefficients are quantized with a 52-level scalar quantizer, whereas H.263 has only 31 levels. The quantization step size increases by about 12.5% from one level to the next; this wider and finer range of step sizes lets the encoder control the trade-off between bit rate and image quality more flexibly and accurately.

For entropy coding of the quantized transform coefficients, context-adaptive variable-length coding (CAVLC) selects the variable-length code table to use according to the values of previously coded coefficients. Because each code table is designed for the corresponding statistics, performance is better than with a single variable-length code table. Other data, such as header information, is coded with a single variable-length code table (Exp-Golomb code).

The new standard still uses block-based prediction and reconstruction. To remove the blocking artifacts that degrade subjective image quality, H.264 uses a deblocking filter.
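To make the claim about the integer transform concrete, the sketch below applies the widely documented 4 × 4 forward core transform of H.264, whose rows are (1, 1, 1, 1), (2, 1, -1, -2), (1, -1, -1, 1) and (1, -2, 2, -1), to a residual block using only additions, subtractions and left shifts. The function names are hypothetical, and the scaling that the standard folds into quantization is left out, so the outputs are unscaled core coefficients.

```python
# Sketch of the 4x4 forward core transform using add/subtract/shift only.

def core_1d(x0, x1, x2, x3):
    """One-dimensional butterfly equivalent to multiplying by the core matrix."""
    s03, s12 = x0 + x3, x1 + x2
    d03, d12 = x0 - x3, x1 - x2
    return [s03 + s12,          # row (1,  1,  1,  1)
            (d03 << 1) + d12,   # row (2,  1, -1, -2)
            s03 - s12,          # row (1, -1, -1,  1)
            d03 - (d12 << 1)]   # row (1, -2,  2, -1)


def forward_core_4x4(block):
    """Apply the 1-D butterfly to every row, then to every column."""
    rows = [core_1d(*r) for r in block]
    cols = [core_1d(*[rows[r][c] for r in range(4)]) for c in range(4)]
    # cols[c][k] holds coefficient (k, c); transpose back to row-major order.
    return [[cols[c][k] for c in range(4)] for k in range(4)]


if __name__ == "__main__":
    flat = [[10] * 4 for _ in range(4)]
    print(forward_core_4x4(flat)[0][0])  # 160: all energy in the DC coefficient
```

Because every step is an exact integer operation, an encoder and a decoder built this way compute identical values, which is the precision advantage described above.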
The main idea of the deblocking filter is this: when the difference between the samples on the two sides of a block boundary is small, the filter smooths it out; when the difference is pronounced, it is treated as a genuine image feature and no filtering is applied. This weakens blocking artifacts without filtering away genuine image features, and at the same subjective quality it reduces the bit rate by 5 to 10%.

For the organization and transmission of image data, the macroblocks of a picture can be assigned to multiple slice groups (flexible macroblock ordering, FMO). The slices are independent of one another and can be sent to the decoder in any order (arbitrary slice order, ASO). Within the bitstream, slices can also be transmitted redundantly (redundant slices, RS), which allows recovery when slice data is corrupted and makes image transmission more robust. At the same time, the independence of the slices limits the spatial propagation of errors and improves the error resilience of the bitstream.

2. Main profile

The Main profile builds on the algorithms of the Baseline profile and adds further technical features, but it does not support FMO, ASO or RS; it supports only I, P and B slices. This profile introduces the concept of adaptive block-size transforms (ABT). The concept applies to inter coding, and its main idea is to tie the block size used for transform coding of the prediction residual to the block size used for motion compensation. This allows the transform to exploit the longest possible signal length. Because of complexity, however, the maximum transform block size is limited to 8 × 8. Entropy coding is made more efficient by context-based adaptive binary arithmetic coding (CABAC), which further improves entropy coding performance. Compared with CAVLC, it reduces the bit rate of coded TV signals by 10 to 15% at the same image quality. In addition, the Main profile does not support division into multiple slice groups.
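The deblocking decision described at the start of this passage can be sketched in a few lines. This is a deliberately simplified illustration, not the normative filter: `alpha` and `beta` are hypothetical thresholds passed in by the caller (in a real codec they depend on the quantization parameter), and the smoothing step is a plain weighted average chosen for readability.

```python
# Simplified sketch of an edge-adaptive deblocking decision (not normative).

def filter_edge(p1, p0, q0, q1, alpha, beta):
    """Filter the two samples next to a block edge: ... p1 p0 | q0 q1 ...

    A small step across the edge with flat neighbourhoods on both sides is
    assumed to be a coding artifact and is smoothed; a large step is assumed
    to be real image content and is left untouched.
    """
    is_artifact = (abs(p0 - q0) < alpha
                   and abs(p1 - p0) < beta
                   and abs(q1 - q0) < beta)
    if not is_artifact:
        return p0, q0
    new_p0 = (2 * p1 + p0 + q1 + 2) >> 2
    new_q0 = (2 * q1 + q0 + p1 + 2) >> 2
    return new_p0, new_q0


if __name__ == "__main__":
    print(filter_edge(100, 100, 104, 104, alpha=10, beta=4))  # (101, 103)
    print(filter_edge(50, 50, 200, 200, alpha=10, beta=4))    # (50, 200)
```

The first call narrows a 4-level step to 2 levels, while the second leaves a 150-level step alone, mirroring the smooth-or-preserve behaviour described in the text.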
3. Related coding problems

How to choose among the available prediction modes and which motion estimation (ME) strategy to use have always been key research topics in implementing a video coder. In software implementations of the H.263 standard, mode selection is done by simple threshold comparison.

The test software of the new standard uses a Lagrangian rate-distortion optimization strategy, which evaluates the distortion and the bit rate produced by each block size and each prediction mode. Mode selection can thus achieve optimized rate-distortion performance, but at the expense of higher computational complexity. The optimization minimizes the following Lagrangian function:

$$J = \mathrm{SATD} + \lambda \cdot R$$

where R is the bit rate of each part, λ is the optimization parameter (strongly correlated with the quantization parameter), and SATD is the sum of the absolute values of the Hadamard-transformed 4 × 4 block differences. For all frames, the inter macroblock coding mode and the reference frame are selected by minimizing this Lagrangian function. As a rule, a video standard specifies only the decoding process; mode selection is an encoder-side technique, so it is not part of the standard.

II. Performance comparison of H.264 and other standards

To evaluate the coding efficiency of H.264, we compare it with other standards such as MPEG-2, H.263 and MPEG-4. The tests use image sequences in QCIF and CIF format, and all encoders use Lagrangian optimization. For H.264 we use the test software JM2.0 with the main technical features of the Main profile. The number of reference frames used for H.263 and H.264 is 5. Only the first frame of each coded sequence is an I frame, and two non-reference B frames are inserted between every two reference P frames. Motion estimation uses a full search over a 32 × 32 integer-sample range, and the bit rate is adjusted through a preset quantization parameter. The figure shows the comparison for the CIF sequence Tempete at a frame rate of 15 Hz. The bit-rate reduction of H.264 relative to the other standards is shown in Table 1.
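Returning to the mode decision of subsection 3, the cost J = SATD + λ · R can be sketched as follows. The sketch computes SATD with an unnormalized 4 × 4 Hadamard transform (reference encoders typically apply an extra scaling, which is omitted here) and picks the candidate with the smallest cost. The function names, the candidate tuple layout and the bit counts are all made up for this illustration.

```python
# Sketch of Lagrangian mode selection with a SATD distortion measure.

def hadamard_1d(a, b, c, d):
    """Unnormalized 4-point Hadamard butterfly."""
    s0, s1, d0, d1 = a + d, b + c, a - d, b - c
    return [s0 + s1, d0 + d1, s0 - s1, d0 - d1]


def satd_4x4(orig, pred):
    """Sum of absolute Hadamard-transformed differences of two 4x4 blocks."""
    diff = [[orig[r][c] - pred[r][c] for c in range(4)] for r in range(4)]
    rows = [hadamard_1d(*r) for r in diff]
    cols = [hadamard_1d(*[rows[r][c] for r in range(4)]) for c in range(4)]
    return sum(abs(v) for col in cols for v in col)


def best_mode(orig, candidates, lam):
    """Return (name, cost) minimizing J = SATD + lam * R.

    candidates: iterable of (name, predicted_block, estimated_bits).
    """
    costs = [(satd_4x4(orig, pred) + lam * bits, name)
             for name, pred, bits in candidates]
    cost, name = min(costs)
    return name, cost


if __name__ == "__main__":
    orig = [[10] * 4 for _ in range(4)]
    exact = ("exact", [[10] * 4 for _ in range(4)], 20)  # no error, many bits
    cheap = ("cheap", [[9] * 4 for _ in range(4)], 2)    # small error, few bits
    print(best_mode(orig, [exact, cheap], lam=1.0))  # ('cheap', 18.0)
    print(best_mode(orig, [exact, cheap], lam=0.5))  # ('exact', 10.0)
```

The two calls show the role of λ: a larger value penalizes bits more heavily and favours the cheaper mode, while a smaller value favours the mode with lower distortion.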
To further analyze the technical features of the new standard, we compare the intra coding scheme of H.264 with a still-image coding standard. Here we compare results from the H.264 test software JM3.9a, using the Main profile without ABT, against results from the JPEG 2000 test software VM 9.0. The intra coding performance of H.264 stands out: for most of the test sequences it outperforms JPEG 2000 at every bit rate tested. The likely reason is that H.264 uses a variety of well-designed intra prediction modes. JPEG 2000, on the other hand, uses wavelet transforms, scalable quantization and arithmetic coding, and does not use prediction. At high bit rates the subjective quality of the two is not much different. At low bit rates, however, the JPEG 2000 image looks more blurred and image contours show noticeable ringing artifacts, a result of the loss of high-frequency components in the wavelet transform. The test results are shown in Table 2.

The test results show that the gain from H.264's intra coding shrinks for large images but remains high at low bit rates. Across all image sequences and bit rates, H.264 has a 1.12 dB advantage over JPEG 2000.

III. Outlook for the H.264 standard

Many people expected traditional block-based video coding to be abandoned, but H.264 has shown once again that it still has a great deal of untapped potential for low-bit-rate video compression. The new standard adapts further to the characteristics of the video source, but this adaptability comes at the cost of higher algorithmic complexity and more storage for reference frames. The H.264 standard targets not only video conferencing systems but also TV broadcasting, network streaming media, multimedia messaging, digital storage, digital cinema and other applications. In summary, thanks to its advanced compression technology, H.264 offers excellent performance for real-time video processing, will set off another wave of development in video transmission technologies, and will create huge business opportunities. (Wu Peisen, member of Hui Li Technology Co., Ltd., Hebei Province, Huili Technology Co., Ltd., from "World Broadband Network")

When reposting, please cite the original address: https://www.9cbs.com/read-82680.html
