BMP, GIF and JPEG File Formats and Conversion
Author: Jiang Ping
Published: 2000/11/27
Abstract:
This paper mainly discusses compression methods for bitmap images.
Text:
Introduction to the BMP, GIF and JPEG file formats and their mutual conversion

An image file is a computer disk file that describes an image. Once an image has been digitized, it can be stored in the computer in one of two ways: as a bitmap or as vector data. We mainly discuss bitmaps here. Different image software processes images in different ways, and image formats vary accordingly, but a file mainly consists of a file identification header and the image data. The identification header lets the computer determine which format the file is in; the image data contains everything needed to depict the image, including the palette, the bitmap itself, and so on. Image formats also differ according to the compression algorithm they use, so the main compression algorithms are briefly introduced below.

First, run-length encoding (RLE). The principle is to replace a run of consecutive pixels that share the same color value within a scan line with a count and that color value. For example, AAABCCCCCCDEEE can be replaced with 3A1B6C1D3E. RLE is very effective for images with large areas of uniform color. Many specific run-length methods are derived from the RLE principle:

1. PCX run-length compression: this algorithm converts the bitmap format into a compressed format byte by byte. For a byte CH that occurs once, output 0xC1 followed by CH if CH >= 0xC0, otherwise output CH directly. For a byte CH repeated N times, output the two bytes 0xC0 + N and CH. N can therefore be at most 0xFF - 0xC0 = 0x3F (decimal 63); a run longer than 63 has to be split into several such pairs.

2. BI_RLE8 compression: this method is used in Windows bitmap files. It also works on pairs of bytes: the first byte gives the number of consecutive pixels to draw in the color specified by the second byte. For example, the code 05 04 means that 5 pixels with color value 04 are drawn consecutively starting from the current position.
When the first byte is zero, the second byte has a special meaning: 0 indicates the end of the line; 1 indicates the end of the bitmap; 2 indicates that the following two bytes give the horizontal and vertical displacement of the next pixel relative to the current position. This method compresses images with 8 bits per pixel (256 colors).

3. BI_RLE4 compression: this method is also used in Windows bitmap files. It is similar to BI_RLE8 encoding; the only difference is that each byte in BI_RLE4 holds two pixels, so it can only compress images with no more than 16 colors. Its range of application is therefore limited.

4. PackBits compression: this is the bitmap data compression method of Apple's Macintosh, and it is also used in the TIFF specification. It is similar to BI_RLE8. A count byte n from 0x00 to 0x7F means that n + 1 literal bytes follow, and a count byte n from 0x81 to 0xFF means that the next byte is repeated 257 - n times. For example, 1C 1C 1C 21 32 32 56 48 can be encoded as FE 1C 00 21 FF 32 01 56 48. Short runs like these gain nothing; the best case is a run of 128 identical bytes, which is compressed to two bytes (the count byte 0x81 and the value). Where such runs occur, the method is very effective.

Second, Huffman coding. This is also a commonly used compression method. It was devised in 1952 for text files. The basic principle is that frequently occurring data is replaced with shorter codes and rarely occurring data with longer codes, with every value receiving a different code. The codes are binary and of variable length. For example, if the raw data sequence ABACCDAA is encoded with A (0), B (10), C (110), D (111), the result after compression is 010011011011100.
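The table lookup in the ABACCDAA example can be sketched in C as follows. This is only an illustration under stated assumptions: the helper names `code_for` and `encode_with_table` are hypothetical, the code table is hard-wired to the four symbols of the example, and the output is written as a string of '0' and '1' characters rather than packed bits so that it can be compared with the text.

```c
#include <stddef.h>
#include <string.h>

/* Code table from the example above: A=0, B=10, C=110, D=111. */
static const char *code_for(char symbol)
{
    switch (symbol) {
    case 'A': return "0";
    case 'B': return "10";
    case 'C': return "110";
    case 'D': return "111";
    default:  return NULL;   /* symbol is not in the table */
    }
}

/*
 * Hypothetical helper: appends the code of every symbol in `input` to
 * `out` as '0'/'1' characters. Returns 0 on success, or -1 if a symbol
 * is unknown or `out` is too small.
 */
int encode_with_table(const char *input, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (; *input != '\0'; input++) {
        const char *bits = code_for(*input);
        size_t n;

        if (bits == NULL)
            return -1;
        n = strlen(bits);
        if (used + n + 1 > out_size)
            return -1;
        memcpy(out + used, bits, n + 1);  /* also copies the terminating NUL */
        used += n;
    }
    return 0;
}
```

Because no code in the table is a prefix of another, a decoder can read the bit string from left to right and recover the symbols without any separators.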
Generating Huffman codes requires two passes over the original data: the first pass counts exactly how often each value occurs in the raw data, and the second pass builds the Huffman tree and encodes. Because a binary tree has to be built and then traversed to generate the codes, compression and decompression are relatively slow, but the method is simple and effective and is therefore widely used.
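The two passes can be sketched in C as below. This is a minimal sketch, not an optimized implementation: the function name `huffman_code_lengths` is hypothetical, the two lightest nodes are found by a plain linear search instead of a priority queue, and the second pass stops at deriving each symbol's code length (its depth in the tree) rather than emitting the actual bit patterns.

```c
#include <stddef.h>

#define NSYM 256  /* one leaf per possible byte value */

/*
 * Computes the Huffman code length in bits of every byte value that
 * occurs in data[0..len-1]; lengths[s] is 0 for unused values.
 * Returns the total size of the encoded data in bits.
 */
unsigned long huffman_code_lengths(const unsigned char *data, size_t len,
                                   unsigned char lengths[NSYM])
{
    unsigned long weight[2 * NSYM] = {0};  /* leaves first, then internal nodes */
    int parent[2 * NSYM];
    int alive[2 * NSYM] = {0};             /* nodes not yet merged */
    int nodes = NSYM;
    int remaining = 0;
    unsigned long total_bits = 0;

    for (int i = 0; i < 2 * NSYM; i++)
        parent[i] = -1;

    /* Pass 1: count how often each value occurs. */
    for (size_t k = 0; k < len; k++)
        weight[data[k]]++;
    for (int i = 0; i < NSYM; i++) {
        if (weight[i] > 0) {
            alive[i] = 1;
            remaining++;
        }
    }

    /* Pass 2: build the tree by repeatedly merging the two lightest nodes. */
    while (remaining > 1) {
        int a = -1, b = -1;  /* lightest and second-lightest live node */
        for (int i = 0; i < nodes; i++) {
            if (!alive[i])
                continue;
            if (a < 0 || weight[i] < weight[a]) {
                b = a;
                a = i;
            } else if (b < 0 || weight[i] < weight[b]) {
                b = i;
            }
        }
        weight[nodes] = weight[a] + weight[b];
        parent[a] = parent[b] = nodes;
        alive[a] = alive[b] = 0;
        alive[nodes] = 1;
        nodes++;
        remaining--;
    }

    /* The code length of a symbol is the depth of its leaf. */
    for (int i = 0; i < NSYM; i++) {
        unsigned char depth = 0;
        for (int n = i; parent[n] >= 0; n = parent[n])
            depth++;
        if (weight[i] > 0 && depth == 0)
            depth = 1;  /* only one distinct symbol in the input */
        lengths[i] = depth;
        total_bits += weight[i] * depth;
    }
    return total_bits;
}
```

For the sequence ABACCDAA this yields lengths of 1, 3, 2 and 3 bits for A, B, C and D, a total of 14 bits. That is one bit fewer than the illustrative table A (0), B (10), C (110), D (111), which gives B the shorter code even though C occurs more often.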
Third, LZW compression. LZW is more complicated than most other compression techniques, and its compression efficiency is also higher. The basic principle is to assign a code value to each string the first time it appears and then to substitute that value for the string wherever it occurs again. For example, if the value 0x100 stands for the string "abccddeee", then 0x100 is emitted whenever that string is encountered. The correspondence between 0x100 and the string is generated dynamically during compression and is implicit in the compressed data: during decompression the table is gradually recovered from the compressed data, and later data produce further entries based on the entries generated by earlier data, until the compressed file ends. LZW is reversible: all information is preserved.

Fourth, arithmetic compression. This method is similar to Huffman coding but more effective. Arithmetic compression is suitable for files consisting of the same repeating sequences, and it approaches the theoretical limit of compression. The method maps different sequences to regions between 0 and 1, each region being represented by a binary fraction of variable precision (number of bits); the less common the data, the higher the precision (the more bits) it requires. The method is more complicated and is less commonly used.

Fifth, JPEG (Joint Photographic Experts Group). The JPEG standard differs from the other methods in that it defines several mutually incompatible encoding modes. In its most common mode it is lossy: an image recovered from a JPEG file always differs from the original image, although the difference is often hard to see. Another significant feature of JPEG is its relatively high compression ratio: the compressed image can be anywhere from about 1% to 80-90% of the size of the original image.
JPEG is also well suited to multimedia systems.

Having introduced the compression algorithms, let us briefly introduce the three bitmap formats and the mutual conversion between them.

1. BMP image file format. A BMP file consists of:
· the bitmap file header (BITMAPFILEHEADER) data structure
· the bitmap information (BITMAPINFO) data structure
· the bitmap array

1) The bitmap file header contains the type of the BMP image file, the location of the display content and other information. (Here int is 16 bits and long is 32 bits, as in 16-bit Windows compilers.)

    typedef struct {
        int  bfType;       /* must be "BM" */
        long bfSize;       /* size of the bitmap file in bytes */
        int  bfReserved1;  /* must be 0 */
        int  bfReserved2;  /* must be 0 */
        long bfOffBits;    /* starting position of the bitmap array */
    } BITMAPFILEHEADER;

2) The bitmap information data structure is made up of two data structures, BITMAPINFOHEADER and RGBQUAD:

    typedef struct {
        BITMAPINFOHEADER bmiHeader;
        RGBQUAD          bmiColors[];
    } BITMAPINFO;

The BITMAPINFOHEADER structure contains the width, height, compression method and other information about the BMP image. The RGBQUAD structure defines one color.

3) The bitmap array records every pixel value of the image. The image is scanned line by line starting from its lower left corner: from left to right and from bottom to top, the pixel values are recorded one by one, and the bytes that hold these pixel values make up the bitmap array. Bitmap array data can be stored in an uncompressed or a compressed format.
    1. Uncompressed format: each pixel value corresponds to a number of bits in the bitmap array, and that number is determined by the number of colors in the image.
    2. Compressed format: in BMP files, Windows supports two compression types, BI_RLE8 and BI_RLE4.
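As an illustration of the BMP layout described above, the following C sketch reads the 14-byte file header from a memory buffer and locates rows in an uncompressed bitmap array. Several things here are assumptions and do not come from the article: the names `BmpFileHeader`, `bmp_parse_file_header`, `bmp_row_stride` and `bmp_row_offset` are hypothetical, fixed-width integer types replace the 16-bit int and 32-bit long of the original declaration, and the fields are read byte by byte as little-endian values. The stride calculation relies on the BMP rule, not stated in the text, that every stored row is padded to a multiple of 4 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Field names follow BITMAPFILEHEADER; the struct itself is illustrative. */
typedef struct {
    uint16_t bfType;       /* 0x4D42, i.e. the bytes 'B', 'M' */
    uint32_t bfSize;       /* size of the file in bytes */
    uint16_t bfReserved1;
    uint16_t bfReserved2;
    uint32_t bfOffBits;    /* offset of the bitmap array */
} BmpFileHeader;

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parses the 14-byte file header. Returns 0 on success, -1 on failure. */
int bmp_parse_file_header(const uint8_t *buf, size_t len, BmpFileHeader *out)
{
    if (len < 14 || buf[0] != 'B' || buf[1] != 'M')
        return -1;
    out->bfType      = read_le16(buf);
    out->bfSize      = read_le32(buf + 2);
    out->bfReserved1 = read_le16(buf + 6);
    out->bfReserved2 = read_le16(buf + 8);
    out->bfOffBits   = read_le32(buf + 10);
    return 0;
}

/* Bytes per stored row in the uncompressed format (padded to 4 bytes). */
uint32_t bmp_row_stride(uint32_t width, uint32_t bits_per_pixel)
{
    return ((width * bits_per_pixel + 31) / 32) * 4;
}

/*
 * Offset inside the bitmap array of the row shown at screen line y
 * (0 = top), for an image stored from the bottom row upwards.
 */
uint32_t bmp_row_offset(uint32_t height, uint32_t stride, uint32_t y)
{
    return (height - 1 - y) * stride;
}
```

For a 256-color image that is 5 pixels wide, `bmp_row_stride(5, 8)` gives 8 bytes per row, and in a 4-row image the top screen line starts at offset 24 of the bitmap array because the bottom line is stored first.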
2. GIF image file format. GIF stands for Graphics Interchange Format. GIF is a public image file format standard, although its copyright is held by CompuServe.

A GIF file begins with a file header: the first thing encountered in a GIF file is the GIF signature, which tells the decoder that this is a GIF file. The signature is a 3-byte string: GIF. Several images can be stored in one GIF file, but most files contain only one image. Next, the screen descriptor describes the display resolution used to generate the image, giving the width and the height of the screen. The byte that follows is a global flag: its low three bits indicate how many colors the color table that follows contains, and its highest bit indicates whether a global color table is present. The background color byte sets the background to the appropriate color; it is actually an index into the global color table.

    struct global_data {
        unsigned short screen_width;
        unsigned short screen_height;
        unsigned char  global_flag;   /* the global flag described above */
        unsigned char  background;
        char           tail;          /* always '\0' */
    };

Next comes the global color table, which stores all the colors. Each entry in the color table is 3 bytes, giving the intensities of the three primary colors red, green and blue respectively. Its length is given by the low three bits of the global flag (a value of n means 2^(n+1) entries).

The data that follow are local: they are a collection of data blocks. The header of an image data block has the following structure.

    struct Local_Head {
        char           heading;       /* always ',' */
        unsigned short image_left;
        unsigned short image_top;     /* start position of the image on the screen */
        unsigned short image_width;
        unsigned short image_height;
        unsigned char  local_flag;    /* local flag */
    };

The local flag differs from the global flag in its second-highest bit: if this bit is set to 1, the bitmap data of the image are stored interlaced.
That is, in interlaced bitmap data, the first stored row corresponds to the 1st line on the screen, the second stored row to the 9th line, the third to the 17th line, and so on in increments of 8; this is the first pass. The second pass starts from the 5th line on the screen, again in increments of 8; the third pass starts from the 3rd line in increments of 4; the fourth and last pass starts from the 2nd line in increments of 2.

[Figure: correspondence between interlaced and sequential (non-interlaced) storage of the image rows]

An interlaced GIF image can be displayed in four passes as it is decoded. After the first pass only one eighth of the image has been displayed, and after the second pass only one quarter, but this is already enough to show an overview of the whole image. When GIF images are displayed, an interlaced image gives the impression of appearing faster than other images, which is the advantage of interlacing.

GIF encoding and decoding use the LZW compression algorithm: encoding converts the stream of characters into a code stream of another form, and decoding restores this code stream to the original character stream.

3. JPEG image file format. JPEG is the acronym of the Joint Photographic Experts Group. The main role of JPEG is to serve as a standard coding technique for digitized images. A JPEG image file is also a pixel-based file format, but it is more complicated than image files such as GIF and BMP.
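Returning to the interlaced GIF storage order described above, the four passes can be sketched in C as follows. The function name `gif_interlace_order` is hypothetical, and rows are numbered from 0 here, so the starting lines 1, 5, 3 and 2 of the text become 0, 4, 2 and 1.

```c
#include <stddef.h>

/*
 * Fills screen_row[i] with the 0-based screen line on which the i-th
 * stored row of an interlaced GIF image is displayed. The caller must
 * provide room for `height` entries.
 */
void gif_interlace_order(size_t height, size_t *screen_row)
{
    static const size_t start[4] = {0, 4, 2, 1};  /* first line of each pass */
    static const size_t step[4]  = {8, 8, 4, 2};  /* line increment of each pass */
    size_t stored = 0;

    for (int pass = 0; pass < 4; pass++) {
        for (size_t y = start[pass]; y < height; y += step[pass])
            screen_row[stored++] = y;
    }
}
```

For an image 10 lines high the stored rows map to screen lines 0, 8, 4, 2, 6, 1, 3, 5, 7, 9. A decoder that paints rows as they arrive can therefore show a coarse version of the whole picture after the first pass and refine it in the later passes.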