Image format introduction
Kerberos
Software Engineer, Blue Point Software Beijing R & D Department February 2001
Many applications on the Internet need to handle image information. This information is usually saved in a specific format; common formats include GIF, JPEG, and PNG. The peculiarities of the various image file formats greatly increase the difficulty of programming. To help developers understand and process image information, this article describes the common image formats and related concepts, and introduces how to use the existing image processing libraries on Linux. The article is divided into two parts. The first part introduces the three image formats most commonly found on the Internet: GIF, JPEG, and PNG. The second part describes how to process these images in programs.
1 Introduction
With the development of the Internet, information exchange has gradually evolved from plain text to the exchange of multimedia information: text, images, sound, and video. It is no exaggeration to say that, among all this information, images are second in importance only to text. Because images are large and network bandwidth is limited, techniques for transmitting images have developed quickly. Today, three image formats are popular on the network: JPEG, PNG, and GIF.
The simplest and most common image file format is the bitmap. A bitmap file records the width, height, and color depth of the image, together with the color of each pixel expressed in RGB. Because the bitmap is such a basic image format, many platforms, including Windows, X Window, and Mac OS, provide a large number of functions for processing bitmaps. Since a bitmap describes the color of every pixel in RGB mode, bitmap files are very large, and transferring them over a network with limited bandwidth consumes a lot of network resources. Of course, you may ask: why not compress the bitmap before transferring it? Compression can indeed reduce the size of an image, and all of the formats introduced in this article use it. However, describing every pixel directly by its RGB color limits how far the file can shrink, so the formats below combine compression with color representations that are better suited to the task than plain RGB.
A very simple way to represent color is to use a color index. With a color index, the image file provides one or more palettes, and each palette records, in RGB mode, all of the colors used in the image. The color of each pixel is then represented by its index into the palette. This greatly reduces the size of the image file, and applying compression on top of it reduces the size further.
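The color index idea can be illustrated with a minimal Python sketch. The function name `to_indexed` and the 4x4 sample image are hypothetical and chosen only for illustration; real formats also pack indices into fewer than 8 bits when the palette is small.

```python
def to_indexed(pixels):
    """Convert a list of (r, g, b) tuples into (palette, indices)."""
    palette = []   # unique colors, each stored once in RGB
    lookup = {}    # color -> position in the palette
    indices = []   # one palette position per pixel
    for color in pixels:
        if color not in lookup:
            lookup[color] = len(palette)
            palette.append(color)
        indices.append(lookup[color])
    return palette, indices


# A hypothetical 4x4 image that uses only three distinct colors.
RED, WHITE, BLUE = (255, 0, 0), (255, 255, 255), (0, 0, 255)
pixels = [RED, RED, WHITE, WHITE,
          RED, RED, WHITE, WHITE,
          BLUE, BLUE, WHITE, WHITE,
          BLUE, BLUE, WHITE, WHITE]

palette, indices = to_indexed(pixels)

rgb_size = len(pixels) * 3                      # 3 bytes per pixel
indexed_size = len(palette) * 3 + len(indices)  # palette + 1 byte per pixel

print(palette)
print(indices)
print(rgb_size, indexed_size)  # 48 25
```

Even for this tiny image the indexed form needs 25 bytes instead of 48, and the saving grows with the number of pixels because the palette is stored only once.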
The compression used for images is either lossy or lossless. Lossy compression achieves a relatively high compression ratio, but the image is distorted, because the true color of a pixel is replaced by a similar color in order to save space. JPEG is a typical lossy format, with a compression ratio of about 20:1. Lossless compression only compresses the data that stores the pixel colors, so a decoder can restore that data exactly. Algorithms of this kind are used in image formats such as GIF and PNG.
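The difference between the two kinds of compression can be shown with a small Python sketch. It uses zlib as a stand-in lossless compressor and a simple value quantization as a stand-in lossy step; both are assumptions made for illustration and are not the actual algorithms of GIF or JPEG. The sample data and the helper `quantize` are hypothetical.

```python
import zlib

# Hypothetical row of 8-bit gray values with small variations.
row = bytes([200, 201, 199, 200, 202, 200, 198, 201] * 32)

# Lossless: decompression restores every byte exactly.
packed = zlib.compress(row)
restored = zlib.decompress(packed)


def quantize(data, step=16):
    """Lossy step: replace each value with a nearby multiple of `step`."""
    return bytes((v // step) * step for v in data)


# Lossy: similar values collapse into one, so the data compresses
# at least as well, but the original values cannot be recovered.
lossy_row = quantize(row)
lossy_packed = zlib.compress(lossy_row)

print(restored == row)                 # True
print(lossy_row == row)                # False
print(len(row), len(packed), len(lossy_packed))
```

The lossless round trip returns the original bytes, while the quantized row has permanently lost the small variations between neighboring values.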
2 Introduction to Image Formats
2.1 GIF
The GIF image format was developed by CompuServe. Its full name is 'Graphics Interchange Format' and, as the name suggests, it was designed for transmitting images over a network. It has two versions: GIF87a and GIF89a. In the GIF image format, all data is transmitted as a stream, and GIF defines a number of separators that divide the data stream into data blocks. GIF has the following data blocks:
GIF file signature block: the GIF image identifier, including the GIF signature and the version of the GIF file format.
Logical screen descriptor block: describes the logical screen, including its width and height in logical pixels.
Global palette data block: contains a global palette described in RGB mode.
Local palette data block: contains a local palette, described in RGB mode, that applies to a single frame.
Image descriptor block: the position, width, and height of this frame within the logical screen, in logical pixels, and whether the frame is stored in sequential or interlaced scan order.
Raster data block: contains the actual image data, that is, the color index of each pixel in the palette.
Trailer block: marks the end of the GIF data stream.
The GIF89a version adds the following data blocks:
Graphic control extension block: control information for each frame of the GIF image, such as its display delay and transparent color.
Comment data block: contains comments.
Plain text data block: contains text to be displayed in the image.
Application extension data block: contains data defined by the application.
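As a hedged illustration of the first two data blocks, the following Python sketch reads the signature and the logical screen descriptor from the first 13 bytes of a GIF data stream. The function name `parse_gif_header` and the sample bytes are hypothetical; the field layout (little-endian width and height, then a packed flags byte, a background color index, and a pixel aspect ratio) follows the GIF specification.

```python
import struct


def parse_gif_header(data):
    """Parse the signature and logical screen descriptor of a GIF stream."""
    signature, version = data[:3], data[3:6]
    if signature != b"GIF" or version not in (b"87a", b"89a"):
        raise ValueError("not a GIF data stream")
    # Width and height are 16-bit little-endian values.
    width, height, packed, background, aspect = struct.unpack(
        "<HHBBB", data[6:13])
    has_global_palette = bool(packed & 0x80)
    # The low 3 bits N give a palette of 2 ** (N + 1) entries.
    palette_entries = 2 ** ((packed & 0x07) + 1) if has_global_palette else 0
    return {
        "version": version.decode("ascii"),
        "width": width,
        "height": height,
        "has_global_palette": has_global_palette,
        "palette_entries": palette_entries,
        "background_index": background,
    }


# Hypothetical header: a 320x200 GIF89a image with a 256-color global palette.
sample = b"GIF89a" + struct.pack("<HHBBB", 320, 200, 0xF7, 0, 0)
info = parse_gif_header(sample)
print(info)
```

If the global palette flag is set, the palette itself follows immediately after these 13 bytes, three bytes (R, G, B) per entry.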
A GIF image supports at most 256 colors, and the raster data in the raster data block is compressed with the LZW compression algorithm. The image descriptor also records whether the image is stored in interlaced scan order. With interlacing, a rough version of the whole image can be displayed before the transmission completes, which helps when the network is slow.
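The interlaced scan order can be sketched in a few lines of Python. GIF stores an interlaced image in four passes: every 8th row starting at row 0, every 8th row starting at row 4, every 4th row starting at row 2, and every 2nd row starting at row 1. The function name `interlaced_row_order` is hypothetical.

```python
def interlaced_row_order(height):
    """Return the order in which rows of an interlaced GIF are stored."""
    order = []
    # (first row, step) for each of the four passes
    for start, step in ((0, 8), (4, 8), (2, 4), (1, 2)):
        order.extend(range(start, height, step))
    return order


print(interlaced_row_order(10))  # [0, 8, 4, 2, 6, 1, 3, 5, 7, 9]
```

After the first pass a viewer already has one row in eight spread over the full height of the image, which is why a coarse preview can appear early.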
To process GIF images, we need a decoder that handles the different data blocks and converts them into a format that is easy for the application to use. Writing such a decoder is complicated, but Linux already has many convenient and free processing libraries, of which libgif is the best known, and we can use their functions directly. The second part of this article introduces the use of libgif. If you have special needs you can write a decoder yourself, but this is usually unnecessary.
2.2 PNG
Although GIF is an excellent image format, it caused a lot of disputes on the network because of the patent licensing issues, involving CompuServe, that surround its LZW compression. As a result another image format appeared: PNG. The full name of PNG is 'Portable Network Graphics'. Besides being free to use, PNG offers more features than GIF, such as support for 16-bit and 24-bit color, alpha channels, and gamma correction.
PNG is also transmitted as a data stream and, like GIF, divides the stream into several data blocks. The PNG data stream begins with the signature of the PNG image format. The order of the data blocks that follow is mostly free, except where one data block depends on another. Every PNG stream starts with the PNG file signature, followed by the IHDR data block and then some other data blocks, and ends with the IEND data block.
The structure of each PNG data block is as follows:
Data block length field
Data block type field
Data block content field
CRC check code
The data block length field counts only the bytes of the data block content field; it does not include the length field itself, the data block type field, or the CRC check code field. Each of these three fields has a fixed length of 4 bytes. The data block type field deserves special attention: it consists of four ASCII letters, and the case of each letter has its own meaning, so a PNG decoder must choose how to process a data block according to the case of each letter. An uppercase first letter marks a critical data block; if the decoder does not recognize the type, decoding fails. A lowercase first letter marks an ancillary data block, which the decoder may choose to ignore. The case of the second letter shows whether the data block is a public PNG data block (uppercase) or a private one (lowercase). The third letter is a reserved flag and is always uppercase. The case of the fourth letter shows whether a program that does not recognize the data block may copy it unchanged into a modified image (lowercase) or not (uppercase).
The main PNG data block types are listed below.
Critical data blocks:
IHDR: describes the size, color space, compression algorithm, and row scanning mode of the PNG image; it appears immediately after the PNG file signature.
PLTE: contains the palette used by the image data in the IDAT data blocks when the image uses a color index (as specified in the IHDR data block); it must appear before the IDAT data blocks.
IDAT: the image data, whose representation is specified in IHDR. An image may consist of several IDAT data blocks, but they must be consecutive.
IEND: marks the end of the image and appears at the very end of the file. This data block has no content.
Ancillary data blocks:
bKGD: background color of the image.
cHRM: chromaticity data used to make the displayed colors device-independent.
gAMA: gamma value of the image, used for gamma correction when displaying.
hIST: palette histogram, used when the display must approximate the colors of the image.
pHYs: intended physical pixel size for display.
sBIT: number of significant bits in the pixel data.
tEXt: text stored with the image.
tIME: last modification time of the image.
tRNS: transparency information.
zTXt: compressed text.
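The data block layout and the naming convention described above can be sketched in Python as follows. This is a minimal illustration, not a PNG decoder: the helper names `make_chunk`, `iter_chunks`, and `describe` are hypothetical, and the sample stream is a hand-built 1x1 grayscale image. The CRC is computed over the type field and the content field, using the same CRC-32 that `zlib.crc32` provides.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data=b""):
    """Build one data block: length, type, content, CRC."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return (struct.pack(">I", len(data)) + chunk_type + data
            + struct.pack(">I", crc))


def iter_chunks(stream):
    """Yield (type, content) for each data block, checking every CRC."""
    if stream[:8] != PNG_SIGNATURE:
        raise ValueError("missing PNG signature")
    pos = 8
    while pos < len(stream):
        (length,) = struct.unpack(">I", stream[pos:pos + 4])
        chunk_type = stream[pos + 4:pos + 8]
        data = stream[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", stream[pos + 8 + length:pos + 12 + length])
        if zlib.crc32(chunk_type + data) & 0xFFFFFFFF != crc:
            raise ValueError("CRC mismatch in %r" % chunk_type)
        yield chunk_type.decode("ascii"), data
        pos += 12 + length  # 4 (length) + 4 (type) + content + 4 (CRC)


def describe(chunk_type):
    """Read the four property flags from the case of each letter."""
    return {
        "critical": chunk_type[0].isupper(),
        "public": chunk_type[1].isupper(),
        "reserved_ok": chunk_type[2].isupper(),
        "safe_to_copy": chunk_type[3].islower(),
    }


# Hand-built 1x1, 8-bit grayscale image with one text data block.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
idat = zlib.compress(b"\x00\x80")  # filter byte 0, then one gray pixel
png = (PNG_SIGNATURE
       + make_chunk(b"IHDR", ihdr)
       + make_chunk(b"tEXt", b"Comment\x00hello")
       + make_chunk(b"IDAT", idat)
       + make_chunk(b"IEND"))

for name, data in iter_chunks(png):
    print(name, len(data), describe(name))
```

For example, `describe("tEXt")` reports an ancillary, public data block that is safe to copy, while `describe("IHDR")` reports a critical one.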
Compared with GIF, PNG is more powerful and also more complicated. Fortunately, Linux provides the libpng library, which we can use directly. The second part of this article will also introduce the use of libpng.
2.3 JPEG
Unlike the previous two image formats, JPEG uses lossy compression. It is a file format defined by the Joint Photographic Experts Group and offers a high compression ratio. The name JPEG is the abbreviation of the name of this group.
Unlike the two formats described above, JPEG uses the YCbCr color space, which is similar to the YUV color space used by MPEG. Since the human eye is more sensitive to brightness than to color, using the YCbCr color space allows the color information to be compressed more heavily without a noticeable effect on how the image looks.
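The conversion from RGB to YCbCr is

$$
\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} =
\begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.1687 & -0.3313 & 0.5 \\ 0.5 & -0.4187 & -0.0813 \end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix} +
\begin{bmatrix} 0 \\ 128 \\ 128 \end{bmatrix}
$$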
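The conversion can be written out as a short Python sketch using the coefficients of the matrix above. The function names `rgb_to_ycbcr` and `to_byte` are hypothetical, and rounding and clamping the results to the 0 to 255 range is an assumption made here so that each component fits in one byte.

```python
def to_byte(value):
    """Round to the nearest integer and clamp to the 0..255 range."""
    return max(0, min(255, round(value)))


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit Y, Cb, Cr components."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.4187 * g - 0.0813 * b + 128
    return to_byte(y), to_byte(cb), to_byte(cr)


print(rgb_to_ycbcr(255, 255, 255))  # (255, 128, 128)
print(rgb_to_ycbcr(0, 0, 0))        # (0, 128, 128)
print(rgb_to_ycbcr(255, 0, 0))      # (76, 85, 255)
```

Gray pixels, from black to white, always map to Cb = Cr = 128, so all of their information is carried by the brightness component Y.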
A JPEG image is also organized as a series of data segments. A JPEG file begins with the SOI (start of image) marker and ends with the EOI (end of image) marker. Each data segment begins with a marker, which is a 0xFF byte followed by a marker code that is neither 0x00 nor 0xFF, and the marker code identifies the meaning of the data segment. The main markers are:
FFD8: SOI, start of the image
FFC0: SOF, frame header with the sample precision, height, and width of the image
FFD9: EOI, end of the image, indicating that the file ends
FFD0 ~ FFD7: RST0 ~ RST7, restart markers
FFE0: APP0, data segment reserved for applications
FFFE: COM, comment data segment
FFDB: DQT, quantization table data segment
FFDC: DNL, number of image lines data segment
FFDD: DRI, restart interval data segment
FFC4: DHT, Huffman table data segment
The JPEG coding method is the most complex of the three image formats above and uses many complex mathematical operations. Fortunately, even if you do not understand the mathematics, some understanding of the image format is enough to handle JPEG in your programs, provided you use an existing library: libjpeg.
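The segment structure of a JPEG file can be explored with a small Python sketch that walks the markers. It is a simplified illustration under stated assumptions: the function name `list_segments` and the sample bytes are hypothetical, padding 0xFF bytes between segments are not handled, and the walk stops at the start-of-scan marker (FFDA) because compressed image data follows it. Apart from SOI, EOI, and the restart markers, each marker is followed by a 2-byte big-endian length that counts its own two bytes.

```python
import struct

MARKER_NAMES = {
    0xD8: "SOI", 0xD9: "EOI", 0xC0: "SOF0", 0xC4: "DHT", 0xDB: "DQT",
    0xDC: "DNL", 0xDD: "DRI", 0xDA: "SOS", 0xE0: "APP0", 0xFE: "COM",
}


def list_segments(data):
    """Return (name, payload_length) for each segment up to SOS or EOI."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing start-of-image marker")
    segments = [("SOI", 0)]
    pos = 2
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        code = data[pos + 1]
        name = MARKER_NAMES.get(code, "0x%02X" % code)
        pos += 2
        if code == 0xD9 or 0xD0 <= code <= 0xD7:
            segments.append((name, 0))       # markers without a payload
            if code == 0xD9:
                break
            continue
        (length,) = struct.unpack(">H", data[pos:pos + 2])
        segments.append((name, length - 2))  # length counts its own 2 bytes
        pos += length
        if code == 0xDA:
            break                            # compressed image data follows
    return segments


# Hypothetical stream: SOI, an APP0 segment, a comment, and EOI.
comment = b"hello"
sample = (b"\xff\xd8"
          + b"\xff\xe0" + struct.pack(">H", 16) + b"JFIF\x00" + bytes(9)
          + b"\xff\xfe" + struct.pack(">H", 2 + len(comment)) + comment
          + b"\xff\xd9")

print(list_segments(sample))
# [('SOI', 0), ('APP0', 14), ('COM', 5), ('EOI', 0)]
```

Decoding the image itself still requires the Huffman and quantization tables and the inverse transform, which is the part best left to a library.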
In the second part of this article, we will introduce the three libraries used on Linux to process the three image formats above: libgif, libpng, and libjpeg.
About the author: Kerberos is a software engineer who works at the Software Technology (Beijing) R & D Center. Readers are welcome to get in touch (kerberos@miniGui.org).