I. Introduction to MMX Technology
Intel's MMX (multimedia extension instruction set) technology can significantly improve an application's performance when it processes two-dimensional and three-dimensional graphics and images. MMX technology is suited to complex processing of large amounts of data and large arrays. The basic data units it operates on are the byte, the word, and the double word.
Visual Studio .NET 2003 supports the MMX instruction set, so you do not have to write assembly code: you can use the MMX instructions directly from C++ code. The Intel Software Manuals [1] and the MSDN topics on MMX technology [2] will help you grasp the key points of MMX programming.
MMX technology implements the single-instruction, multiple-data (SIMD) execution model. Consider the following programming task: add a number to every element of a byte array. In a traditional program, the algorithm looks like this:
    for each b in array        // for each element b of the array
    {
        b = b + n;             // add the number n
    }
Let's take a look at its implementation details:
    for each b in array        // for each element b of the array
    {
        load b into a register
        add the number n to the register
        store the result from the register back to memory
    }
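The traditional algorithm above can be sketched in C++ as follows. This is a minimal illustration, not code from the sample projects; the function name AddToEachByte is hypothetical, and the addition wraps around modulo 256 because the elements are unsigned bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Adds n to every byte of data, one element per loop iteration.
// Unsigned 8-bit arithmetic wraps around modulo 256 here.
void AddToEachByte(std::uint8_t* data, std::size_t count, std::uint8_t n)
{
    assert(data != nullptr || count == 0);
    for (std::size_t i = 0; i < count; ++i)
    {
        // Conceptually: load one byte, add n, store one byte
        data[i] = static_cast<std::uint8_t>(data[i] + n);
    }
}
```

Each iteration handles a single byte, which is the cost the MMX version is meant to reduce.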
A processor that supports the MMX instruction set has eight 64-bit registers. Each register can hold 8 bytes, 4 words, or 2 double words. MMX technology also provides an instruction set whose instructions load values into these MMX registers, perform arithmetic and logical operations on them in the registers, and store the results back to memory. With MMX technology, the algorithm above looks like this:
    for each 8 members in array    // take 8 bytes of the array as one group
    {
        load the 8 bytes into an MMX register
        add the number to all 8 bytes in the register with a single CPU instruction
        write the result from the register back to memory
    }
C++ programmers do not have to use the MMX instructions or access the MMX registers directly. Instead, you can use the 64-bit data type __m64 and a set of C++ functions that perform the corresponding arithmetic and logical operations. Deciding which MMX registers to use and optimizing the code is the job of the C++ compiler.
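The 8-bytes-at-a-time algorithm can be sketched with __m64 and the intrinsic functions as shown below. This is an illustration under stated assumptions, not code from the sample projects: the function name AddToEachByteMMX and the macro SKETCH_HAS_MMX are hypothetical, the element count is assumed to be a multiple of 8, and the data is moved with memcpy rather than through a pointer cast. On compilers or targets without MMX intrinsics the sketch falls back to a plain loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__MMX__) || defined(_M_IX86)
#include <mmintrin.h>
#define SKETCH_HAS_MMX 1
#else
#define SKETCH_HAS_MMX 0
#endif

// Adds n to every byte of data, 8 bytes per iteration where MMX is available.
// Assumption of this sketch: count is a multiple of 8.
void AddToEachByteMMX(std::uint8_t* data, std::size_t count, std::uint8_t n)
{
    assert(count % 8 == 0);
#if SKETCH_HAS_MMX
    std::uint8_t lanes[8];
    std::memset(lanes, n, sizeof lanes);        // 8 copies of n
    __m64 addend;
    std::memcpy(&addend, lanes, sizeof addend);
    for (std::size_t i = 0; i < count; i += 8)
    {
        __m64 v;
        std::memcpy(&v, data + i, sizeof v);    // load 8 bytes
        v = _mm_add_pi8(v, addend);             // PADDB: 8 wraparound adds at once
        std::memcpy(data + i, &v, sizeof v);    // store 8 bytes
    }
    _mm_empty();                                // EMMS: release the MMX state
#else
    for (std::size_t i = 0; i < count; ++i)     // portable fallback
        data[i] = static_cast<std::uint8_t>(data[i] + n);
#endif
}
```

The compiler chooses the MMX registers; the source only names __m64 values.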
The Visual C++ MMXSwarm sample [4] in MSDN is a good example of using MMX technology. It contains classes that simplify working with MMX and shows how to process images in various formats (such as monochrome, 24-bit RGB, and 32-bit RGB). This article is only a brief introduction to MMX programming with Visual C++. If you are interested, see the MMXSwarm sample on MSDN.
II. MMX Programming Details
1. Include files
All MMX instruction set functions are available through the emmintrin.h header (it also pulls in mmintrin.h, where the MMX functions themselves are declared):
    #include <emmintrin.h>
Because the MMX instructions used in the program are generated by the compiler itself, there is no need to link any additional .lib library file.
2. The __m64 data type
A variable of this type can be used as an operand of an MMX instruction. Its contents should not be accessed directly. Variables of type __m64 are automatically aligned on 8-byte boundaries.
3. CPU support for the MMX instruction set
If your CPU supports the MMX instruction set, you can use the C++ functions that map onto it. The Visual C++ CPUID sample [3] in MSDN can help you detect whether your CPU supports the SSE and MMX instruction sets or other CPU features.
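A feature check of this kind can be sketched as follows. This is not the code of the CPUID sample: the names CpuFeatures, DetectCpuFeatures and SKETCH_X86 are hypothetical. The sketch reads CPUID leaf 1 and tests EDX bits 23 (MMX), 25 (SSE) and 26 (SSE2), using __cpuid from intrin.h on Visual C++ and __get_cpuid from cpuid.h on GCC or Clang. On non-x86 targets it reports every feature as absent.

```cpp
#include <cassert>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#define SKETCH_X86 1
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define SKETCH_X86 1
#else
#define SKETCH_X86 0
#endif

struct CpuFeatures
{
    bool mmx;
    bool sse;
    bool sse2;
};

// Decodes the MMX, SSE and SSE2 feature bits from EDX of CPUID leaf 1.
CpuFeatures DetectCpuFeatures()
{
    CpuFeatures f = { false, false, false };
#if SKETCH_X86
    unsigned int edx = 0;
#if defined(_MSC_VER)
    int regs[4] = { 0, 0, 0, 0 };               // EAX, EBX, ECX, EDX
    __cpuid(regs, 1);
    edx = static_cast<unsigned int>(regs[3]);
#else
    unsigned int eax = 0, ebx = 0, ecx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return f;                               // leaf 1 is not available
#endif
    f.mmx  = (edx & (1u << 23)) != 0;
    f.sse  = (edx & (1u << 25)) != 0;
    f.sse2 = (edx & (1u << 26)) != 0;
#endif
    return f;
}
```

A program would call DetectCpuFeatures() once at startup and choose the MMX code path only when the mmx flag is set.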
4. Saturation arithmetic and wraparound mode
MMX technology supports a computing mode called saturation arithmetic. In saturation mode, when a result overflows or underflows, the CPU clamps it to the upper limit of the range of the data type (on overflow) or to the lower limit (on underflow). Saturation mode is useful for image processing.
The following example shows the difference between saturation mode and wraparound mode. Suppose a byte variable holds 255 and 1 is added to it. In wraparound mode the result is 0 (the carry is discarded); in saturation mode the result is 255. Underflow is handled in a similar way: for unsigned bytes in saturation mode, 1 minus 2 gives 0 (rather than -1). Each MMX arithmetic instruction has both modes: saturation and wraparound. The projects discussed in this article use only MMX instructions in saturation mode.
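The two modes can be reproduced for a single unsigned byte in plain C++, which makes the difference easy to check. The function names below are hypothetical and the code only models what the MMX instructions do for each byte; it does not use MMX itself.

```cpp
#include <cassert>
#include <cstdint>

// Wraparound addition: the result is reduced modulo 256.
std::uint8_t AddWrap(std::uint8_t a, std::uint8_t b)
{
    return static_cast<std::uint8_t>(a + b);
}

// Saturating addition: results above 255 are clamped to 255.
std::uint8_t AddSaturate(std::uint8_t a, std::uint8_t b)
{
    int sum = a + b;
    return static_cast<std::uint8_t>(sum > 255 ? 255 : sum);
}

// Wraparound subtraction: the result is reduced modulo 256.
std::uint8_t SubWrap(std::uint8_t a, std::uint8_t b)
{
    return static_cast<std::uint8_t>(a - b);
}

// Saturating subtraction: results below 0 are clamped to 0.
std::uint8_t SubSaturate(std::uint8_t a, std::uint8_t b)
{
    return static_cast<std::uint8_t>(a > b ? a - b : 0);
}
```

With these definitions, AddWrap(255, 1) gives 0 and AddSaturate(255, 1) gives 255, while SubSaturate(1, 2) gives 0 and SubWrap(1, 2) gives 255.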
III. The MMX8 Demo Project
MMX8 is a single document interface (SDI) application that performs simple processing of monochrome bitmaps with 8 bits per pixel. The source image and the processed image are displayed in the form. The new ATL (Active Template Library) class CImage is used to load the image from the resources and display it in the form. The program performs two operations: inverting the image colors and changing the image brightness. Each operation can be implemented in one of the following ways:
Pure C++ code; C++ code that uses the MMX functions; code that uses MMX assembly instructions.
The time taken to process the image is displayed in the status bar.
Image color inversion implemented in pure C++:
    void CImg8Operations::InvertImageCPlusPlus(BYTE* pSource, BYTE* pDest, int nNumberOfPixels)
    {
        for (int i = 0; i ...

To find out how to use the C++ MMX functions, I referred to the Intel Software Manuals for the descriptions of the MMX assembly instructions. First I found a general introduction to the MMX instructions in the first volume, and then detailed descriptions of these instructions in the second volume. Some of these descriptions mention the C++ functions that correspond to the instructions. I then looked up those C++ functions in MSDN. The MMX instructions used in the MMX8 sample program and the corresponding C++ functions are shown in the table below:

    Function                                                                | MMX assembly instruction | Visual C++ .NET MMX function
    Clear the MMX registers (to avoid conflicts with floating-point code)   | EMMS                     | _mm_empty
    Subtract the 8 unsigned 8-bit bytes of two 64-bit values (saturating)   | PSUBUSB                  | _mm_subs_pu8
    Add the 8 unsigned 8-bit bytes of two 64-bit values (saturating)        | PADDUSB                  | _mm_adds_pu8
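Because the pure C++ listing is cut off in this copy of the article, the following is only a sketch of how such an inversion could be written, first as a plain loop and then with _mm_subs_pu8 from the table. The names InvertScalar, InvertMMX and SKETCH_HAS_MMX are hypothetical and are not the project's own functions. The sketch assumes that inverting an 8-bit pixel v means computing 255 - v, and it falls back to the plain loop where MMX intrinsics are not available.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__MMX__) || defined(_M_IX86)
#include <mmintrin.h>
#define SKETCH_HAS_MMX 1
#else
#define SKETCH_HAS_MMX 0
#endif

// Scalar reference: each 8-bit pixel value v becomes 255 - v.
void InvertScalar(const std::uint8_t* src, std::uint8_t* dst, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
        dst[i] = static_cast<std::uint8_t>(255 - src[i]);
}

// Same result, 8 pixels per iteration; leftover pixels use the scalar loop.
void InvertMMX(const std::uint8_t* src, std::uint8_t* dst, std::size_t count)
{
    std::size_t i = 0;
#if SKETCH_HAS_MMX
    const __m64 allOnes = _mm_set1_pi8(static_cast<char>(-1)); // 0xFF in every byte
    for (; i + 8 <= count; i += 8)
    {
        __m64 v;
        std::memcpy(&v, src + i, sizeof v);     // load 8 pixels
        v = _mm_subs_pu8(allOnes, v);           // PSUBUSB: 255 - v in each byte
        std::memcpy(dst + i, &v, sizeof v);     // store 8 pixels
    }
    _mm_empty();                                // EMMS
#endif
    InvertScalar(src + i, dst + i, count - i);
}
```

Both functions produce the same output for any pixel count, which makes the scalar version a convenient reference when checking the MMX one.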
Image color inversion implemented with the Visual C++ .NET MMX functions:

    void CImg8Operations::InvertImageC_MMX(BYTE* pSource, BYTE* pDest, int nNumberOfPixels)
    {
        __int64 i = 0;
        i = ~i;                                 // 0xffffffffffffffff

        int nLoop = nNumberOfPixels / 8;        // 8 pixels per loop iteration

        __m64* pIn = (__m64*) pSource;          // input byte array pointer
        __m64* pOut = (__m64*) pDest;           // output byte array pointer
        __m64 tmp;                              // temporary work variable

        _mm_empty();                            // execute the MMX instruction EMMS to initialize the MMX registers

        __m64 n1 = Get_m64(i);

        for (int i = 0; i ...

Although this function executes in a very short time, I recorded the time taken by each of the three approaches. These are the results on my computer:

    Pure C++ code: 43 ms
    C++ code using the MMX functions: 26 ms
    Code using MMX assembly instructions: 26 ms

These times must be measured with an optimized Release build of the program.

I change the brightness of the image in the simplest way: by adding a value to, or subtracting a value from, the color value of each pixel. This function is somewhat more complicated than the previous one, because the processing must be split into two cases: increasing the pixel color values and decreasing them.
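The two cases can be sketched with the saturating intrinsics _mm_adds_pu8 and _mm_subs_pu8. This is an illustration, not the project's code: the names ClampChange, ChangeBrightnessScalar, ChangeBrightnessMMX and SKETCH_HAS_MMX are hypothetical. The sketch assumes the change is limited to the range -255 to 255 and handles leftover pixels, or targets without MMX intrinsics, with the scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

#if defined(__MMX__) || defined(_M_IX86)
#include <mmintrin.h>
#define SKETCH_HAS_MMX 1
#else
#define SKETCH_HAS_MMX 0
#endif

// Limits the brightness change to the range [-255, 255].
int ClampChange(int change)
{
    return change > 255 ? 255 : (change < -255 ? -255 : change);
}

// Scalar reference: add the change to every pixel and clamp to [0, 255].
void ChangeBrightnessScalar(const std::uint8_t* src, std::uint8_t* dst,
                            std::size_t count, int change)
{
    change = ClampChange(change);
    for (std::size_t i = 0; i < count; ++i)
    {
        int v = src[i] + change;
        dst[i] = static_cast<std::uint8_t>(v > 255 ? 255 : (v < 0 ? 0 : v));
    }
}

// 8 pixels per iteration: saturating add when brightening,
// saturating subtract when darkening.
void ChangeBrightnessMMX(const std::uint8_t* src, std::uint8_t* dst,
                         std::size_t count, int change)
{
    change = ClampChange(change);
    std::size_t i = 0;
#if SKETCH_HAS_MMX
    std::uint8_t lanes[8];
    std::memset(lanes, std::abs(change), sizeof lanes);  // 8 copies of |change|
    __m64 delta;
    std::memcpy(&delta, lanes, sizeof delta);
    for (; i + 8 <= count; i += 8)
    {
        __m64 v;
        std::memcpy(&v, src + i, sizeof v);
        v = (change > 0) ? _mm_adds_pu8(v, delta)        // PADDUSB
                         : _mm_subs_pu8(v, delta);       // PSUBUSB
        std::memcpy(dst + i, &v, sizeof v);
    }
    _mm_empty();                                         // EMMS
#endif
    ChangeBrightnessScalar(src + i, dst + i, count - i, change);
}
```

Saturation is what keeps bright pixels at 255 and dark pixels at 0 instead of letting them wrap around to the opposite end of the range.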
Brightness change implemented in pure C++:

    void CImg8Operations::ChangeBrightnessCPlusPlus(BYTE* pSource, BYTE* pDest, int nNumberOfPixels, int nChange)
    {
        if (nChange > 255)
            nChange = 255;
        else if (nChange < -255)
            nChange = -255;

        BYTE b = (BYTE) abs(nChange);

        int i, n;

        if (nChange > 0)                        // increase the pixel color values
        {
            for (i = 0; i ...

Brightness change implemented with the Visual C++ .NET MMX functions:

    void CImg8Operations::ChangeBrightnessC_MMX(BYTE* pSource, BYTE* pDest, int nNumberOfPixels, int nChange)
    {
        if (nChange > 255)
            nChange = 255;
        else if (nChange < -255)
            nChange = -255;

        BYTE b = (BYTE) abs(nChange);

        __int64 c = b;

        for (int i = 1; i <= 7; i++)
        {
            c = c << 8;
            c |= b;
        }

        int nNumberOfLoops = nNumberOfPixels / 8;   // 8 pixels per loop iteration

        __m64* pIn = (__m64*) pSource;          // input byte array
        __m64* pOut = (__m64*) pDest;           // output byte array
        __m64 tmp;                              // temporary work variable

        _mm_empty();                            // execute the MMX instruction EMMS

        __m64 nChange64 = Get_m64(c);

        if (nChange > 0)
        {
            for (i = 0; i ...

    Pure C++ code: 49 ms
    C++ code using the MMX functions: 26 ms
    Code using MMX assembly instructions: 26 ms

IV. The MMX32 Demo Project
The MMX32 project processes RGB images with 32 bits per pixel. The operations are inverting the image colors and changing the color balance of the image (each color component of a pixel is multiplied by a given value). MMX multiplication is much more complicated than addition and subtraction, because the result of a multiplication no longer fits in the size of its operands. For example, if the operands are bytes (8 bits), the result is a word (16 bits). This requires additional conversions, and the difference in image processing time between the code using MMX instructions and the C++ code is not very large (5-10%).
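The widening step can be illustrated with a small sketch. It is not the project's code: ToFixed, ColorBalanceScalar, ColorBalanceMMX and SKETCH_HAS_MMX are hypothetical names. The sketch assumes coefficients between 0.0 and 1.0 scaled by 256, pixels stored as blue, green, red and one unused byte, and a zero coefficient for the unused byte. Each byte is unpacked to a 16-bit word, multiplied, shifted right by 8 bits, and packed back to a byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__MMX__) || defined(_M_IX86)
#include <mmintrin.h>
#define SKETCH_HAS_MMX 1
#else
#define SKETCH_HAS_MMX 0
#endif

// Converts a coefficient in [0.0, 1.0] to fixed point (1.0 becomes 256).
int ToFixed(float coefficient)
{
    assert(coefficient >= 0.0f && coefficient <= 1.0f);
    return static_cast<int>(coefficient * 256.0f);
}

// Scalar reference: multiply each channel, then drop the 8 fraction bits.
void ColorBalanceScalar(const std::uint8_t* src, std::uint8_t* dst,
                        std::size_t pixels, int red, int green, int blue)
{
    const int coeff[4] = { blue, green, red, 0 };   // B, G, R, unused byte
    for (std::size_t i = 0; i < pixels * 4; ++i)
    {
        int product = src[i] * coeff[i % 4];        // up to 255 * 256: needs 16 bits
        dst[i] = static_cast<std::uint8_t>(product >> 8);
    }
}

// One pixel per iteration: widen 4 bytes to 4 words, multiply, narrow again.
void ColorBalanceMMX(const std::uint8_t* src, std::uint8_t* dst,
                     std::size_t pixels, int red, int green, int blue)
{
#if SKETCH_HAS_MMX
    const __m64 zero = _mm_setzero_si64();
    const __m64 coeff = _mm_set_pi16(0, static_cast<short>(red),
                                     static_cast<short>(green),
                                     static_cast<short>(blue));
    for (std::size_t i = 0; i < pixels; ++i)
    {
        int px;
        std::memcpy(&px, src + 4 * i, sizeof px);
        __m64 v = _mm_cvtsi32_si64(px);     // pixel in the low 32 bits
        v = _mm_unpacklo_pi8(v, zero);      // 4 bytes -> 4 zero-extended words
        v = _mm_mullo_pi16(v, coeff);       // PMULLW: 16-bit products
        v = _mm_srli_pi16(v, 8);            // PSRLW: divide by 256
        v = _mm_packs_pu16(v, zero);        // PACKUSWB: 4 words -> 4 bytes
        px = _mm_cvtsi64_si32(v);
        std::memcpy(dst + 4 * i, &px, sizeof px);
    }
    _mm_empty();                            // EMMS
#else
    ColorBalanceScalar(src, dst, pixels, red, green, blue);
#endif
}
```

The limit of 1.0 on the coefficients is what keeps every product within 16 bits; larger coefficients would need a different multiplication scheme.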
Color balance change implemented with the Visual C++ .NET MMX functions:

    void CImg32Operations::ColorsC_MMX(BYTE* pSource, BYTE* pDest, int nNumberOfPixels,
                                       float fRedCoefficient, float fGreenCoefficient, float fBlueCoefficient)
    {
        int nRed = (int)(fRedCoefficient * 256.0f);
        int nGreen = (int)(fGreenCoefficient * 256.0f);
        int nBlue = (int)(fBlueCoefficient * 256.0f);

        // set the multiplication coefficients
        __int64 c = 0;
        c = nRed;
        c = c << 16;
        c |= nGreen;
        c = c << 16;
        c |= nBlue;

        __m64 nNull = _m_from_int(0);           // null
        __m64 tmp = _m_from_int(0);             // temporary work variable, initialized

        _mm_empty();                            // clear the MMX registers

        __m64 nCoeff = Get_m64(c);

        DWORD* pIn = (DWORD*) pSource;          // input double-word array
        DWORD* pOut = (DWORD*) pDest;           // output double-word array

        for (int i = 0; i ...

See the source code of the sample project to learn more about this project.

V. SSE2 Technology
SSE2 technology includes a set of integer instructions similar to those in MMX, and it operates on the 128-bit SSE registers. For example, changing the image color balance with SSE2 technology can be more efficient than using pure C++ code. SSE2 is also an extension of SSE technology: it can process not only single-precision floating-point values but also arrays of double-precision floating-point values. The MMXSwarm sample, implemented in C++, uses not only the MMX functions but also the SSE2 functions for integer operations.

VI. References
[1] Intel Software Manuals: http://developer.intel.com/design/archives/processors/mmx/index.htm
[2] MSDN topic on MMX technology: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclang/html/vcrefsupportformmxtechnology.asp
[3] Microsoft Visual C++ CPUID sample: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vcsample/html/vcsamcpuiddeterminecpuCapabilities.asp
[4] Microsoft Visual C++ MMXSwarm sample: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vcsample/html/vcsamMMXSwarmSampleDemonstratesCImageVisualCsMMXSupport.asp
[5] Matt Pietrek's article in Microsoft Systems Journal: http://www.microsoft.com/msj/0298/HOOD0298.ASPX

Original author: Alex Farber
Original source: http://www.codeproject.com/cpp/mmxintro.asp
Programming examples: http://www.codeproject.com/cpp/mmxintro/MMX_src.zip
Translation source: http://Blog.9cbs.net/hifrog/archive/2004/02/01/21644.aspx
Formatting: http://yonsm.reg365.com/index.php?Job=art&articleid=a_20041008_204042