Member Function Pointers and High-Performance C++ Delegates (Part 2)

zhaozj2021-02-16  53

Member Function Pointers and the Fastest Possible C++ Delegates

Author: Don Clugston

Translator: Zhou Xiang (continued from the previous part)

Member function pointers: why are they so complicated?

Member functions of a class differ from standard C functions. In addition to the explicitly declared parameters, a member function has a hidden parameter, this, which points to an instance of the class. Depending on the compiler, this may be treated internally as a normal parameter, or it may receive special treatment (for example, in VC++, this is generally passed in the ECX register, while the ordinary parameters of the member function are pushed onto the stack). The this parameter is fundamentally different from ordinary parameters. Even though a member function is an ordinary function underneath, standard C++ gives you no way to make an ordinary function behave like a member function, because there is no thiscall keyword to make it use the correct calling convention. Member functions are one thing and ordinary functions are another (member functions are from Mars, ordinary functions are from Venus).

You might guess that a member function pointer, like a normal function pointer, is just a code pointer. That guess would be wrong. In most compilers, a member function pointer is larger than a normal function pointer. Most surprisingly, in Visual C++ a member function pointer can be 4, 8, 12, or even 16 bytes long, depending on the nature of the class it is associated with and on the compiler settings in use! Member function pointers are more complicated than you might imagine, but it was not always so.

Let us go back to the early 1980s. The earliest C++ compiler, CFront, had just been developed, and the language supported only single inheritance. When member function pointers were first introduced, they were very simple: they were just like ordinary function pointers, with an additional this as the first parameter. You could convert a member function pointer into a normal function pointer, provided you paid proper attention to this extra parameter.
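To make the "extra this parameter" idea concrete, here is a minimal sketch in standard C++. The names Counter, Counter_add, call_member, and call_free are hypothetical and chosen only for illustration; the free function mimics what a single-inheritance member function looks like once this is written out as an explicit first parameter. It does not convert one pointer type into the other, which standard C++ does not allow.

```cpp
struct Counter {
    int base;
    // 'this' is passed implicitly by the compiler.
    int add(int x) { return base + x; }
};

// Hypothetical free-function equivalent: 'this' becomes an explicit first parameter.
int Counter_add(Counter* self, int x) { return self->base + x; }

// Call through a member function pointer.
int call_member(Counter& c, int x) {
    int (Counter::*pm)(int) = &Counter::add;
    return (c.*pm)(x);
}

// Call through an ordinary function pointer that takes the object explicitly.
int call_free(Counter& c, int x) {
    int (*pf)(Counter*, int) = &Counter_add;
    return pf(&c, x);
}
```

Both calls compute the same result for the same object and argument; only the way the object pointer reaches the function differs.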

This idyllic world was shattered by CFront 2.0, which introduced templates and multiple inheritance. Part of the damage caused by multiple inheritance was a change to member function pointers. The problem is that, with multiple inheritance, you do not know which this pointer to use until you make the call. For example, suppose you have four classes defined as follows:

class A {
public:
    virtual int Afunc() { return 2; }
};

class B {
public:
    int Bfunc() { return 3; }
};

// C is a single-inheritance class; it inherits only from A
class C : public A {
public:
    int Cfunc() { return 4; }
};

// D uses multiple inheritance
class D : public A, public B {
public:
    int Dfunc() { return 5; }
};

Suppose we create a member function pointer for class C. In this example, both Afunc and Cfunc are member functions of C, so our member function pointer can point to either Afunc or Cfunc. But Afunc requires a this pointer that points to C::A (I will call it Athis), while Cfunc requires a this pointer that points to C (I will call it Cthis). Compiler designers handle this situation with a trick: they guarantee that A is physically stored at the start of C (that is, the start address of a C object is also the start address of its A sub-object). This means that Athis == Cthis. We only have one this pointer to worry about, and for now all is well.

Now suppose we create a member function pointer for class D. In this case, our member function pointer can point to Afunc, Bfunc, or Dfunc. But Afunc requires a this pointer that points to D::A, while Bfunc requires a this pointer that points to D::B. This time the trick does not work: we cannot put both A and B at the start of D. Therefore, a member function pointer for class D must specify not only which function to call but also which this pointer to use. The compiler knows how much space A occupies, so it can convert an Athis pointer into a Bthis pointer by adding an offset (a delta equal to the size of A).
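The this adjustment described above can be observed in standard C++. The following is a minimal sketch, not a description of any particular compiler's internals. It reuses the class names A, B, and D from the listing, with one assumption added for illustration: B is given a data member (b_value) so that it is not an empty class and therefore has to occupy its own space inside D. The helper names offset_of_B_in_D and call_on_D are hypothetical.

```cpp
#include <cstddef>

class A { public: virtual int Afunc() { return 2; } };

class B {
public:
    int b_value = 3;                 // added so that B is not an empty base
    int Bfunc() { return b_value; }  // reads through 'this', so 'this' must point at the B part
};

class D : public A, public B { public: int Dfunc() { return 5; } };

// Byte distance from the start of a D object to its B sub-object.
std::ptrdiff_t offset_of_B_in_D(D& d) {
    const char* dthis = reinterpret_cast<const char*>(&d);
    const char* bthis = reinterpret_cast<const char*>(static_cast<B*>(&d));
    return bthis - dthis;
}

// Call any parameterless, int-returning member of D through a member function pointer.
// &D::Afunc and &D::Bfunc convert implicitly to int (D::*)().
int call_on_D(D& d, int (D::*pm)()) { return (d.*pm)(); }
```

On common ABIs the offset of B inside D is non-zero, which is why a member function pointer for D that targets Bfunc has to carry (or otherwise encode) an adjustment, while one that targets Afunc or Dfunc does not.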

If you use virtual inheritance (that is, virtual base classes), the situation gets much worse, and you need not torture yourself trying to understand why. Typically, the compiler uses a virtual function table ("vtable") that stores, for each virtual function, the address of the function and a virtual_delta: the number of bytes that must be added to the current this pointer to convert it into the this pointer required by the actual function.

In summary, to support the general form of a member function pointer you need at least three pieces of information: the address of the function, the delta to be added to the this pointer, and an index into the virtual function table. For MSVC, you need a fourth piece of information: the address of the virtual function table (vtable).

Implementations of member function pointers

So how do compilers implement member function pointers? The table below shows the results of applying the sizeof operator, under a variety of 32-, 64-, and 16-bit compilers, to several types: an int, a void* data pointer, a code pointer (that is, a pointer to a static function), and member function pointers to classes with single, multiple, virtual, and unknown (not yet defined) inheritance:

Compiler        Options     int  DataPtr  CodePtr  Single  Multi  Virtual  Unknown
MSVC                        4    4        4        4       8      12       16
MSVC            /vmg        4    4        4        16#     16#    16#      16
MSVC            /vmg /vmm   4    4        4        8#      8#     -        8#
Intel_IA32                  4    4        4        4       4      12       12
Intel_IA32      /vmg /vmm   4    4        4        4       8      -        8
Intel_Itanium               4    8        8        8       12     20       20
G++                         4    4        4        8       8      8        8
Comeau                      4    4        4        8       8      8        8
DMC                         4    4        4        4       4      4        4
BCC32                       4    4        4        12      12     12       12
WCL386                      4    4        4        12      12     12       12
CodeWarrior                 4    4        4        12      12     12       12
XLC                         4    8        8        20      20     20       20
DMC             small       2    2        2        2       2      2        2
DMC             medium      2    2        4        4       4      4        4
WCL             small       2    2        2        6       6      6        6
WCL             compact     2    4        2        6       6      6        6
WCL             medium      2    2        4        8       8      8        8
WCL             large       2    4        4        8       8      8        8

Note: # means the size is 4, 8, or 12 when the __single_inheritance / __multiple_inheritance / __virtual_inheritance keyword is used.

The compilers are Microsoft Visual C++ 4.0 to 7.1 (.NET 2003), GNU G++ 3.2 (MinGW binaries, http://www.mingw.org/), Borland BCB 5.1 (http://www.borland.com/), Open Watcom (WCL) 1.2 (http://www.openwatcom.org/), Digital Mars (DMC) 8.38n (http://www.digitalmars.com/), Intel C++ 8.0 for Windows IA-32, Intel C++ 8.0 for Itanium (http://www.intel.com/), IBM XLC for AIX (Power, PowerPC), Metrowerks CodeWarrior 9.1 for Windows (http://www.metrowerks.com/), and Comeau C++ 4.3 (http://www.comeaucomputing.com/). Comeau's figures apply to all of its supported 32-bit platforms (x86, Alpha, SPARC, etc.). The 16-bit compilers were tested under four DOS configurations (tiny, compact, medium, and large) to show the effect of different code and data pointer sizes. MSVC was also tested with the /vmg option, which gives member pointers their full generality. (If you have a compiler that is not in this list, please let me know. Test results for compilers on non-x86 processors are especially valuable.)

Are you surprised by the data in the table? You can clearly see that code written for one environment may not work under another compiler. The internal implementations obviously differ a great deal between compilers; in fact, I do not think compilers differ this much in their implementation of any other language feature.

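Figures like those in the table can be measured on your own compiler with sizeof. The sketch below is only an illustration of how such a measurement might be set up; the class names (Single, Base1, Base2, Multi, Virt, Unknown), the PointerSizes struct, and the measure function are all hypothetical, and the numbers it reports depend entirely on the compiler, target, and options used.

```cpp
#include <cstddef>

class Single { public: int f() { return 1; } };
class Base1  { public: int x; };
class Base2  { public: int y; };
class Multi : public Base1, public Base2 { public: int f() { return 2; } };
class Virt  : public virtual Base1 { public: int f() { return 3; } };
class Unknown;   // declared but never defined: "unknown" inheritance

struct PointerSizes {
    std::size_t data_ptr, code_ptr, single, multi, virt, unknown;
};

// Collect the sizes of the pointer kinds compared in the table.
PointerSizes measure() {
    return {
        sizeof(void*),              // data pointer
        sizeof(void (*)()),         // code pointer
        sizeof(int (Single::*)()),  // single inheritance
        sizeof(int (Multi::*)()),   // multiple inheritance
        sizeof(int (Virt::*)()),    // virtual inheritance
        sizeof(int (Unknown::*)()), // class not yet defined
    };
}
```

Printing the fields of the returned struct under different compilers and switches gives one row of such a table per configuration.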
Digging into the implementation details reveals some oddities. Typically, a compiler assumes the worst case and always uses the most general form. For example, compilers use structures like the following:

// Borland (default) and Watcom C++.
struct {
    FunctionPointer m_func_address;
    int m_delta;
    int m_vtable_index;  // 0 if there is no virtual inheritance
};

// Metrowerks CodeWarrior uses a slightly different layout.
// It uses this structure even in Embedded C++ mode, where multiple inheritance is not allowed!
struct {
    int m_delta;
    int m_vtable_index;  // -1 if there is no virtual inheritance
    FunctionPointer m_func_address;
};

// An early SunCC version apparently used yet another ordering:
struct {
    int m_vtable_index;              // 0 if it is not a virtual function
    FunctionPointer m_func_address;  // 0 if it is a virtual function
    int m_delta;
};

// Microsoft's compiler uses this for unknown inheritance, or when the /vmg option is used:
struct {
    FunctionPointer m_func_address;
    int m_delta;
    int m_vtordisp;
    int m_vtable_index;  // 0 if there is no virtual inheritance
};

// IBM's XLC compiler on AIX (PowerPC):
struct {
    FunctionPointer m_func_address;  // 64 bits on PowerPC
    int m_vtable_index;
    int m_delta;
    int m_vtordisp;
};

// GNU g++ uses a space optimization:
struct {
    union {
        FunctionPointer m_func_address;  // always a multiple of 4
        int m_vtable_index_2;            // always odd
    };
    int m_delta;
};

In almost all compilers, delta and vindex are used to adjust the this pointer passed to the function. For example, Borland's calculation is:

adjustedthis = *(this + vindex - 1) + delta   // if vindex != 0

adjustedthis = this + delta                   // if vindex == 0

(Here "*" reads the value stored at the address, and adjustedthis is the adjusted this pointer. Translator's note)

Borland applies an optimization: if the class uses only single inheritance, the compiler knows that delta and vindex are always 0, so it can skip the calculation above.
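The Borland-style calculation can be written out as a small toy model. This is only a sketch of the formula as stated in the text, working on raw byte addresses; it is not Borland's generated code. The names BorlandStyleMfp and adjust_this are hypothetical, and the model assumes that the value read at offset vindex - 1 inside the object is itself a pointer.

```cpp
#include <cstring>

// Hypothetical model of the two adjustment fields of the three-field layout.
struct BorlandStyleMfp {
    int delta;
    int vindex;   // 0 means "no virtual inheritance"
};

// Apply the adjustment to a raw object address.
char* adjust_this(char* self, const BorlandStyleMfp& p) {
    if (p.vindex == 0) {
        return self + p.delta;            // adjustedthis = this + delta
    }
    // adjustedthis = *(this + vindex - 1) + delta
    char* base = nullptr;
    std::memcpy(&base, self + p.vindex - 1, sizeof base);  // read the stored pointer
    return base + p.delta;
}
```

With vindex equal to 0, the result is simply the object address plus delta. Otherwise, the pointer stored inside the object at the given position is read first and delta is added to that.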

The GNU compiler uses a strange optimization. Clearly, with virtual inheritance you have to look in the vtable to get the voffset (virtual function offset) needed to calculate the this pointer. While you are doing that, you might as well store the function pointer in the vtable too. With this in mind, the compiler merges m_func_address and m_vtable_index into one field (that is, a union), and tells the two apart by ensuring that a function pointer (m_func_address) is always even, while the virtual function table index (m_vtable_index_2) is always odd. The calculation is:

adjustedthis = this + delta

if (funcadr & 1)    // if it is odd

    call (*(*delta + (vindex + 1)/2) + 4)

else                // if it is even

    call funcadr

(Here funcadr is the result of dividing the function address by 2. Translator's note)
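The idea of sharing one field between a function address and a vtable index, and telling them apart by the lowest bit, can be modeled in a few lines. This is a simplified sketch under stated assumptions, not g++'s actual encoding: function addresses are assumed to be even, and a vtable index is stored as index * 2 + 1. All names (TaggedTarget, encode_address, encode_vtable_index, decode) are hypothetical.

```cpp
#include <cstdint>

// Result of decoding the shared field.
struct TaggedTarget {
    bool is_virtual;
    std::uintptr_t value;   // function address, or vtable index
};

// Non-virtual case: the address is stored as-is. Assumes (addr & 1) == 0.
std::uintptr_t encode_address(std::uintptr_t addr) { return addr; }

// Virtual case (assumed encoding for this sketch): make the stored word odd.
std::uintptr_t encode_vtable_index(std::uintptr_t index) { return index * 2 + 1; }

// The low bit says which interpretation applies.
TaggedTarget decode(std::uintptr_t word) {
    if (word & 1) {
        return {true, (word - 1) / 2};   // odd: recover the vtable index
    }
    return {false, word};                // even: a plain function address
}
```

The design relies on code addresses being aligned, so the low bit is free to serve as a tag and no extra field is needed.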

Intel's Itanium compiler (but not their x86 compiler) also uses the unknown_inheritance structure for virtual inheritance, so a virtual-inheritance pointer is 20 bytes long, not the 16 bytes you would expect.

// Itanium: unknown and virtual inheritance.
struct {
    FunctionPointer m_func_address;  // 64 bits on Itanium
    int m_delta;
    int m_vtable_index;
    int m_vtordisp;
};

I cannot guarantee that Comeau C++ uses the same technique as GNU, nor can I say whether it uses short instead of int to shrink the member function pointer to 8 bytes. Recent Comeau C++ releases accept Microsoft's compiler keywords (I think it merely ignores these keywords rather than acting on them).

The Digital Mars compiler (originally Zortech C++, later Symantec C++) uses a different optimization. For a single-inheritance class, a member function pointer is just the address of the function. With more complex inheritance, the member function pointer points to a "thunk" function, which makes the necessary adjustments to the this pointer and then calls the real member function. Whenever multiple inheritance is involved, each member function gets such a thunk. This is very efficient for function calls, but it means that, when multiple inheritance is used, casting a derived-class member function pointer to a base-class member function pointer does not work. It also means that such a compiler has to generate more code than other compilers.

Many compilers for embedded systems do not allow multiple inheritance. Such compilers avoid all of these problems: a member function pointer is just a normal function pointer with a hidden this parameter.

Microsoft's "smallest for class" method

Microsoft's compiler uses an optimization similar to Borland's. Both handle single inheritance with optimal efficiency. But unlike Borland, Microsoft by default omits the entries that would always be 0. I call this technique the "smallest for class" method: for a single-inheritance class, a member function pointer stores only the address of the function (m_func_address), so it is 4 bytes long. For a multiple-inheritance class it is 8 bytes long, because the offset (m_delta) is also stored. With virtual inheritance, 12 bytes are used. This method does save space, but it causes other problems.

First, converting a member function pointer between a derived class and a base class changes the size of the pointer, so information can be lost. Second, when a member function pointer is declared before its class is defined, the compiler must decide how much space to allocate for the pointer. It cannot do this safely, because it cannot know the class's inheritance type until the class is defined. Intel C++ and early Microsoft compilers simply guess the size of the pointer. If the guess is wrong in some source file, your program will crash inexplicably at run time. Microsoft therefore added reserved words to its compiler: __single_inheritance, __multiple_inheritance, and __virtual_inheritance. It also added compiler switches, such as /vmg, which make all member function pointers the same size, with the unused fields filled with 0. Borland's compiler also added compiler switches but did not add new keywords. Intel's compiler recognizes Microsoft's keywords, but ignores them in cases where it can find the class definition.

For MSVC, the compiler also needs to know where the class's vtable is stored; for this there is an offset from the this pointer (vtordisp). This value is constant for all member functions of a given class, but differs from class to class. For MSVC, the adjusted this pointer is calculated from the original this pointer as follows:

if (vindex == 0)    // if there is no virtual inheritance

    adjustedthis = this + delta

else                // if there is

    adjustedthis = this + delta + vtordisp + *(*(this + vtordisp) + vindex)

In the case of virtual inheritance, the vtordisp value is not stored in the __virtual_inheritance pointer; instead, the compiler embeds it in the generated code at the point of the call. For a class of unknown inheritance, however, the compiler cannot determine this value in advance, so it must be stored in the pointer. The compiler therefore has two kinds of virtual-inheritance pointer (__virtual_inheritance and __unknown_inheritance).

In theory, all compiler designers could change and improve their implementation of member function pointers (MFPs). In practice this is not feasible, because it would break a large amount of existing code. Microsoft has published a very old article (http://msdn.microsoft.com/archive/en-us/dnarvc/html/jangrayHood.asp) that explains the implementation details of Visual C++. It was written by Jan Gray, who designed the Microsoft C++ object model in 1990. Although the article was published in 1994, it is still relevant, which means the C++ object model went unchanged for 15 years (1990 to 2004).

By now, I think you know more than enough about member function pointers. What is the point of all this? I have established a rule for you. Although compilers implement member function pointers in many different ways, they have one useful thing in common: regardless of the kind of class, the assembly code generated to invoke a member function pointer is exactly the same. The exceptions are the non-standard compilers that use the "smallest for class" technique, and even there the differences are very small. This fact lets us go on to explore how to build high-performance delegates.

(to be continued)
