C++ object memory layout
I am writing this article because I want to figure out how VC lays out each C++ object in memory, and how it carries out pointer conversions.
First, a question: after a conversion between two different pointer types, do the pointers still hold the same value? For example:
int nValue = 10;
int *pInt = &nValue;
void *pVoid = pInt;
char *pChar = (char *)pInt;
Are the values of these pointers the same (the pointer values themselves, not the contents of the memory they point to)? If your answer is yes, what about a class inheritance hierarchy? When a derived-class pointer is converted to a base-class pointer, does the value still stay unchanged? If your answer is "not necessarily, it depends on the hierarchy", congratulations, you need not read on. If you are not sure whether it changes, which conversions change it and which do not, and why, then read on.
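The built-in case from the opening question can be checked directly. The following is a minimal sketch; the helper names AddressOf and BuiltinPointersKeepValue are made up for illustration, and the result is what one would expect on a flat-address platform such as x86.

```cpp
#include <cassert>
#include <cstdint>

// Numeric value of a pointer, so that pointers of different types can be compared.
std::uintptr_t AddressOf(const void* p) { return reinterpret_cast<std::uintptr_t>(p); }

// Repeats the opening example and reports whether all three pointers hold one address.
bool BuiltinPointersKeepValue() {
    int nValue = 10;
    int* pInt = &nValue;
    void* pVoid = pInt;
    char* pChar = reinterpret_cast<char*>(pInt);
    assert(*pInt == 10);  // the pointee is untouched by the conversions
    return AddressOf(pInt) == AddressOf(pVoid) && AddressOf(pVoid) == AddressOf(pChar);
}
```

Calling BuiltinPointersKeepValue() from main and printing the result is enough to try it.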
The C++ standard does not specify the memory layout of objects. Apart from a few small restrictions, the layout of a C++ object is decided entirely by the compiler. Here I discuss only the implementation in VC .NET 2003 (build 7.1.3091). I did not test VC5, VC6, VC .NET 2002 or other 2003 builds, so the conclusions may not hold on those platforms. These are implementation details of the compiler, and MS may change them without notice. Enough preamble, let's start.
For conversions between pointers to C++ built-in types the result needs no discussion, so we discuss only C++ objects. Start from the simplest case.
class CBase { public: int m_nBaseValue; };
Such a class is very simple in memory: it occupies 4 bytes. Nothing more to say. Now derive a class from it.
class CDerive1 : public CBase { public: int m_nDerive1Value; };
How is CDerive1 laid out in memory? Also very simple: it occupies 8 bytes, the first 4 belong to CBase and the last 4 to CDerive1 itself. Converting a CDerive1 pointer to a CBase pointer gives the same value. Now let's add multiple inheritance.
class CFinal : public CDerive, public CBase // here CDerive is a base class much like CBase
{
public:
    int m_nFinalValue;
};
A CFinal object is slightly more complicated in memory, but still easy to picture. It occupies 12 bytes: the first 4 belong to CDerive, the middle 4 to CBase, and the last 4 to CFinal itself. When a CFinal pointer is converted to a CDerive pointer, does the value change? And when it is converted to a CBase pointer? The answer: the first stays the same, the second changes. The reason is obvious. The beginning of a CFinal object is exactly a CDerive object, while the CBase object sits in the middle of the CFinal object, so the pointer has to change. By how much? Adding 4 is enough (after checking for a null pointer, of course).
CBase *pBase = pFinal ? (CBase *)((char *)pFinal + sizeof(CDerive)) : 0; // this is what actually happens when you write pBase = pFinal
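The offsets described above can be measured rather than assumed. This is a minimal sketch: the body of CDerive is a stand-in (the text only says it resembles CBase), and OffsetInCFinal and NullStaysNull are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

class CBase   { public: int m_nBaseValue; };
class CDerive { public: int m_nDeriveValue; };   // stand-in body, assumed to mirror CBase
class CFinal : public CDerive, public CBase { public: int m_nFinalValue; };

// Byte distance from the start of a CFinal object to its Base subobject.
template <class Base>
std::ptrdiff_t OffsetInCFinal() {
    CFinal obj;
    CFinal* pFinal = &obj;
    Base* pBase = pFinal;     // implicit derived-to-base conversion
    assert(pBase != 0);       // a non-null pointer never converts to null
    return reinterpret_cast<char*>(pBase) - reinterpret_cast<char*>(pFinal);
}

// The null check in the expanded conversion: a null CFinal* stays null, it does not become 4.
bool NullStaysNull() {
    CFinal* pFinal = 0;
    CBase* pBase = pFinal;
    return pBase == 0;
}
```

With the layout described in the text, OffsetInCFinal<CDerive>() should give 0 and OffsetInCFinal<CBase>() should give sizeof(CDerive), that is 4.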
Inheritance without virtual is this simple: just add an offset. Now let's see what happens once virtual functions are added.
Again, start from a simple class.
class CBase { public: virtual void VirtualBaseFunction() {} int m_nBaseValue; };
I deliberately did not use a virtual destructor here, because that function is slightly special. The same question again: how much space does CBase take? Still 4 bytes? The answer is no. In my compiler it is 8 bytes, and the extra 4 bytes are __vfptr (the name shown in the Watch window), a pointer to the class's vtable.
What is the vtable, and what is it for? The vtable supports the virtual function mechanism. It is in effect an array of function pointers (not quite an array in the C/C++ sense, because the elements do not necessarily have the same type). Each element points to a virtual function you defined, so a call goes through an intermediate layer and achieves dynamic binding: which function gets called is settled while the program runs rather than when it is compiled, which is the whole point of dynamic binding.
Where is the pointer set? The constructors (including the copy constructor) and the destructor do it. Don't be surprised: the compiler inserts code into these functions to set __vfptr, and if you do not write them, the compiler generates them for you when needed.
To summarize: the vtable supports the virtual function mechanism, and a class that needs this mechanism basically contains one __vfptr pointing to its own vtable. When a virtual function is called, the compiler translates the call like this:
pBase->VirtualBaseFunction(); => pBase->__vfptr[0]();
// 0 is the slot number of your virtual function in the vtable; the compiler decides it
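That constructors install the vtable pointer has a visible consequence which the language itself guarantees: a virtual call made inside a base-class constructor reaches the base version, because at that moment only the base part of the object has been set up. A minimal sketch follows; WhoAmI and m_nSeenInConstructor are names invented for this illustration, not part of the original classes.

```cpp
#include <cassert>

class CBase {
public:
    CBase() { m_nSeenInConstructor = WhoAmI(); }  // virtual call while only the CBase part exists
    virtual int WhoAmI() { return 1; }
    int m_nSeenInConstructor;
};

class CDerive1 : public CBase {
public:
    virtual int WhoAmI() { return 2; }
};

// What the virtual call resolved to while CBase::CBase was running.
int SeenDuringBaseConstruction() {
    CDerive1 d;
    return d.m_nSeenInConstructor;
}

// What the same call resolves to through a CBase* once construction has finished.
int SeenAfterConstruction() {
    CDerive1 d;
    CBase* pBase = &d;
    assert(pBase != 0);
    return pBase->WhoAmI();
}
```

The first function returns 1 and the second returns 2, which matches the picture of each constructor storing its own class's table address in the object.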
By now you can picture what a CBase object looks like. But where is __vfptr, before or after m_nBaseValue? In my compiler it comes before. It is placed first because that saves code when a virtual function is called through a pointer to member function (judging by the generated assembly). The reason is not discussed here; interested readers can look at the book Inside the C++ Object Model.
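Both observations (the extra pointer and its position) can be checked on whatever compiler is at hand. This is only a sketch: CPlain and CVirtual are hypothetical names for the two versions of CBase shown so far, and the numbers depend on the platform. On the 32-bit compiler described here one would expect an overhead of 4 bytes and a member offset of 4.

```cpp
#include <cassert>
#include <cstddef>

class CPlain   { public: int m_nBaseValue; };
class CVirtual { public: virtual void VirtualBaseFunction() {} int m_nBaseValue; };

// Byte offset of m_nBaseValue inside a CVirtual object.
std::ptrdiff_t MemberOffsetInCVirtual() {
    CVirtual obj;
    std::ptrdiff_t offset =
        reinterpret_cast<char*>(&obj.m_nBaseValue) - reinterpret_cast<char*>(&obj);
    assert(offset >= 0);  // a member always lies inside its object
    return offset;
}

// How many bytes the virtual function added to the object (pointer plus any padding).
std::size_t VirtualOverhead() { return sizeof(CVirtual) - sizeof(CPlain); }
```

A member offset equal to sizeof(void*) means the hidden pointer sits in front of the data member.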
Next, let's add inheritance.
class CDerive1 : public CBase { public: virtual void VirtualDerive1Function(); };
At this point you may say the memory layout is the same as without virtual, except that each class gets one more __vfptr. That is wrong. In my compiler the two classes share a single __vfptr, and the vtable holds two pointers: one slot shared by both classes and one that belongs only to CDerive1. What do the calls look like?
pDerive1->VirtualDerive1Function(); => pDerive1->__vfptr[1]();
pDerive1->VirtualBaseFunction(); => pDerive1->__vfptr[0]();
As for converting between the two pointer types, the value still does not change (this is another reason for putting __vfptr at the beginning of the class, because adjusting the this pointer costs time).
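The shared-table picture can be imitated by hand in ordinary code. The sketch below is an emulation, not what the compiler actually emits: Object, Slot, the two tables and the helper functions are all invented names, and unlike a real vtable every slot here has the same function type.

```cpp
#include <cassert>

struct Object;                         // forward declaration
typedef int (*Slot)(Object* self);     // every slot receives "this" explicitly

struct Object {
    const Slot* vfptr;   // plays the role of __vfptr, stored first
    int value;
};

int BaseFunction(Object* self)    { return self->value; }
int Derive1Function(Object* self) { return self->value * 2; }

// One table per "class": slot 0 is shared, slot 1 exists only in the derived table.
const Slot g_baseVtable[]    = { BaseFunction };
const Slot g_derive1Vtable[] = { BaseFunction, Derive1Function };

// What a constructor would do: store the table address in the object.
Object MakeBase(int value)    { Object o; o.vfptr = g_baseVtable;    o.value = value; return o; }
Object MakeDerive1(int value) { Object o; o.vfptr = g_derive1Vtable; o.value = value; return o; }

// The translated call: pObject->__vfptr[slot]()
int CallSlot(Object* pObject, int slot) {
    assert(pObject != 0);
    return pObject->vfptr[slot](pObject);
}
```

For an object made with MakeDerive1(21), CallSlot(&obj, 0) yields 21 and CallSlot(&obj, 1) yields 42. The caller needs only the slot number, never the name of the function.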
Now add multiple inheritance. I won't write out the code: it is the same CFinal, CDerive, CBase hierarchy as above, with one VirtualxxxFunction added to each class. The pointer adjustment is still there, so we only need to look at the vtables. You may say that CDerive and CFinal share one __vfptr, that CBase has its own __vfptr, and that the vtable behind CFinal's __vfptr has 2 slots. That conclusion is correct. You may also say that calling a CBase function through a CFinal pointer needs a pointer adjustment. Yes, you are right. Not only is the this pointer adjusted (this becomes a parameter of the function), the vtable used for the call changes as well:
pFinal->VirtualBaseFunction(); => ((CBase *)((char *)pFinal + sizeof(CDerive)))->__vfptr[0]();
Translated to assembly, the code is roughly this:
mov eax, [pFinal]   ; pFinal is a local variable, so [pFinal] is really [ebp - xx]
add eax, 8          ; 8 = sizeof(CDerive)
mov ecx, eax        ; ecx is the this pointer
mov edx, [eax]      ; edx = vtable address
call [edx]          ; call vtable[0]
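The same adjustment can be observed from C++ without reading the disassembly. In this sketch VirtualBaseFunction is changed to return the this pointer it receives, which the original function does not do; that change and the two helper names are assumptions made only for the demonstration.

```cpp
#include <cassert>
#include <cstddef>

class CDerive { public: virtual void VirtualDeriveFunction() {} int m_nDeriveValue; };
class CBase {
public:
    virtual const void* VirtualBaseFunction() { return this; }  // report the this it was given
    int m_nBaseValue;
};
class CFinal : public CDerive, public CBase { public: int m_nFinalValue; };

// Distance between pFinal and the this pointer seen inside VirtualBaseFunction.
std::ptrdiff_t ThisAdjustmentOnCall() {
    CFinal obj;
    CFinal* pFinal = &obj;
    const char* seen = static_cast<const char*>(pFinal->VirtualBaseFunction());
    assert(seen != 0);
    return seen - reinterpret_cast<const char*>(pFinal);
}

// Distance produced by the plain pointer conversion pBase = pFinal.
std::ptrdiff_t AdjustmentOnConversion() {
    CFinal obj;
    CFinal* pFinal = &obj;
    CBase* pBase = pFinal;
    return reinterpret_cast<const char*>(pBase) - reinterpret_cast<const char*>(pFinal);
}
```

Both functions should report the same positive number: 8 on the compiler discussed here, larger on a 64-bit target.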
So the this pointer has to be adjusted here as well. Inheritance with virtual functions is not complicated, and the this adjustment is simple too. The most complex part is virtual inheritance.
My compiler supports virtual inheritance the same way it supports virtual functions: through a table. We just cannot see the name VC gives this table, so let's call it the vbtable. The compiler also adds to the class a pointer to the vbtable, which we will call __vbptr. Each entry in the vbtable corresponds to a base class and records an offset for that base class; from this offset the location of the specific base-class pointer can be calculated. A simple example:
class CBase { public: virtual ~CBase() {} };
class CMid1 : public virtual CBase { public: virtual ~CMid1() {} int m_nMid1; };
class CMid2 : public virtual CBase { public: virtual ~CMid2() {} int m_nMid2; };
class CFinal : public CMid1, public CMid2 { public: virtual ~CFinal() {} int m_nFinal; };
CFinal final;
CFinal *pFinal = &final; // pFinal = 0x0012FEB4
CBase *pBase = pFinal;   // pBase = 0x0012FEC8 = pFinal + 0x14
CMid1 *pMid1 = pFinal;   // pMid1 = 0x0012FEB4 = pFinal
CMid2 *pMid2 = pFinal;   // pMid2 = 0x004210b4 = pFinal
Surprised by the result? The strangest part is that CMid2 and CMid1 have the same address. This is because VC puts the pointer to the vbtable at the beginning of the CFinal object, and CMid1 and CMid2 also have to use this vbtable, so these three addresses must be the same.
How was the address of CBase obtained? I just said that the vbtable pointer is placed at the beginning of CFinal (does VC always put it at the beginning? Not necessarily; this is explained a little later). On my machine the first DWORD of final is 0x00426030. Looking at that address, the first DWORD there is 0 and the second is 0x14, exactly the offset of pBase. That the numbers line up this neatly is a coincidence of this hierarchy, and they may differ if you change the class hierarchy; I only want to show that the offset of the base class is tied to the values in the vbtable.
Let's look at how VC calculates these offsets. When VC analyzes our code, it generates information about each class's inheritance hierarchy, including a _PMD structure called thisDisplacement:
struct _PMD // totally undocumented
{
    int mdisp; // I think this means multi-inheritance displacement
    int pdisp; // pointer to vbtable displacement
    int vdisp; // vbtable displacement
};
The name of the structure and the names of its members are VC's own (enter (_PMD *)0 in the Watch window to see the details of the structure); the meaning of each field is my guess. mdisp is probably the offset used for multiple inheritance (single inheritance included), pdisp is the offset of the vbtable pointer within the object, and vdisp is the position of the entry within the vbtable.
How does this structure complete a pointer conversion? Suppose we have a derived-class pointer pFinal that must be converted to a particular base class. First we need the _PMD structure corresponding to that base class (I have not found a convenient way to obtain it; the method I use is described below). With that information the conversion is easy. First find the vbtable address, *(pFinal + pdisp). Then find the offset of the base class, *(*(pFinal + pdisp) + vdisp). This offset is relative to the vbtable pointer, so add the offset of the vbtable pointer, pdisp, and finally add the mdisp offset, as follows:
char *pFinal = xxx; // needs an initial value
char *pBase;        // this is what we must calculate
pBase = pFinal + mdisp + *(int *)(*(int *)(pFinal + pdisp) + vdisp) + pdisp;
Note: pdisp < 0 means the class does not use virtual inheritance; in that case use pFinal + mdisp directly. So this is a general-purpose structure, used for type conversion whether or not virtual inheritance is involved.
Through this structure we can also see how VC lays out an object.
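To inspect the layout of the virtual-inheritance example on another compiler, the subobject offsets can simply be measured. This is a sketch only: Offsets, MeasureOffsets and RoundTripsThroughBase are hypothetical names, and the numbers it reports are specific to the compiler and target, so they need not match the addresses listed earlier for VC 7.1.

```cpp
#include <cassert>
#include <cstddef>

class CBase { public: virtual ~CBase() {} };
class CMid1 : public virtual CBase { public: virtual ~CMid1() {} int m_nMid1; };
class CMid2 : public virtual CBase { public: virtual ~CMid2() {} int m_nMid2; };
class CFinal : public CMid1, public CMid2 { public: virtual ~CFinal() {} int m_nFinal; };

// Byte offsets of each base subobject from the start of a CFinal object.
struct Offsets { std::ptrdiff_t base, mid1, mid2; };

Offsets MeasureOffsets() {
    CFinal obj;
    CFinal* pFinal = &obj;
    CBase* pBase = pFinal;   // goes through the virtual-base machinery
    CMid1* pMid1 = pFinal;
    CMid2* pMid2 = pFinal;
    const char* start = reinterpret_cast<const char*>(pFinal);
    Offsets r;
    r.base = reinterpret_cast<const char*>(pBase) - start;
    r.mid1 = reinterpret_cast<const char*>(pMid1) - start;
    r.mid2 = reinterpret_cast<const char*>(pMid2) - start;
    assert(r.base >= 0 && r.mid1 >= 0 && r.mid2 >= 0);  // subobjects lie inside the object
    return r;
}

// Whatever the layout, dynamic_cast from the virtual base must recover the CFinal pointer.
bool RoundTripsThroughBase() {
    CFinal obj;
    CBase* pBase = &obj;
    return dynamic_cast<CFinal*>(pBase) == &obj;
}
```

Printing the three fields of MeasureOffsets() gives a quick picture of where each part of the object was placed.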
Seeing this, you may heave a sigh: good grief, does one type conversion really take this much trouble? Can't I just write pBase = pFinal? Congratulations, you have not been fooled, haha. In fact, when you write that statement the compiler does this conversion for you, generating roughly the following code.
mov eax, [pFinal]   ; address of final
mov ecx, [eax]      ; vbtable address, *(int *)(pFinal + pdisp)
mov edx, eax        ; save to edx
add edx, [ecx + 4]  ; [ecx + 4] is *(int *)(*(int *)(pFinal + pdisp) + vdisp)
mov [pBase], edx    ; edx = pFinal + mdisp + *(int *)(*(int *)(pFinal + pdisp) + vdisp) + pdisp
                    ; here mdisp = 0, pdisp = 0, vdisp = 4
You may ask: why do I need to know all this? I convert directly when I need to, and the compiler does the work. Indeed, most of the time that is so. But sometimes it is not. Suppose you have to implement a function AdjustPointer that takes a pointer and a _PMD structure and produces another pointer. Then this is the only way to do it, because I have not given you the type names of the two pointers, and a type name in string form would be of no use to you anyway. You may say a template could do it, and it is true that a template can implement this function, but the implementation details are not discussed here.
You may also ask when such a function would ever be needed. In fact this function really exists; it is just not implemented by you but by the people at MS. Write a program that uses C++ exceptions, disassemble it with IDA, and you can find this function. It is used to create the object needed by a catch during exception handling. For the details, please stay tuned: I will write about how VC implements C++ exceptions as soon as I can.
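A function of the kind just described might look like the sketch below. It follows the formula given in this article, not the actual MS implementation, which has not been checked here. Pmd mirrors the three fields of _PMD, while FakeObject and g_fakeVbtable are invented stand-ins so that the function can be exercised without relying on any compiler's real layout.

```cpp
#include <cassert>
#include <cstring>

// Mirrors the three _PMD fields quoted in the text; the meanings are the author's guesses.
struct Pmd {
    int mdisp;  // plain displacement
    int pdisp;  // offset of the vbtable pointer inside the object, negative if there is none
    int vdisp;  // byte offset of the entry inside the vbtable
};

// Hypothetical helper applying the article's formula to a raw object address.
char* AdjustPointer(char* pObject, const Pmd& pmd) {
    if (pObject == 0) return 0;                          // null stays null
    if (pmd.pdisp < 0) return pObject + pmd.mdisp;       // no virtual inheritance involved
    const int* pVbtable;
    std::memcpy(&pVbtable, pObject + pmd.pdisp, sizeof pVbtable);   // *(pObject + pdisp)
    assert(pVbtable != 0);
    int offset;
    std::memcpy(&offset, reinterpret_cast<const char*>(pVbtable) + pmd.vdisp, sizeof offset);
    return pObject + pmd.mdisp + offset + pmd.pdisp;
}

// Hand-built stand-in for an object whose first field is a vbtable pointer.
struct FakeObject {
    const int* vbptr;
    char payload[32];
};
static const int g_fakeVbtable[] = { 0, 0x14 };   // same two values as in the example above
```

With a FakeObject whose vbptr points at g_fakeVbtable and a Pmd of {0, 0, 4}, the result is the object address plus 0x14, the same arithmetic as in the listing above. memcpy is used for the reads so that the sketch does not depend on pointer size or alignment.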
Finally, let's talk about how to obtain the _PMD structure. Don't be surprised, the method is rather troublesome. Say I want the _PMD information related to the CFinal class. Create a new project and write a statement like throw pFinal. Compile, set a breakpoint on that statement, run, switch to the disassembly and step into the __CxxThrowException@8 function. There you will see something called pThrowInfo (if you cannot see it, turn on the "show symbol names" option). Enter pThrowInfo in the Watch window, expand it, find pCatchableTypeArray and note the value of its nCatchableTypes. Then enter in the Watch window
pThrowInfo->pCatchableTypeArray->arrayOfCatchableTypes[0] through pThrowInfo->pCatchableTypeArray->arrayOfCatchableTypes[n], where n is the value you just noted minus 1. Expand them and you will see a data member called thisDisplacement; expand that further to see mdisp and the rest. It really is a lot of trouble. Ha, you guessed it: this is related to exceptions.