Arabian Nights VCL: Polymorphism
Worm
We Chinese revere the dragon, and as the saying goes, "the dragon begets nine kinds, each kind different." Which nine? In "Journey to the West", Sun Wukong is told: "The first, the Little Yellow Dragon, lives in the Huai River; the second, the Little Black Dragon, lives in the Ji River; the third, the Green-Backed Dragon, holds the Yangtze; the fourth, the Red-Bearded Dragon, guards the Yellow River; the fifth tolls the bell for the Buddha; the sixth guards the roof ridge of the gods' palace; the seventh guards the sky-supporting pillar for the Jade Emperor; the eighth stays with my eldest brother. This ninth one, the Tuo Dragon, being young and without duties, was sent only last year to live in the Black Water River and cultivate his nature until he made a name for himself and could be posted elsewhere. Who knew that he would disobey my orders and offend the Great Sage." (Note: "Tuo dragon" is the elegant term; the common folk call it the pig dragon, which is the Yangtze alligator.) If you tell them "Let's go", the scene is spectacular: some fly in the sky, some swim in the water, and some crawl on the ground. The concrete form of "go" is different for each, which makes this a typical example of polymorphism: "one interface, multiple implementations".
Polymorphism can be implemented in many ways. C++ supports two of them directly: virtual functions, declared with the keyword virtual, provide late binding, and templates provide static polymorphism. Each has its place. Virtual functions are the more familiar of the two; they are an important means of building class hierarchies, and we have analyzed how they work before [1]. In some situations virtual functions do not perform optimally, so the VCL also provides dynamic functions. They are used exactly like virtual functions: just replace virtual with dynamic. The VCL help file says that, compared with virtual functions, dynamic functions win on space efficiency but lose on time efficiency. Is that really so? How do they work? How should we choose between the two? We will discuss these questions from a fairly plain angle.
Virtual function
Part of a graphics drawing program has the following class hierarchy. For ease of management, the interface is kept separate from the concrete shapes. The shapes are supplied in dynamic link libraries, in the manner of plugins, so that shapes can be added or removed without recompiling the main program.
Figure 1 Shape class hierarchy
The initial declaration of Shape is
class Shape {
private:
    int x0, y0;
protected:
    Shape();
    virtual ~Shape();
public:
    int x() const;
    int y() const;
    virtual void Draw(void*) = 0;
    virtual int Move(int, int);
};
Later, as the feature set grew, two virtual functions were added.
class Shape {
private:
    int x0, y0;
protected:
    Shape();
    virtual ~Shape();
public:
    int x() const;
    int y() const;
    virtual int Move(int, int);
    virtual void Draw(void*) = 0;
    virtual void save(void*) const = 0;
    virtual void load(void*) = 0;
};
Later still, more modifications were made and more virtual functions were added. The problem is that every such change alters the virtual function table (VFT), and the main program must then be recompiled. Worse, once the version is upgraded, classes derived from different versions of Shape can no longer be mixed [2]. That is why we see hard drives full of mfc20.dll, mfc40.dll, mfc42.dll ... none of which can be deleted: this is the DLL garbage left behind by MFC upgrades. What can be done?
A first solution
I once asked this question online, and the replies came down to two suggestions:
use COM; or declare some unused virtual functions in advance to leave room for expansion.
Both methods do solve the problem well, but neither generalizes easily. COM is not suited to building class hierarchies, and reserved slots are an out-and-out "chicken rib": of little use, yet a pity to throw away.
Ultimately, the limitation arises because the VFTs of the parent class and the subclass are strongly coupled: the front part of the subclass's VFT must be identical to the parent's VFT. This requirement is hard to meet when the parent class and the subclass do not live in the same DLL or EXE. Once the parent class changes, a subclass that is not recompiled will cause errors. The remedy, of course, is to remove the coupling between the parent's and the subclass's VFTs. I designed a rather clumsy scheme that nonetheless removes it, by guaranteeing that there are only two virtual functions.
#define dynamic  // "dynamic" expands to nothing; it is only a visual marker
struct Point
{
    int x, y;
};
class dispatch_error {};
class Shape {
private:
    int x0, y0;
protected:
    Shape();
    virtual ~Shape();
    virtual void Dispatch(int id, void* in, void* out);
    // in and out are the input and output parameters of the function; id is the unique tag of each function, i.e. its code
    // In a real application the id need not be an integer; it could be a 128-bit UUID, a string, etc.
public:
    int x() const;
    int y() const;
    dynamic int Move(int dx, int dy)
    {
        int r;
        Point p = {dx, dy};
        Dispatch(-1, &p, &r);
        return r;
    }
    dynamic void Draw(void* hdc) { Dispatch(-2, hdc, 0); }
    // Dispatch is non-const, so the const member casts constness away to call it
    dynamic void Save(void* o) const { const_cast<Shape*>(this)->Dispatch(-3, o, 0); }
    dynamic void Load(void* i) { Dispatch(-4, i, 0); }
};
void Shape::Dispatch(int id, void* in, void* out)
{
    switch (id)
    {
    case -1:
        ...
    case -2:
        ...
    ...
    default:
        throw dispatch_error();  // if the function does not exist, throw an exception
    }
}
If the subclass Triangle wants to override Shape::Draw, it only needs
void Triangle::Dispatch(int id, void* in, void* out)
{
    switch (id)
    {
    ...
    case -2:  // override Shape::Draw
        ...
    ...
    default:
        Shape::Dispatch(id, in, out);  // not handled here: look in the parent class
    }
}
These "dynamic functions" solve the earlier problem: only the destructor and Dispatch are virtual. The VFTs of the parent class and the subclass are no longer coupled, so each can be modified freely without affecting the other.
Commentary
Let's evaluate this solution. It does solve the virtual function problem, but at a price in time efficiency and readability, and that limits how widely it can be applied. It is generally suitable when:
virtual functions are rarely or almost never overridden. This helps reduce the size of the VFT. It does nothing for running speed, since a VFT access is already constant time [3];
the parent class needs to be updated frequently while the subclasses are inconvenient to update, and efficiency requirements are not high. Ordinary applications fit this case.
From the point of view of design patterns, this method is a typical Chain of Responsibility [4]: the call request is passed along starting from the most derived class until it is handled or, at the end of the chain, an exception is thrown. The pattern is widely used; the VCL message map [5] and the COM IDispatch interface [6], for example, are very similar to the solution above.
This solution can be improved further to suit single-rooted frameworks. In a single-rooted class library such as MFC or the VCL, the unique parent class can be found through RTTI, so the data (function codes and pointers) can be separated from the code to simplify the structure. The approach is a typical table-driven one, and several books [7, 8] use it to optimize QueryInterface of the IUnknown interface in COM. We introduce a class DMT to store the function codes and pointers.
#include <cstdarg>
using namespace std;
class DMT {
    char* const ptr;
    const DMT* const parent;
public:
    DMT(const DMT* const, const int, ...);
    ~DMT() { delete[] ptr; }
    short size() const { return *(short*)ptr; }
    const void* find(int) const;
};
Figure 2 DMT diagram
Special attention should be paid to the space allocated for DMT::ptr. On a 32-bit system, n "dynamic functions" need sizeof(short) bytes to store n (red part), sizeof(void*) * n bytes to store the function codes (yellow part), and sizeof(void*) * n bytes to store the function pointers (blue part), for a total of sizeof(short) + 2 * n * sizeof(void*) bytes. For n = 4 that is 2 + 2 * 4 * 4 = 34 bytes. The DMTs of a subclass and its parent class can be connected as a linked list. Let's take a look at the implementation of DMT::find and DMT::DMT.
const void* DMT::find(int i) const
{
    const int* begin = (int*)(ptr + sizeof(short)), *p;
    for (p = begin; p < begin + size(); ++p)
        if (*(int*)p == i)
            return *(void**)(p + size());  // the function pointer sits DMT::size() entries after its function code
    return (parent) ? parent->find(i) : 0;
}
DMT::DMT(const DMT* const p, const int n, ...)
    : ptr(new char[sizeof(short) + 2 * n * sizeof(void*)]),  // size of ptr as described above
      parent(p)
{
    int* i = (int*)(ptr + sizeof(short)), c;
    *(short*)ptr = n;  // store n (red part) at the head
    va_list ap;
    va_start(ap, n);
    for (c = 0; c < n; ++c)
        *(i++) = va_arg(ap, int);
    typedef void (DMT::*temp_type)();
    temp_type temp;
    for (c = 0; c < n; ++c)
    {
        temp = va_arg(ap, temp_type);
        *(i++) = *(int*)&temp;
    }
    va_end(ap);
}
Below we apply the DMT class to the Shape class hierarchy.
class Shape {
private:
    int x0, y0;
    void f_move(void* in, void* out) {...}
protected:
    static const DMT dmt_shape;  // DMT of the Shape class
    const DMT* dmt;              // pointer to the DMT of the object's own class; subclasses reassign it
    Shape() : dmt(&dmt_shape) {...}
    virtual ~Shape();
    void Dispatch(int id, void* in, void* out)  // this time it is not a virtual function!
    {
        void (Shape::*f)(void*, void*);
        *(const void**)&f = dmt->find(id);
        (this->*f)(in, out);
    }
public:
    int x() const;
    int y() const;
    dynamic int Move(int dx, int dy)
    {
        int r;
        Point p = {dx, dy};
        Dispatch(-1, &p, &r);
        return r;
    }
    dynamic void Draw(void* hdc) { Dispatch(-2, hdc, 0); }
    dynamic void Save(void* o) const { const_cast<Shape*>(this)->Dispatch(-3, o, 0); }
    dynamic void Load(void* i) { Dispatch(-4, i, 0); }
};
// direct initialization: DMT owns ptr and must not be copied
const DMT Shape::dmt_shape(0, 4, -1, -2, -3, -4, &Shape::f_move, 0, 0, 0);
The parts highlighted in the original listing mark what changed. If the subclass Triangle wants to override Shape::Draw, it only needs
class Triangle : public Shape {
private:
    void f_draw(void*);
    ...
protected:
    static const DMT dmt_triangle;
    ...
public:
    Triangle() { dmt = &dmt_triangle; ... }
    ...
};
const DMT Triangle::dmt_triangle(&Shape::dmt_shape, ..., -2, ..., &Triangle::f_draw ...);
This is another implementation of "dynamic functions", one in which data and code are separated. Of course, this example has no practical value: it has many problems with static member initialization, calling conventions, readability and so on, and serves only as a demonstration.
Dynamic function
Object Pascal provides two kinds of functions to achieve polymorphism. One is the virtual function we are familiar with; the other, the dynamic function, is language-level support for the "dynamic functions" described above. Some readers use C++ Builder; what does it look like there? In C++ Builder the macro that marks a dynamic function is DYNAMIC, defined as __declspec(dynamic), which is a Borland extension to C++. TControl::Click, TControl::MouseMove and others are dynamic functions. DYNAMIC is used in much the same way as virtual; the only difference I found is that when a subclass overrides a function of the parent class, virtual can be omitted but DYNAMIC cannot. So where is the entry to each class's dynamic functions? Last time we dug out the layout of the VMT, which contains vmtDynamicTable = 0xFFFFFFD0; this tells us where the entry of the dynamic method table (DMT) is. Let's try it.
#include <vcl.h>
#include <iostream>
struct A : private TObject {
    DYNAMIC void f1() = 0;
    void f3() {}
    virtual void f4() {}
    DYNAMIC void f2() = 0;
};
struct B : A {
    DYNAMIC void f1() {}
    DYNAMIC void f2() {}
};
int main()
{
    A* p = new B;
    std::cout << (void*)p << std::endl;
    p->f1();
    p->f2();
    delete p;
}
This program outputs "0118095C". The value may of course differ from machine to machine; in any case, note it down.
The assembly code for p->f1(); is
    push dword ptr [ebp-0x30]
    or   edx, -0x01            ; actually equivalent to mov edx, 0xFFFFFFFF
    mov  eax, [ebp-0x30]
    call System::FindDynaInst(void *, int)
    call eax
    pop  ecx
The assembly code for p->f2(); is
    push dword ptr [ebp-0x30]
    mov  edx, 0xFFFFFFFE
    mov  eax, [ebp-0x30]
    call System::FindDynaInst(void *, int)
    call eax
    pop  ecx
The program is very simple; let us explain it. or edx, -0x01 has exactly the same effect as mov edx, 0xFFFFFFFF, because OR-ing any number with 0xFFFFFFFF of course gives 0xFFFFFFFF. The only difference between the two sequences is mov edx, 0xFFFFFFFF (that is, or edx, -0x01) versus mov edx, 0xFFFFFFFE. As we said when discussing two's complement representation, these are the function codes -1 and -2 of A::f1 and A::f2 respectively. After mov eax, [ebp-0x30] executes, we find that EAX holds the value we noted down a moment ago: the address of the object, which contains a pointer to the VMT entry. So the two arguments passed to System::FindDynaInst are the pointer to the pointer to the VMT entry, in EAX, and the code of the function, in EDX. That is the whole process. Now we care about what happens inside System::FindDynaInst(void *, int). Tracing into it, and after one jump, we arrive at the function itself; its source is _FindDynaInst in Source/Vcl/System.pas.
procedure _FindDynaInst;
asm
        PUSH    EBX
        MOV     EBX,EDX          // EBX holds the function code
        MOV     EAX,[EAX]        // EAX gets the VMT entry address
        CALL    GetDynaMethod    // call GetDynaMethod
        MOV     EAX,EBX
        POP     EBX
        JNE     @@exit
        POP     ECX
        JMP     _AbstractError
@@exit:
end;
Next we have to look at the source code of GetDynaMethod.
procedure GetDynaMethod;
{       function GetDynaMethod(vmt: TClass; selector: Smallint) : Pointer;      }
asm
        { ->    EAX     vmt of class            }
        {       BX      dynamic method index    }
        { <-    EBX     pointer to routine      }
        {       ZF = 0 if found                 }
        {       trashes: EAX, ECX               }
        PUSH    EDI
        XCHG    EAX,EBX                      // swap EAX and EBX
        JMP     @@haveVMT                    // after the swap EBX is the VMT entry address, EAX the function code
@@outerLoop:
        MOV     EBX,[EBX]                    // dereference
@@haveVMT:
        MOV     EDI,[EBX].vmtDynamicTable    // EDI is the entry of the DMT
        TEST    EDI,EDI                      // is there a DMT (is EDI 0)?
        JE      @@parent                     // no DMT: continue the search in the parent class
        MOVZX   ECX,word ptr [EDI]           // read the first two bytes, the number of dynamic functions
        PUSH    ECX
        ADD     EDI,2                        // skip to the yellow part (the function codes)
        REPNE   SCASW                        // search for AX among the codes
        JE      @@found                      // found: jump
        POP     ECX
@@parent:
        MOV     EBX,[EBX].vmtParent          // continue in the parent class
        TEST    EBX,EBX                      // is there a parent class?
        JNE     @@outerLoop                  // yes: keep searching
        JMP     @@exit                       // no: leave
@@found:
        POP     EAX
        ADD     EAX,EAX
        SUB     EAX,ECX                      { this will always clear the Z-flag ! }
        MOV     EBX,[EDI+EAX*2-4]            // fetch the function pointer
@@exit:
        POP     EDI
end;
Is your head spinning? A look at the figure makes it clear. The address that vmtDynamicTable points to is a DMT, whose structure we analyzed earlier; note that here each function code is a 16-bit word, which is why the search uses SCASW. The only part that needs explanation is
        ADD     EAX,EAX          // EAX held n; it now holds 2 * n
        SUB     EAX,ECX          // ECX is the number of codes not yet scanned, which is less than n, so the result is nonzero and ZF is cleared
        MOV     EBX,[EDI+EAX*2-4]
ZF is cleared because _FindDynaInst uses it to decide whether the function was found. As for the address: if the match is the k-th code (counting from 0), then after REPNE SCASW EDI points just past it, at the start of the codes plus 2 * (k + 1), and ECX is n - k - 1, so EAX becomes n + k + 1. EDI + EAX * 2 - 4 is therefore the start of the codes plus 2 * n + 4 * k, which is exactly the k-th entry of the function pointer array that follows the n codes. There is no real need to get entangled in the assembly, though. We already know the principle from the earlier sections, and this is much the same with minor differences.
Conclusion
Reuse in C++ works at the source code level; its support for reuse at the binary level is stretched thin. The widespread use of dynamic link libraries (DLLs) makes solving this problem all the more important. One of the slogans of COM is "COM as a better C++" [7]. Books on COM often list several shortcomings of C++, yet many of them do have solutions. For example:
Problem: different compilers use different name mangling schemes, so modules compiled with different compilers cannot be linked together smoothly. Solution: use a DEF file. Trade-off: it is troublesome and adds to the maintenance burden, but has no impact on program efficiency.
Problem: the size of a class differs between versions, mainly because member variables are added or removed, which leads to errors when space is allocated. Solution: hide the implementation; keep a single void* pointer as the only member variable and allocate the space dynamically at run time. Trade-off: readability and performance suffer.
Adding an ordinary member function is no great problem, but adding virtual functions affects the VFT, which may cause program errors or even a system crash. The solutions have been explained above, and good design is essential.
Suggestions
The root class must be designed with care. From the very beginning of the VCL, changes to the TObject class have been rare; otherwise one small change would ripple through everything and maintainability would drop sharply. Class hierarchies should be as shallow as possible: avoid inheritance and other strongly coupling relationships where you can, and strictly follow the Liskov Substitution Principle (LSP) [9]. If the program only runs under Windows, consider using COM. If you always use a Borland compiler and efficiency requirements are not high, consider using dynamic functions. Reserving a few unused virtual function slots in advance is also a workable approach. Use dynamic functions where they fit; the way the VCL uses them is a good reference.
In addition, the VFT space saved by dynamic functions is negligible, and the DMT itself takes up extra space. Overall, dynamic functions offer no great advantage in either time or space. In my opinion, removing the coupling between the VFTs of the parent class and the subclass is their greatest advantage. Both dialectical materialism and Taoist thought emphasize that everything has two sides. Every method is a double-edged sword, as the saying goes: "Misfortune is what fortune leans on; fortune is where misfortune hides." What we have to do is weigh the pros and cons against the specific environment, play to the strengths and avoid the weaknesses.
References
1. Worm. "Arabian Nights VCL: Opening". C++ View. 2001, 9.
2. George Shepherd, Brad King. Inside ATL. Microsoft Press. 1999.
3. Stanley Lippman. Inside the C++ Object Model. Addison-Wesley, Reading, MA. 1996. Chinese translations: Hou Jie. GOTOP Information Inc. 1998. Hou Jie. Huazhong University of Science and Technology Press. 2001.
4. GoF. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, Reading, MA. 1995. Chinese translation: Li Yingjun et al. China Machine Press. 2000.
5. cker. "Understanding the VCL Message Mechanism in Depth with BCB". C++ View, No. 1.
6. Dale Rogerson. Inside COM. Microsoft Press. 1997. Chinese translation: Yang Xiuzhang. Tsinghua University Press. 1999.
7. Don Box. Essential COM. Addison-Wesley, Reading, MA. 1998. Chinese translations: Hou Jie. GOTOP Information Inc. 1999. Pan Aimin. China Electric Power Press. 2001.
8. Brent E. Rector, Chris Sells. ATL Internals. Addison-Wesley, Reading, MA. 1999. Chinese translation: Pan Aimin, Xinwang. China Electric Power Press. 2001.
9. Robert C. Martin. "The Liskov Substitution Principle". C++ Report. 1996, 3. Chinese translation: Plevs, PLPLIULY. "Liskov Substitution Principle (LSP)". C++ View. 2001, 9.