Arabian Nights of VCL: Opening the Door
Foreword
If you love him, let him learn VCL, for it is paradise. If you hate him, let him learn VCL, for it is hell. - "Arabian Nights of VCL"
Legend has it that long ago there was an island kingdom between China and India. Its king married a new woman every day and had her killed after the wedding night, until the whole country lived in terror. Finally the prime minister's daughter volunteered to marry into the palace. On the first night she told a story so captivating that the king, spellbound, did not kill her the next day. From then on she told a strange tale every night, and by the thousand and first night the king had finally repented. This is the famous "One Thousand and One Nights", also known as the "Arabian Nights". Since India and China share a land border, the island in the legend must lie somewhere along the South China Sea, the Strait of Malacca, or the Indian Ocean. I happen to be living on one such island now, so I am borrowing the famous name "Arabian Nights" for this series.
In junior high school my favorite programming environment was Turbo C 2.0, and I also used Visual Basic. I soon tired of the latter: anything slightly complicated meant constantly looking up documentation to call the API, preceded by a pile of dreadful API function declarations. So I began to miss the simplicity of C. My elder brother, who liked Delphi, saw how frustrated I was and introduced me to C++Builder. Although I only half understood even simple C++ concepts such as inheritance, I started using VCL to write some odd little programs (with VCL doing the heavy lifting), gradually became familiar with VCL, and along the way learned MFC and the SDK and picked up the basics of C++. At the time I thought VCL was easy to use. In fact, VCL is quite difficult to learn well, even more troublesome than MFC.
I don't know why, but material on C++Builder is surprisingly scarce. Perhaps for that very reason, the sense of community on the C++Builder forums is especially strong. Whether back when I was a VCL beginner asking baffling questions, or now when I often drop by CSDN, the C++Builder forums have always felt very warm. Every release of C++Builder ships later than the same version of Delphi, and every time we use C++ we have to defer to Object Pascal; I think many people share this feeling. CLX has already appeared in Delphi 6, while the release of C++Builder 6 still seems far off. Will CLX replace VCL? It does not look that way, and I will come back to this later. I have also seen many posts calling for VCL to be rewritten in C++, which usually turn out to be loud thunder and little rain. Meanwhile, people abroad simply got to work, and that is how the FreeCLX project started.
MFC users are better off than VCL users. They have Microsoft's support, along with books such as Inside Visual C++, Programming Windows 95 with MFC, and MFC Internals in translation, as well as outstanding original Chinese works such as Mr. Hou Jie's "Dissecting MFC". Delphi users are also far better off than C++Builder users: there is much more good material on Delphi than on C++Builder, and nothing can be done about it.
The editor of C++ View magazine asked me to write for them, which put me in a difficult position, since both time and technical ability were a problem. It was also very hard to refuse, so I made up my mind to write a series of articles analyzing the internal principles of VCL. Being an "Arabian Nights", the series will of course be of little help to beginners and may even seem boring. Readers of these articles should be fairly familiar with VCL and have some C++ background (the stronger, the better). The series is aimed, for example, at those who want to understand the underlying mechanisms of VCL, those who hope to develop application frameworks, and those who want to rewrite VCL in C++. I would also like to exchange experience in dissecting application frameworks, so that we are not limited to VCL or MFC but can look at problems from a higher vantage point and improve together. Before discussing VCL in depth, we must first be clear about its essential nature.
Like the frameworks that come with Smalltalk and Java, VCL is part of Object Pascal; that is, there is no clear boundary between language and framework. For example, Java has the JDK, and any class you write is a subclass of java.lang.Object. VCL and Object Pascal work the same way. Admittedly, for compatibility with earlier Pascal, Object Pascal still allows a class with no parent class, but this series will not consider that case. Like most frameworks, VCL has a single-rooted structure: the hierarchy is rooted at TObject, and every VCL class other than TObject is a direct or indirect subclass of TObject. Because of the language characteristics of Object Pascal, only single inheritance is used throughout the hierarchy.
Therefore, VCL is in essence an Object Pascal class library that exposes two interfaces, Object Pascal and C++. Please keep this in mind throughout the analysis.
The articles are organized by topic, one topic at a time. Since VCL is tightly coupled to Object Pascal, the IDE, and the compiler, some assembly code will come up in the analysis. Readers who do not know assembly need not be afraid: I will explain the assembly code clearly and try to express it in C++ instead.
Many figures in the text depict the memory layout of a class, as in Figure 1. A box represents a variable, extensions at either end indicate that more variables follow, and an elliptical label describes the whole object enclosed by the dashed outline (the dashed outline is omitted in later figures).
Figure 1
Unless otherwise noted, the programs in this article can be compiled in Console Application mode (if you use VCL classes, you need to check "Use VCL").
Opening the Door
The Foolish Old Man (Yugong) opened his door and saw the Taihang and Wangwu mountains. In a fit of anger he set about moving the mountains, and in the end, fortunately, the gods helped him. Chinese people are not fond of "opening the door to see the mountain", that is, of getting straight to the point; perhaps the trait was passed down from Yugong, and we love to take the long way around. I am no exception: after all that preamble, let us now get to the point.
RTTI (Runtime Type Identification) should be familiar to everyone. The RTTI facilities of C++ are quite limited, provided mainly by typeid and dynamic_cast [1]. How those two are implemented [2] is not today's topic; what we care about is the "advanced" RTTI of VCL.
Those familiar with frameworks know that a framework often provides "advanced" RTTI functionality. I have seen the argument that Java and Object Pascal are superior to C++ because their RTTI is more "advanced". Setting aside the fact that abusing RTTI is extremely harmful, C++ macros can in fact simulate RTTI with the same functionality [3]. But for VCL classes, do you know how the RTTI mechanism works? Consider the following:
class A : public TObject
{
    ...
};
...
A *p = new A;
Why can p->ClassName() return the name "A" of class A?
Why can A::ClassName(p->ClassParent()) return the name "TObject" of A's base class?
Why...?
In fact, this is the result of the compiler working behind the scenes. Put bluntly, the compiler writes the class name somewhere in advance. The key question is how to retrieve it. Obviously there must be a pointer to this data, but where are these pointers stored?
Remember the story of "Ali Baba and the Forty Thieves"? The treasure is there; if you know the words "Open, Sesame", you can get it. Likewise, the compiler has already written out the class information for us; what we care about is how to find the "door" to that information.
But to get there we have to start from virtual functions, so let us first review the C/C++ object model.
Virtual function table VFT
The C language supports an object-based way of thinking, and its object model is very clear. For example:
struct A
{
    int i;
    char c;
};
Figure 2 Memory layout of the structure
On a 32-bit system, the variable i occupies 4 bytes and the variable c occupies 1 byte. The compiler will typically also append 3 bytes of padding at the end, so sizeof(A) is 8.
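The padding can be measured rather than assumed. The following is a minimal sketch in standard C++; the helper name tail_padding is made up for illustration, and the exact numbers depend on the compiler and platform.

```cpp
#include <cstddef>   // offsetof, std::size_t

// Same members as the structure in the text.
struct A
{
    int  i;
    char c;
};

// Number of bytes the compiler appended after the last member,
// so that every element of an array of A stays aligned for int.
inline std::size_t tail_padding()
{
    return sizeof(A) - (offsetof(A, c) + sizeof(char));
}
```

On a compiler with a 4-byte int, tail_padding() is expected to be 3, which matches the 8-byte total described above.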
C++ supports an object-oriented way of thinking, and its object model builds on that of C. For a class without virtual functions, the model is exactly the same as for a C structure. But if the class has virtual functions, somewhere in each object of the class there is a pointer, the VPTR, that points to the entry of the virtual function table (VFT). Obviously, this VPTR is the same for all objects of the same class. For example:
class A
{
private:
    int i;
    char c;
public:
    virtual void f1();
    virtual void f2();
};
class B : public A
{
public:
    virtual void f1();
    virtual void f2();
};
When we make a call such as
A *p;
...
p->f2();
the program itself does not know whether it will call A::f2, B::f2, or some other function. It simply follows the VPTR in the object to the VFT entry and looks up the function address there. Since the Borland C++ compiler places the VPTR at the head of the object, we will assume this layout below.
To explain this more thoroughly, let us analyze it at the assembly level, assuming the Borland C++ compiler. The statement
p->f2();
compiles to the following assembly code:
mov  eax, [ebp-0x04]
push eax
mov  edx, [eax]
call dword ptr [edx+0x04]
pop  ecx
Figure 3 C entity memory layout
In the first instruction, ebp-0x04 is the address of the pointer variable p, so this instruction loads the address of the object that p points to into EAX. The second instruction need not concern us. The third loads the VPTR at the head of the object into EDX, which is the VFT entry. The fourth is the key: EDX plus 4 (a pointer occupies 4 bytes on a 32-bit system) addresses the second function pointer counting from the VFT entry, which for an object of class B is B::f2. The fifth instruction need not concern us either.
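The lookup these instructions perform can be mimicked in plain C++. What follows is only a sketch, assuming a hand-built table in place of the compiler-generated VFT; the names Obj, Method, vft_A, vft_B and call_f2 are invented for illustration and belong to no Borland API.

```cpp
// Hand-written stand-in for a compiler-generated VFT (all names hypothetical).
struct Obj;
typedef int (*Method)(Obj *self);

struct Obj
{
    const Method *vptr;  // first field: pointer to the VFT entry
    int i;
};

inline int A_f1(Obj *) { return 1; }
inline int A_f2(Obj *) { return 2; }
inline int B_f1(Obj *) { return 11; }
inline int B_f2(Obj *) { return 12; }

// One table per "class", shared by all of its objects.
static const Method vft_A[] = { A_f1, A_f2 };
static const Method vft_B[] = { B_f1, B_f2 };

// Equivalent of p->f2(): load the object's vptr, take slot 1, call it.
inline int call_f2(Obj *p)
{
    const Method *vft = p->vptr;  // corresponds to: mov edx, [eax]
    return vft[1](p);             // corresponds to: call dword ptr [edx+0x04]
}
```

The caller never names A_f2 or B_f2; which one runs depends only on the table the object's first field points to.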
By now you should have a deeper understanding of the VFT and the C++ object model. VFT implementations differ from compiler to compiler. Interested readers may wish to explore the implementations in Microsoft Visual C++ and GCC and compare their differences.
Knowing the structure of the VFT, consider what the following program prints.
#include <iostream>
using namespace std;
class A
{
    int c;
    virtual void f() {}
public:
    A(int v = 0) { c = v; }
};
int main()
{
    A a, b(20);
    cout << *(void **)&a << endl;
    cout << *(void **)&b << endl;
}
You should be able to work out what *(void **)&a means: it is the value of the VPTR, that is, the first 4 bytes of the memory occupied by a, interpreted as a pointer. We will use similar expressions below.
Without doubt, the program prints two identical values. As already noted, all objects of the same class share the same VPTR value.
So what is the VFT good for? So far it seems only to store the addresses of virtual functions.
Virtual method table VMT
How can we find a class's RTTI information through an object of that class? The VFT is data shared by all objects of the same class, and so is RTTI. Putting the RTTI in the VFT is therefore a good choice.
But where exactly? From its entry onward, the VFT holds one virtual function pointer after another, so RTTI can go in only two places: before the entry, or after all the virtual function pointers. Before the entry is clearly better: we need not care how many virtual functions there are, and the position of the RTTI relative to the entry is fixed.
VCL places its RTTI this way, but the VFT gets a new name: the virtual method table, VMT. What is the structure of the VMT? The help files provided by Borland contain no relevant information, but we can find the following clues in include/vcl/system.hpp.
static const Shortint vmtSelfPtr = 0xffffffb4;
static const Shortint vmtIntfTable = 0xffffffb8;
static const Shortint vmtAutoTable = 0xffffffbc;
static const Shortint vmtInitTable = 0xffffffc0;
static const Shortint vmtTypeInfo = 0xffffffc4;
static const Shortint vmtFieldTable = 0xffffffc8;
static const Shortint vmtMethodTable = 0xffffffcc;
static const Shortint vmtDynamicTable = 0xffffffd0;
static const Shortint vmtClassName = 0xffffffd4;
static const Shortint vmtInstanceSize = 0xffffffd8;
static const Shortint vmtParent = 0xffffffdc;
static const Shortint vmtSafeCallException = 0xffffffe0;
static const Shortint vmtAfterConstruction = 0xffffffe4;
static const Shortint vmtBeforeDestruction = 0xffffffe8;
static const Shortint vmtDispatch = 0xffffffec;
static const Shortint vmtDefaultHandler = 0xfffffff0;
static const Shortint vmtNewInstance = 0xfffffff4;
static const Shortint vmtFreeInstance = 0xfffffff8;
static const Shortint vmtDestroy = 0xfffffffc;
Note that the negative constants here are written in two's complement form. To find the two's complement representation of a negative number, write the binary form of the corresponding positive number, invert every bit, and add 1 at the lowest bit. Conversely, to read a 32-bit two's complement negative number, subtract 0xffffffff from it and then subtract 1. Taking 0xfffffffc as an example, 0xfffffffc - 0xffffffff - 1 = -0x04. The same values can be found in the source file source/vcl/system.pas provided by Borland, where they are written directly as negative numbers.
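The rule is easy to check mechanically. Here is a small sketch in standard C++ that applies exactly the "subtract 0xffffffff, then subtract 1" recipe; the function name as_signed32 is made up for this illustration.

```cpp
#include <cstdint>

// Read a 32-bit pattern as a signed offset, using the rule from the text.
inline std::int64_t as_signed32(std::uint32_t bits)
{
    // Patterns with the top bit set are negative:
    // subtract 0xffffffff, then subtract 1.
    if (bits & 0x80000000u)
        return static_cast<std::int64_t>(bits) - 0xffffffffLL - 1;
    return static_cast<std::int64_t>(bits);
}
```

Applied to the constants above, as_signed32(0xfffffffc) gives -4 and as_signed32(0xffffffd4) gives -44, so each slot sits that many bytes before the VMT entry.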
Looking at this table, we can guess the rough layout from the constant names. The interval between consecutive values is always 4, which suggests that each slot holds a pointer, either a function pointer or a data pointer. The names reveal the purpose of each slot; vmtClassName, for example, is naturally a pointer to the class name. What lies before entry offset 0 is the critical data of a VCL class. Undoubtedly these slots hold the key secrets of TObject and indeed of all VCL objects, namely the layout of the VMT.
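The idea of keeping class data at negative offsets from the table entry can be tried out without VCL. The sketch below is an illustration only: ClassRecord, Instance, record_of and class_name_of are invented names, and the layout is a simplified stand-in for the real VMT, not a description of it.

```cpp
#include <cstddef>   // offsetof
#include <string>

typedef void (*Slot)();            // stand-in for a virtual method pointer

// A made-up class record: metadata first, then the method slots.
struct ClassRecord
{
    const char        *class_name; // lies at a negative offset from slots
    const ClassRecord *parent;     // likewise
    Slot               slots[2];   // offset 0 as seen from an object
};

struct Instance
{
    const Slot *entry;             // points at ClassRecord::slots, not at the record start
};

// Step back from the entry to the start of the record.
inline const ClassRecord *record_of(const Instance &obj)
{
    const char *entry = reinterpret_cast<const char *>(obj.entry);
    return reinterpret_cast<const ClassRecord *>(entry - offsetof(ClassRecord, slots));
}

inline std::string class_name_of(const Instance &obj)
{
    return record_of(obj)->class_name;
}
```

Because the metadata sits before the slots, adding more method slots to a record never moves the name or the parent pointer relative to the entry, which is the advantage argued for above.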
This is only speculation so far, and we should verify it. What we do know is that every object must contain information about the class it belongs to. For example, any object of a C++ class with virtual functions contains a pointer to its VFT. An object of a VCL class must likewise contain a pointer to its VMT.
#include <vcl.h>
#include <iostream>
using namespace std;
class A : public TObject
{
    int x;
    virtual void f1() {}
    virtual void f2() {}
public:
    A(int v = 0) : x(v) {}
};
int main()
{
    A *p = new A, *q = new A(100);
    void *a = *(void **)p, *b = *(void **)q;
    void *c = p->ClassType(), *d = q->ClassType();
    cout << a << ' ' << b << endl;
    cout << c << ' ' << d << endl;
    cout << __classid(A) << endl;
    delete p;
    delete q;
}
The result is very interesting: all five pointers printed are exactly the same! That a and b are equal we know from the previous example. But the return values of TObject's ClassType method and of the __classid operator are the same as well, which is telling. __classid is an extension keyword in C++Builder that returns the entry address of a class's VMT; TObject's ClassType method returns the class information of an object, and its return type is TClass (that is, TMetaClass *). This shows that the pointer at the head of every VCL object holds the entry address of the VMT, and that this address is exactly what TObject::ClassType and the __classid operator return, in the form of a TClass (TMetaClass *).
Figure 4 VMT entrance to the VCL class
We already know the structure of the VMT and have now found its entrance. The excitement is no less than that of learning the spell "Open, Sesame". Now that we know the spell that opens the door, why not hurry in and take the treasure?
A First Small Trial
Riding this momentum, let us simulate the simple RTTI functionality of VCL. For convenience, we imitate TObject and write a class FObject (if you read TObject as "True Object", then our FObject is the "False Object"). Where does this code come from, you ask? Most of it is copied and pasted from the file include/vcl/systobj.h.
class FObject
{
public:
    FObject(); /* Body provided by VCL {} */
    void Free();
    TClass ClassType();
    void CleanupInstance();
    void *FieldAddress(const ShortString &Name);
    /* class method */
    static TObject *InitInstance(TClass cls, void *instance);
    static ShortString ClassName(TClass cls);
    static bool ClassNameIs(TClass cls, const AnsiString string);
    static TClass ClassParent(TClass cls);
    static void *ClassInfo(TClass cls);
    static long InstanceSize(TClass cls);
    static bool InheritsFrom(TClass cls, TClass aClass);
    static void *MethodAddress(TClass cls, const ShortString &Name);
    static ShortString MethodName(TClass cls, void *Address);
    /* Hack: GetInterface is an untyped out object parameter and
     * so is mangled as a void*. In practice, however, it is
     * really a void**. Be sure when using this method to provide
     * two levels of indirection and cast away one of them.
     */
    bool GetInterface(const TGUID &IID, /* out */ void *Obj);
    /* class method */
    static PInterfaceEntry GetInterfaceEntry(const TGUID IID);
    static PInterfaceTable *GetInterfaceTable(void);
    /* Instance-level wrappers: each forwards to the class method */
    ShortString ClassName()
    {
        return ClassName(ClassType());
    }
    bool ClassNameIs(const AnsiString string)
    {
        return ClassNameIs(ClassType(), string);
    }
    TClass ClassParent()
    {
        return ClassParent(ClassType());
    }
    void *ClassInfo()
    {
        return ClassInfo(ClassType());
    }
    long InstanceSize()
    {
        return InstanceSize(ClassType());
    }
    bool InheritsFrom(TClass aClass)
    {
        return InheritsFrom(ClassType(), aClass);
    }
    void *MethodAddress(const ShortString &Name)
    {
        return MethodAddress(ClassType(), Name);
    }
    ShortString MethodName(void *Address)
    {
        return MethodName(ClassType(), Address);
    }
    virtual HRESULT SafeCallException(TObject *, void *);
    virtual void AfterConstruction();
    virtual void BeforeDestruction();
    virtual void Dispatch(void *Message);
    virtual void DefaultHandler(void *Message);
private:
    virtual TObject *NewInstance(TClass cls);
public:
    virtual void FreeInstance();
    virtual ~FObject(); /* Body provided by VCL {} */
};
Of course, FObject::ClassType is one we can already write:
TClass FObject::ClassType()
{
    return *(TClass *)this;
}
We will fill in the rest later. Let us begin with one example: retrieving the class name (ClassName).
Looking at the VMT constants, vmtClassName = 0xffffffd4, so we start from there. The main steps are:
1. Find the entry of the VMT.
2. Through vmtClassName, find the address where the class name is stored.
3. Read the class name.
0xffffffd4 is equivalent to -44. That is, the 4 bytes located 44 bytes before the address of the VMT entry hold a pointer that points to the class name. If the entry address is cls, then the slot identified by vmtClassName is at (char *)cls - 44, that is, (char *)cls + vmtClassName. Note that since VCL is written in Object Pascal, the class name is stored in the traditional Pascal format: the first byte is the length of the string, followed by the actual characters. In C++Builder the corresponding type is ShortString.
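The length-prefixed format just described can be decoded in a few lines of standard C++. This is a sketch for illustration; from_pascal is a made-up helper and not part of VCL, where ShortString does this work for you.

```cpp
#include <string>

// Convert a length-prefixed (Pascal-style) string to std::string.
// p points at the length byte; the characters follow it directly.
inline std::string from_pascal(const unsigned char *p)
{
    const unsigned char len = p[0];
    return std::string(reinterpret_cast<const char *>(p + 1), len);
}
```

Only the first byte decides how many characters are read, so no terminating zero is needed and any bytes after the counted characters are ignored.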
Figure 5 How TObject::ClassName works
The code is as follows:
ShortString FObject::ClassName(TClass cls)
{
    ShortString *r = *(ShortString **)((char *)cls + vmtClassName);
    return *r;
}
Let us test it.
#include <vcl.h>
#include <iostream>
#include <memory>
using namespace std;
... insert the FObject code here ...
int main()
{
    auto_ptr<TList> list(new TList);
    FObject *p = (FObject *)list.get();
    cout << AnsiString(p->ClassName()).c_str() << endl;
    cout << AnsiString(list->ClassName()).c_str() << endl;
}
As expected, both lines print "TList".
The function ClassNameIs is then easy to complete.
bool FObject::ClassNameIs(TClass cls, const AnsiString string)
{
    return string == ClassName(cls);
}
Some readers may wonder how I know that TObject::ClassName works this way. There are three ways:
1. Guess, speculating from experience.
2. Read the source code provided by Borland.
3. Read the assembly produced by the compiler.
In the source code provided by Borland, the implementation of TObject.ClassName is as follows:
class function TObject.ClassName: ShortString;
asm
        { ->    EAX VMT                         }
        {       EDX Pointer to result string    }
        PUSH    ESI
        PUSH    EDI
        MOV     EDI,EDX
        MOV     ESI,[EAX].vmtClassName
        XOR     ECX,ECX
        MOV     CL,[ESI]
        INC     ECX
        REP     MOVSB
        POP     EDI
        POP     ESI
end;
Readers familiar with assembly can write the corresponding C/C++ code from this. Those who are not should, with the explanation above, also manage it without much trouble.
I hope that while reading this series you will try the first method first and then check your guess against methods 2 and 3; you are sure to gain something.
A Brief Taste
What follows is simple enough that we will not walk through every member function. You may wish to write them yourself, explore, and compare your code with the source code and with the code in this article; it is bound to be fun.
What is TObject::ClassInfo? Don't ask me; I do not know either. According to the VCL help, ClassInfo gives access to the RTTI table containing information about the object's type, its ancestors, and all its published properties. The table is for internal use only, and TObject provides other methods for accessing RTTI information. Let us write its implementation.
Figure 6 How TObject::ClassInfo works
void *FObject::ClassInfo(TClass cls)
{
    return *(void **)((char *)cls + vmtTypeInfo);
}
Is that all Borland says? The return type of this function is void *, which makes it plain that they are unwilling to disclose more. Try testing it the same way as ClassName above. For TList, ClassInfo returns 0, a null pointer! What does that mean? Don't worry: we will lift the veil on this void * later; for now, let me keep you in suspense.
There is only single inheritance in the VCL framework, as dictated by the Object Pascal language. Every class therefore has exactly one parent class, and the function TObject::ClassParent finds it for you.
TClass FObject::ClassParent(TClass cls)
{
    TClass *r = *(TClass **)((char *)cls + vmtParent);
    return r ? *r : 0;
}
With it we can also easily simulate TObject::InheritsFrom.
bool FObject::InheritsFrom(TClass cls, TClass aClass)
{
    // Walk up from cls; stop when the root's parent (null) is reached.
    while (cls)
    {
        if (cls == aClass) return true;
        cls = ClassParent(cls);
    }
    return false;
}
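The same parent-chain walk can be exercised without VCL. Below is a minimal sketch in standard C++, assuming a made-up Meta record in place of TMetaClass; the names Meta and inherits_from are hypothetical.

```cpp
// Stand-in for class metadata: each record knows only its parent.
struct Meta
{
    const char *name;
    const Meta *parent;   // null for the root class
};

// True if cls is ancestor itself or derives from it.
inline bool inherits_from(const Meta *cls, const Meta *ancestor)
{
    for (; cls; cls = cls->parent)
        if (cls == ancestor)
            return true;
    return false;
}
```

The loop is driven by the class being tested, so it always terminates at the root, whose parent is null.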
To find the number of bytes an object occupies, use TObject::InstanceSize.
long FObject::InstanceSize(TClass cls)
{
    return *(long *)((char *)cls + vmtInstanceSize);
}
Some readers may ask: doesn't C++ already have the sizeof operator? Why not use it? With VCL, sizeof has two shortcomings. First, sizeof is entirely static: wherever you write sizeof(...), the compiler replaces it with a constant, so it cannot be evaluated dynamically. Second, VCL classes must be used through pointers or references. Therefore, given
TObject *a;
...
sizeof(*a)
the expression sizeof(*a) is wrong. And even if TObject were not a VCL class, sizeof(*a) would simply equal sizeof(TObject), which has no practical value.
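The static nature of sizeof is easy to demonstrate in standard C++. The sketch below uses two made-up classes, Base and Derived, and a hypothetical virtual function instance_size that plays the role InstanceSize plays in VCL; it is not how VCL itself computes the size.

```cpp
#include <cstddef>

struct Base
{
    int x;
    virtual ~Base() {}
    // Dynamic counterpart of sizeof: each class reports its own size.
    virtual std::size_t instance_size() const { return sizeof(*this); }
};

struct Derived : Base
{
    double extra[4];
    virtual std::size_t instance_size() const { return sizeof(*this); }
};

// sizeof looks only at the static type of the expression.
inline std::size_t static_size(const Base &b) { return sizeof(b); }
```

Given a Derived object referred to through a Base reference, static_size reports sizeof(Base), while instance_size() reports sizeof(Derived).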
Conclusion
Now we have opened the door to the secrets of VCL classes. Looking back, what is the difference between the VMT and the VFT? The VMT can be seen as a variation on the VFT: it is a "specification" built on top of the VFT, and all VCL classes follow this "specification". This closely resembles the relationship between COM and C++ pure virtual classes.
By placing some important information in the VMT, VCL implements RTTI. So the "advanced" RTTI functionality actually rests on quite low-level and simple techniques. There are roughly three ways to implement it. MFC relies on macros, which conforms fully to the C++ standard: it needs no language extensions and does not depend on a specific compiler, but it feels bloated. VCL's RTTI is implemented entirely by the compiler and extends the C++ language, so it must be compiled with Borland's compiler, but it is very simple to write. Finally Qt [4], the cross-platform foundation of KDE, takes a compromise approach: it extends C++ with new keywords, so code is simple to write, but before compiling you must run the MOC program supplied by Qt, which rewrites the extended parts into standard C++ that can then be compiled with any standard C++ compiler.
Representative  Implementation         Independent of a specific compiler  Simplicity   Compilation passes
MFC             macro insertion        yes                                 fair         1
VCL             compiler-generated     no                                  good         1
Qt              precompiler program    yes                                 fairly good  2 (including MOC)
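To give a feel for the macro-based approach, here is a small sketch in standard C++. It is written in the general spirit of macro-driven frameworks but is not MFC's actual code: RuntimeInfo, DECLARE_INFO, IMPLEMENT_INFO, Shape and Circle are all names invented for this illustration.

```cpp
#include <cstring>

// Per-class record: the name plus a link to the base class record.
struct RuntimeInfo
{
    const char        *name;
    const RuntimeInfo *base;
};

// Hypothetical macros: one goes inside the class, one at namespace scope.
#define DECLARE_INFO() \
    static const RuntimeInfo info; \
    virtual const RuntimeInfo *GetInfo() const { return &info; }

#define IMPLEMENT_INFO(cls, base_ptr) \
    const RuntimeInfo cls::info = { #cls, base_ptr };

struct Shape
{
    virtual ~Shape() {}
    DECLARE_INFO()
};

struct Circle : Shape
{
    DECLARE_INFO()
};

IMPLEMENT_INFO(Shape, 0)
IMPLEMENT_INFO(Circle, &Shape::info)
```

Everything here is ordinary C++, which is why this style works with any compiler; the price is that every class must repeat the two macros by hand.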
Note: If you have long developed on the Windows platform, you may not have heard of Qt. In the Linux world, however, it is a very well-known name. Qt is a complete C++ framework that runs across UNIX/Linux, Windows, and Mac OS, and its internal mechanisms are quite interesting. CLX, the cross-platform framework in Borland's latest Kylix and Delphi 6 (it consists of four parts, BaseCLX, VisualCLX, DataCLX, and NetCLX, where BaseCLX is the handful of classes at the top of VCL), builds its VisualCLX part on Qt, which leaves me somewhat disappointed and dissatisfied. Qt is itself cross-platform, so VisualCLX built on Qt is naturally cross-platform too. But CLX wraps Qt in Object Pascal, and I dare not imagine whether the CLX in C++Builder 6 will use C++ to wrap an Object Pascal framework that itself wraps a C++ framework. If so, its efficiency and debugging difficulty...
As for the form the "advanced" RTTI takes, VCL uses TMetaClass (TClass is TMetaClass *) to hold a class's information, the so-called "class of a class", which is a very common approach. The same is true of CObject in MFC and of java.lang.Object and java.lang.Class in the JDK. For example, given a TObject *p, how do we get the name of its parent class? We must go through TMetaClass: first call p->ClassParent() to obtain the parent class (of type TMetaClass *), then pass that as the argument to TObject::ClassName, that is, TObject::ClassName(p->ClassParent()).
At the same time, we should also dispel the myth that "a language with more advanced built-in RTTI is a more advanced language". At least in my limited experience, unless a framework needs to cooperate with an IDE, RTTI is entirely unnecessary and even harmful [5]. Those who use and design frameworks would do well to think this over.
Acknowledgments
I am very grateful to Mengyan and Sun Chunyang for their attention to this article.
References
1, 2. "Inside the C++ Object Model" (Chinese edition). A & Sifeng Information Co., Ltd. 1998. Hou Jie. "Inside the C++ Object Model" (Chinese edition). Huazhong University of Science and Technology Press. 2001.
3. Hou Jie. "Dissecting MFC", 2nd ed. Songgang Computer Graph Information Co., Ltd. / Huazhong University of Science and Technology Press. 1997/2001.
4. Pneese. "Qt latest news". C++ View. 2001, 7.
5. Robert C. Martin. "The Open-Closed Principle". C++ Report. 1996, 1. PlPliuly, insects. "Open Closed Principle OCP". C++ View. 2001, 8.