Why can't C++ compilers support separate compilation of templates?

zhaozj2021-02-16 101

Liu Weipeng (PONGBA)

The Louvre of C++ (http://blog.9cbs.net/pongba)

First, a compilation unit is a .cpp file together with all the .h files it #includes. The code in each .h file is expanded into the .cpp file that includes it, and the compiler then compiles that .cpp file into an .obj file (assuming the Win32 platform). The .obj file uses a format closely related to PE (Portable Executable, the Windows executable format; strictly speaking, object files are COFF, on which PE is based) and already contains binary code, but it is not necessarily executable, because nothing guarantees that it contains a main function. Once the compiler has compiled every .cpp file in a project separately, the linker links the resulting .obj files into a single .exe file.

For example:

//--------------test.h----------------------

void f(); // declares a function f

//--------------test.cpp--------------------

#include "test.h"

void f()
{
    // ... do something
} // defines the function f declared in test.h

//--------------main.cpp--------------------

#include "test.h"

int main()
{
    f(); // call f; f has external linkage
}

In this example, test.cpp and main.cpp are compiled into separate .obj files (call them test.obj and main.obj). main.cpp calls f, but while compiling main.cpp the compiler knows only what main.cpp includes from test.h: the declaration void f();. The compiler therefore treats f as a symbol with external linkage, meaning that its implementation lives in some other .obj file, here test.obj. In other words, main.obj contains no binary code for f at all; that code exists only in test.obj, compiled from test.cpp. The call to f in main.obj produces just one call instruction, like this:

call f [in C++ this name is, of course, mangled]

At compile time this call instruction is obviously incomplete, because main.obj contains not a single line of f's implementation. So what happens? This is the linker's job: it looks for f's implementation in the other .obj files (here test.obj) and, once it finds it, replaces the target address of the call f instruction with the actual entry-point address of f. Note that the linker really does "link" the .obj files of a project into one .exe file, and its most important task is the one just described: finding the address of each external symbol in another .obj file and replacing the original placeholder address with it.

Looking a little deeper, the process is as follows:

The call f instruction is not quite what it seems. It actually targets a so-called stub, that is, a jmp 0xABCDEF. The address itself may be arbitrary; what matters is that the instruction at that address performs the real transfer to f. In other words, every call to f in this .obj file jumps to the same address, and the real "call" to f happens there. The advantage is that when the linker fixes up addresses, it only has to change the target in that one place. But how does the linker find the actual address of f (here, in test.obj)? An .obj file has much the same format as an .exe file, and such a file contains, in effect, a symbol import table and a symbol export table, which associate every symbol with its address. The linker therefore only has to look up the address of the symbol f (mangled, of course, in C++) in test.obj's export table, apply an offset (merging two .obj files naturally shifts addresses, and the linker knows by how much), and write the result into the entry that f occupies in main.obj's import table. That is the rough process. The key points are:

When compiling main.cpp, the compiler does not know the implementation of f, so when it encounters a call to f it only emits a placeholder telling the linker to find f's implementation. That is, main.obj contains not a single line of binary code for f.

When compiling test.cpp, the compiler finds the implementation of f, so f's implementation (binary code) appears in test.obj.

When linking, the linker finds the address of f's implementation (binary code) in test.obj through the symbol export table, and then changes the unresolved call XXX address in main.obj to the actual address of f. Done.
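The three steps above can be modeled with a toy linker. This is only a sketch of the idea, not how a real linker works: real object files hold machine code and relocation records, and every name here (CallSite, ObjectFile, link_objects, demo_link) and every address is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model only. All names and addresses are hypothetical.
struct CallSite {
    std::string symbol;  // name the call refers to, e.g. "f"
    int target;          // address to call; 0 is the placeholder left by the compiler
};

struct ObjectFile {
    std::string name;
    std::map<std::string, int> defined;  // symbol -> address of its code (export side)
    std::vector<CallSite> calls;         // calls to symbols defined elsewhere (import side)
};

// Patch every call site with the address of its symbol's definition.
// Returns "" on success, or the name of the first unresolved symbol.
std::string link_objects(std::vector<ObjectFile>& objs) {
    std::map<std::string, int> table;
    for (const ObjectFile& o : objs)
        table.insert(o.defined.begin(), o.defined.end());

    for (ObjectFile& o : objs) {
        for (CallSite& c : o.calls) {
            std::map<std::string, int>::const_iterator it = table.find(c.symbol);
            if (it == table.end())
                return c.symbol;    // "unresolved external symbol"
            c.target = it->second;  // replace the placeholder address
        }
    }
    return "";
}

// test.obj defines f; main.obj only calls it.
int demo_link() {
    std::vector<ObjectFile> objs;
    objs.push_back(ObjectFile{"test.obj", {{"f", 0x1000}}, {}});
    objs.push_back(ObjectFile{"main.obj", {{"main", 0x2000}}, {CallSite{"f", 0}}});
    std::string missing = link_objects(objs);
    assert(missing.empty());         // every call was resolved
    return objs[1].calls[0].target;  // the patched address of f
}
```

If test.obj defined nothing, link_objects would return "f" instead, which corresponds to the link error discussed for templates later in the article.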

For templates, however, the code of a function template cannot be compiled directly into binary code; it must first go through an "instantiation" step. For example:

//----------main.cpp--------//

template<class T>
void f(T t)
{}

int main()
{
    // ... do something
    f(10); // call f<int>; here the compiler decides to instantiate f<int>
    // ... do other things
}

That is, if main.cpp never calls f, f is never instantiated, and main.obj contains not a single line of binary code for f! If you call it like this:

f(10); // f<int> is instantiated

f(10.0); // f<double> is instantiated

then main.obj contains binary code for two functions, f<int> and f<double>, and so on for further types.

However, instantiation requires the compiler to know the template's definition, doesn't it?

Look at the following example (which separates the declaration of a template from its implementation):
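That each distinct argument type yields its own function can be observed directly. The sketch here uses a hypothetical template, count_calls, whose static counter exists once per instantiation, so calls with int and calls with double are counted separately.

```cpp
#include <cassert>

// count_calls and demo_instantiations are hypothetical names for this sketch.
// Each instantiation of count_calls<T> is a separate function with its own
// static counter.
template <class T>
int count_calls(T) {
    static int calls = 0;  // one copy per instantiation
    return ++calls;
}

int demo_instantiations() {
    int a = count_calls(10);    // instantiates count_calls<int>
    int b = count_calls(20);    // reuses count_calls<int>
    int c = count_calls(10.0);  // instantiates count_calls<double>
    assert(a == 1 && b == 2);   // the int counter advanced twice
    assert(c == 1);             // the double counter started fresh
    return a + b + c;
}
```

The compiler generates count_calls<int> and count_calls<double> only because they are called here; a type that is never used produces no code at all.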

//------------test.h-------------------

template<class T>
class A
{
public:
    void f(); // this is only a declaration
};

//--------------test.cpp --------------

#include "test.h"

template<class T>
void A<T>::f() // the template's implementation
{
    // ... do something
}

//--------------main.cpp ---------------//

#include "test.h"

int main()
{
    A<int> a;
    a.f(); // #1
}

At #1 the compiler does not know the definition of A<int>::f, because it is not in test.h. So the compiler can only hope that the linker will find an instance of A<int>::f in some other .obj file, here test.obj. But does test.obj really contain binary code for A<int>::f? No! The C++ standard clearly states that a template that is not used is not implicitly instantiated. Is A<int>::f used in test.cpp? No! So test.obj, compiled from test.cpp, contains not a single line of binary code for A<int>::f, and the dumbfounded linker has no choice but to report a link error. If, however, you write a function in test.cpp that calls A<int>::f, the compiler will instantiate it, because at that point (inside test.cpp) the compiler knows the template's definition. test.obj's symbol export table then contains the address of the symbol A<int>::f, and the linker can do its job.

The key point is that in a separate-compilation environment the compiler compiles one .cpp file at a time. While compiling one .cpp file it does not know that other .cpp files exist, and it does not go looking for them; when it meets an unresolved symbol, it leaves the problem to the linker. This model works well without templates, but it breaks down when templates are involved, because a template is instantiated only when it is needed. When the compiler sees only the template's declaration, it cannot instantiate the template; it can only create a symbol with external linkage and expect the linker to resolve that symbol's address. But when compiling the .cpp file that implements the template, the compiler sees no use of the template there and so does not bother to instantiate it. As a result, not a single line of binary code for the template instance exists anywhere in the project, and the linker is helpless.
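A related way to make sure test.obj contains the instance is an explicit instantiation definition, a standard C++ feature swapped in here for the "write a function that calls it" approach just described. The sketch collapses the three files into one, with comments marking the file boundaries. The int return type, the body of f, and the helper use_a are assumptions added so the result can be checked.

```cpp
#include <cassert>

// ---- test.h (declaration only) ----
template <class T>
class A {
public:
    int f();  // declared here, defined in test.cpp
};

// ---- test.cpp (definition plus explicit instantiation) ----
template <class T>
int A<T>::f() { return static_cast<int>(sizeof(T)); }  // hypothetical body

// Explicit instantiation definition: asks the compiler to emit the members
// of A<int> into this translation unit even though nothing here calls them.
template class A<int>;

// ---- main.cpp (would see only the declaration in test.h) ----
int use_a() {
    A<int> a;
    int size = a.f();  // in the three-file layout, the linker resolves this call
    assert(size == static_cast<int>(sizeof(int)));
    return size;
}
```

With the files actually separated, main.cpp still cannot instantiate A<int>::f itself, but the linker now finds the symbol in test.obj. Any type other than int would still fail to link unless test.cpp instantiates it as well.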

If you repost this article, please cite the original address: https://www.9cbs.com/read-24073.html
