Analysis of .NET Framework's extension of PE file format
Webcrazy (http://webcrazy.yeah.net)
The Microsoft .NET Framework has been out for a little while now, and I have been working with it since Beta 1. This article starts from a small PE file generated by .NET to examine how the .NET Framework extends the PE file format. The extension exists so that Windows can recognize the Common Language Runtime (CLR).
PE is the executable file format of the Windows family of operating systems. This article assumes that you already know the format reasonably well, and it discusses neither the earlier Win16 formats nor the later Win64 one. Before the CLR appeared, a PE file consisted only of the PE header and the native image (as opposed to the CLR header and CLR data described below). The native image is made up of sections such as .text, .data, and .rdata. Note that the naming rules for PE sections do not require a name to begin with a period. That is merely Microsoft's default convention for code and data sections; other compilers, such as Borland's, name them CODE, DATA, and so on. The native image contains machine code compiled for the target processor.
After the CLR appeared, the PE file gained an additional part that supports the CLR: the CLR header and the CLR data. The CLR header is defined by the IMAGE_COR20_HEADER structure in CorHdr.h of the .NET Framework SDK. The names CorHdr.h and IMAGE_COR20_HEADER come from COR, short for COM Runtime, which shows that the development of the .NET Framework has its origins in COM. In fact, IMAGE_COR20_HEADER is also defined in the WinNT.h of the Platform SDK. In the WinNT.h shipped with Windows XP DDK build 2505, the comment on this definition reads "COM 2.0 header structure", whereas in the .NET Framework SDK it has been changed to "CLR 2.0 header structure". The CLR data includes the .NET metadata, the IL method bodies, and more. Metadata and IL methods are key terms in .NET. IL is short for Microsoft Intermediate Language. It was introduced to give .NET its cross-platform, cross-language character, and it has its own instruction set. The file opcode.def in the .NET SDK lists the supported instructions. The encoding of these instructions resembles Intel's x86 instruction set in that some opcodes are two bytes long, introduced by a prefix byte.
Below, I use a small C# console program to briefly describe how a PE file generated by the C# compiler is executed and how it is laid out on disk. The program simply prints "hi", as shown below:
public class App {
    static public void Main(System.String[] args) {
        System.Console.WriteLine("hi");
    }
}
We compile it with csc /out:app.exe app.cs. Like a PE file produced by a traditional compiler, the PE file generated by .NET begins with IMAGE_DOS_HEADER. This part exists so that early DOS, on encountering a PE file, can still recognize it as a valid executable and run its DOS stub. IMAGE_DOS_HEADER, like the other structures discussed below, is defined in detail in WinNT.h. The Windows OS loader uses the e_lfanew member of IMAGE_DOS_HEADER to locate the IMAGE_NT_HEADERS that follow, which is defined as:
typedef struct _IMAGE_NT_HEADERS {
    DWORD Signature;
    IMAGE_FILE_HEADER FileHeader;
    IMAGE_OPTIONAL_HEADER32 OptionalHeader;
} IMAGE_NT_HEADERS32, *PIMAGE_NT_HEADERS32;
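The loader's first step, following e_lfanew to the NT headers, can be sketched in a few lines of C. This is only an illustration working on a file already read into memory: the helper names read_u32le and find_nt_headers are hypothetical, and it relies on e_lfanew being the last field of IMAGE_DOS_HEADER, at offset 0x3C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Return the file offset of IMAGE_NT_HEADERS, or -1 if the buffer does not
 * look like a PE file. (Hypothetical helper, not part of any SDK.)
 */
static long find_nt_headers(const unsigned char *image, size_t size)
{
    assert(image != NULL);

    /* IMAGE_DOS_HEADER is 0x40 bytes long and starts with "MZ". */
    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return -1;

    /* e_lfanew lives at offset 0x3C of the DOS header. */
    uint32_t e_lfanew = read_u32le(image + 0x3C);
    if (e_lfanew > size || size - e_lfanew < 4)
        return -1;

    /* The Signature member of IMAGE_NT_HEADERS must be "PE\0\0". */
    if (memcmp(image + e_lfanew, "PE\0\0", 4) != 0)
        return -1;

    return (long)e_lfanew;
}
```

FileHeader follows the 4-byte signature, and OptionalHeader follows the 20-byte IMAGE_FILE_HEADER.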
We know that the AddressOfEntryPoint member of IMAGE_OPTIONAL_HEADER32 is the entry point of a PE executable. It is still the entry point under .NET, which should be easy to understand. For a COMIMAGE_FLAGS_ILONLY image (indicated by the Flags member of IMAGE_COR20_HEADER), such as the app.exe we generated, this entry point jumps indirectly to the _CorExeMain function in app.exe's import table. _CorExeMain, the entry used for EXE files, is exported by mscoree.dll. mscoree.dll is located in %WINNT%\System32 and is the Microsoft .NET Runtime Execution Engine. Note that it is itself a native image, and it is responsible for invoking the .NET token that is the real entry point of the IL code.
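To make the "indirect" jump concrete, here is a small C sketch that inspects the bytes found at AddressOfEntryPoint. It rests on an assumption that the article does not verify: that on x86 the stub is the six-byte indirect jump FF 25 followed by the absolute address of the import slot for _CorExeMain. The function name decode_entry_stub is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Check whether `code` (the bytes at AddressOfEntryPoint) starts with the
 * x86 instruction "jmp dword ptr [abs32]", encoded as FF 25 <abs32>.
 * On success, store the absolute address of the slot being jumped through
 * and return 1; otherwise return 0.
 * Assumption: the stub has this shape; other targets may differ.
 */
static int decode_entry_stub(const unsigned char *code, size_t size,
                             uint32_t *slot_va)
{
    assert(code != NULL && slot_va != NULL);

    if (size < 6 || code[0] != 0xFF || code[1] != 0x25)
        return 0;

    /* The operand is a little-endian 32-bit virtual address. */
    *slot_va = (uint32_t)code[2] | ((uint32_t)code[3] << 8) |
               ((uint32_t)code[4] << 16) | ((uint32_t)code[5] << 24);
    return 1;
}
```

Subtracting the image base from the returned address gives an RVA that should fall inside the import address table if the assumption holds.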
Locating the parts of the native image is well documented, and WinNT.h contains detailed definitions, so I will describe it only briefly:
The tables that live inside sections such as .text and .data are located through the DataDirectory member of IMAGE_OPTIONAL_HEADER32 (the sections themselves are described by the section table that follows the optional header). DataDirectory is an array of IMAGE_DATA_DIRECTORY with IMAGE_NUMBEROF_DIRECTORY_ENTRIES elements (currently 16). The purpose of each element is given by an IMAGE_DIRECTORY_ENTRY_* constant, such as EXPORT, IMPORT, and so on. Because IMAGE_DATA_DIRECTORY consists of a VirtualAddress (an RVA) and a Size, we can easily find where this data is. The CLR header is located in the same way, through its own DataDirectory entry. Its index is 14, named IMAGE_DIRECTORY_ENTRY_COMHEADER in the CorHdr.h of the .NET Framework SDK v1 and IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR in the WinNT.h of DDK 2505. The relevant fields of our generated app.exe look like this (the value in parentheses is the field's offset within the optional header):
.
.
.
AddressOfEntryPoint: 0x000022CE (+0x10)
.
.
.
DataDirectory[0] - IMAGE_DIRECTORY_ENTRY_EXPORT
    VirtualAddress: 0x00000000 (+0x60)
    Size: 0x00000000 (+0x64)
DataDirectory[1] - IMAGE_DIRECTORY_ENTRY_IMPORT
    VirtualAddress: 0x0000227C (+0x68)
    Size: 0x0000004F (+0x6C)
.
.
.
DataDirectory[14] - IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR
    VirtualAddress: 0x00002008 (+0xD0)
    Size: 0x00000048 (+0xD4)
.
.
.
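The offsets in parentheses follow a simple rule: DataDirectory starts at offset 0x60 of IMAGE_OPTIONAL_HEADER32, and each IMAGE_DATA_DIRECTORY is 8 bytes. A minimal C sketch of that arithmetic follows; the enum names, the struct data_directory type, and the helper functions are all illustrative and do not come from WinNT.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    DATA_DIRECTORY_OFFSET = 0x60,       /* DataDirectory in the 32-bit optional header */
    DATA_DIRECTORY_ENTRY_SIZE = 8,      /* VirtualAddress (4) + Size (4) */
    NUMBER_OF_DIRECTORY_ENTRIES = 16,   /* IMAGE_NUMBEROF_DIRECTORY_ENTRIES */
    DIRECTORY_ENTRY_COM_DESCRIPTOR = 14 /* index of the CLR header entry */
};

/* Illustrative stand-in for IMAGE_DATA_DIRECTORY. */
struct data_directory {
    uint32_t virtual_address; /* RVA of the data */
    uint32_t size;            /* size of the data in bytes */
};

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t read_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Offset of DataDirectory[index] from the start of the optional header. */
static size_t directory_entry_offset(unsigned index)
{
    assert(index < NUMBER_OF_DIRECTORY_ENTRIES);
    return DATA_DIRECTORY_OFFSET + (size_t)index * DATA_DIRECTORY_ENTRY_SIZE;
}

/* Read DataDirectory[index]; opt_header must hold the full 0xE0-byte header. */
static struct data_directory read_directory(const unsigned char *opt_header,
                                            unsigned index)
{
    const unsigned char *p = opt_header + directory_entry_offset(index);
    struct data_directory d = { read_u32le(p), read_u32le(p + 4) };
    return d;
}
```

For index 14 the offset works out to 0x60 + 14 * 8 = 0xD0, matching the dump.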
So, from DataDirectory[14] we can easily locate the CLR header. Because it is read-only, the CLR header can be merged into another section. As mentioned earlier, the CLR header is defined by the IMAGE_COR20_HEADER structure:
// CLR 2.0 header structure.
typedef struct IMAGE_COR20_HEADER
{
    // Header versioning
    ULONG cb;
    USHORT MajorRuntimeVersion;
    USHORT MinorRuntimeVersion;
    // Symbol table and startup information
    IMAGE_DATA_DIRECTORY MetaData;
    ULONG Flags;
    ULONG EntryPointToken;
    // Binding information
    IMAGE_DATA_DIRECTORY Resources;
    IMAGE_DATA_DIRECTORY StrongNameSignature;
    // Regular fixup and binding information
    IMAGE_DATA_DIRECTORY CodeManagerTable;
    IMAGE_DATA_DIRECTORY VTableFixups;
    IMAGE_DATA_DIRECTORY ExportAddressTableJumps;
    // Precompiled image info (internal use only - set to zero)
    IMAGE_DATA_DIRECTORY ManagedNativeHeader;
} IMAGE_COR20_HEADER;
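Reading this structure out of a raw file is mostly a matter of field offsets. The C sketch below pulls out the first few fields from a byte buffer that already points at the CLR header. The offsets are derived from the field order and sizes above (ULONG is 4 bytes, USHORT is 2, IMAGE_DATA_DIRECTORY is 8), and the names clr_header_summary and parse_clr_header are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLR_HEADER_SIZE 0x48u       /* 4 + 2 + 2 + 8 + 4 + 4 + 6 * 8 bytes */
#define FLAG_ILONLY     0x00000001u /* value of COMIMAGE_FLAGS_ILONLY */

/* Illustrative subset of IMAGE_COR20_HEADER. */
struct clr_header_summary {
    uint32_t cb;
    uint16_t major_runtime_version;
    uint16_t minor_runtime_version;
    uint32_t metadata_rva;
    uint32_t metadata_size;
    uint32_t flags;
    uint32_t entry_point_token;
};

static uint16_t read_u16le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return 1 and fill *out if the buffer holds a plausible CLR header. */
static int parse_clr_header(const unsigned char *p, size_t size,
                            struct clr_header_summary *out)
{
    assert(p != NULL && out != NULL);

    if (size < CLR_HEADER_SIZE)
        return 0;
    out->cb = read_u32le(p + 0);
    if (out->cb < CLR_HEADER_SIZE)
        return 0;

    out->major_runtime_version = read_u16le(p + 4);
    out->minor_runtime_version = read_u16le(p + 6);
    out->metadata_rva          = read_u32le(p + 8);  /* MetaData.VirtualAddress */
    out->metadata_size         = read_u32le(p + 12); /* MetaData.Size */
    out->flags                 = read_u32le(p + 16);
    out->entry_point_token     = read_u32le(p + 20);
    return 1;
}
```

Testing flags against FLAG_ILONLY then tells you whether the image is IL-only.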
The Flags and EntryPointToken members of this structure were mentioned above. With so many IMAGE_DATA_DIRECTORY members, the definition closely resembles IMAGE_OPTIONAL_HEADER32, which reveals the essence of the two headers. The PE header is used by the Windows OS loader to locate .text and the like, while the CLR header is used to locate the .NET CLR data, such as MetaData, Resources, and StrongNameSignature. The difference is that IMAGE_COR20_HEADER is processed by _CorExeMain in mscoree.dll (for EXE files), because MSIL code can only be executed after it has been JIT-compiled into machine code.
Although EntryPointToken, like AddressOfEntryPoint, is an entry point, the two are very different. AddressOfEntryPoint is an RVA that gives the address at which execution starts (relative to the image base). It can only point to native machine code that loads the .NET runtime (such as the jump to _CorExeMain in mscoree.dll), and it may be set to 0. EntryPointToken is just a .NET token. A token uniquely identifies a .NET metadata item and is a DWORD value. Its highest 8 bits indicate the kind of token, as defined in CorHdr.h: for example, mdtMethodDef is 0x06000000 and mdtEvent is 0x14000000. The remaining 24 bits uniquely identify the item within that kind. EntryPointToken can only refer to a method, not an event or anything else. The EntryPointToken of app.exe is 0x06000001, which corresponds to the Main method. You can verify this with ildasm.exe (supplied with the .NET Framework SDK). The CLR header of app.exe is as follows (only some of the non-empty fields are shown):
Size: 0x00000048
MajorRuntimeVersion: 0x0002
MinorRuntimeVersion: 0x0000
MetaData
    VirtualAddress: 0x0000207C
    Size: 0x00000200
Flags: 0x00000001
    COMIMAGE_FLAGS_ILONLY
EntryPointToken: 0x06000001
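The split of a token into its kind (top 8 bits) and its identifier (low 24 bits) is plain bit masking. The C sketch below applies it to the EntryPointToken shown above. The macro and function names are illustrative, "row id" is my own label for the low 24 bits, and only the two type values quoted from CorHdr.h in the text are used.

```c
#include <assert.h>
#include <stdint.h>

#define TOKEN_TYPE_METHOD_DEF 0x06000000u /* mdtMethodDef */
#define TOKEN_TYPE_EVENT      0x14000000u /* mdtEvent */

/* Top 8 bits: which kind of metadata item the token refers to. */
static uint32_t token_type(uint32_t token)
{
    return token & 0xFF000000u;
}

/* Low 24 bits: the identifier of the item within that kind. */
static uint32_t token_row_id(uint32_t token)
{
    return token & 0x00FFFFFFu;
}

/* Combine a type value and a row id back into a token. */
static uint32_t make_token(uint32_t type, uint32_t row_id)
{
    assert((type & 0x00FFFFFFu) == 0);
    assert(row_id <= 0x00FFFFFFu);
    return type | row_id;
}

/* According to the text, an entry point token must be a method. */
static int is_method_def(uint32_t token)
{
    return token_type(token) == TOKEN_TYPE_METHOD_DEF;
}
```

For 0x06000001 this yields the mdtMethodDef type and row id 1, the method that the text identifies as Main.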
The .NET metadata is located through the MetaData member. Microsoft gives the on-disk layout of IL methods (IMAGE_COR_ILMETHOD) in CorHdr.h. The MetaInfo sample supplied with the .NET Framework SDK can also be used to analyze metadata. The Class Browser ASP.NET sample that comes with the QuickStart tutorials is very good material for learning the .NET Framework as well. MetaInfo uses conventional COM interfaces, whereas Class Browser uses the .NET Framework's System.Reflection namespace. The QuickStart covers .NET, Web Services, Web Forms, XML, and more. It truly lives up to its name, and .NET looks like a direction worth learning.
Finally, I should say that .NET, which I have only recently come into contact with, feels very refreshing. I wrote this article in the spirit of learning, so please treat it as study notes shared for discussion. If you find mistakes in the text or have suggestions, please contact TSU00@263.net.