PE learning notes
PE stands for Portable Executable. The overall layout of a PE file is:

+----------------+
| DOS MZ header  |
+----------------+
| DOS stub       |
+----------------+
| PE header      |
+----------------+
| Section table  |
+----------------+
| Section 1      |
+----------------+
| Section ...    |
+----------------+
| Section n      |
+----------------+

Part 1: PE file format
1.1, DOS MZ header: Every PE file (including 32-bit DLLs) must begin with a simple DOS MZ header. Thanks to it, if the program is run under DOS, DOS recognizes the file as a valid executable and runs the DOS stub that follows the MZ header.
1.2, DOS stub: The DOS stub is actually a valid MS-DOS .EXE or .COM program. On an operating system that does not support the PE file format, it simply calls interrupt 21h, service 9, to display the string "This program cannot run in DOS mode", or it runs complete DOS code of the programmer's own choosing. Its size is not fixed. The stub can be replaced with the linker's /STUB:filename option.
1.3, PE header: Immediately after the DOS stub comes the PE header. "PE header" is shorthand for the PE structure IMAGE_NT_HEADERS, which contains many important fields used by the PE loader. On an operating system that supports the PE file format, the PE loader finds the starting offset of the PE header from the DOS MZ header (IMAGE_DOS_HEADER). It therefore skips the DOS stub and goes directly to the real file header, the PE header.
1.4, Section table: The PE header is followed by an array of structures, the section table. If the PE file has 5 sections, the section table has 5 entries. Each entry holds the attributes of the corresponding section: file offset, virtual offset, and so on.
1.5, Sections: The real content of a PE file is divided into blocks called sections. The name of every standard section begins with a dot. Sections are ordered by their starting address, not alphabetically. Common section names and their purposes:

Section   Purpose
.arch     Alpha architecture information
.bss      Uninitialized data
.CRT      C runtime read-only data
.data     Initialized data
.debug    Debugging information
.didata   Delay-import file name table
.edata    Export file name table
.idata    Import file name table
.pdata    Exception information
.rdata    Read-only initialized data
.rsrc     Resources
.text     Executable code of the .EXE or .DLL file
.tls      Thread-local storage
.xdata    Exception handling table

Sections are divided according to the common attributes of each group of data, not according to logical concepts. Each section is a block of data with common attributes, such as code/data or read/write. Data or code in the PE file that shares the same attributes can be placed in the same section. The section name is only a label: names such as "DATA" and "CODE" exist for ease of identification, and only the attribute flags determine the characteristics and function of a section.

1.6, The main steps of loading a PE file:
1. When a PE file is executed, the PE loader reads the PE header offset from the DOS MZ header. If it finds one, it jumps to the PE header.
2. The PE loader checks that the PE header is valid. If it is, the loader skips to the end of the PE header.
3. Immediately after the PE header comes the section table. The PE loader reads the section information, maps the sections into memory using file mapping, and applies the attributes specified in the section table.
4. Once the PE file has been mapped into memory, the PE loader processes the logical parts of the PE file, such as the import table.
Part 2: DOS MZ header and PE header
2.1, The DOS MZ header is defined by the structure IMAGE_DOS_HEADER (64 bytes). The structure is defined as follows:
typedef struct _IMAGE_DOS_HEADER {  // DOS .EXE header
    WORD e_magic;      // Magic number
    WORD e_cblp;       // Bytes on last page of file
    WORD e_cp;         // Pages in file
    WORD e_crlc;       // Relocations
    WORD e_cparhdr;    // Size of header in paragraphs
    WORD e_minalloc;   // Minimum extra paragraphs needed
    WORD e_maxalloc;   // Maximum extra paragraphs needed
    WORD e_ss;         // Initial (relative) SS value
    WORD e_sp;         // Initial SP value
    WORD e_csum;       // Checksum
    WORD e_ip;         // Initial IP value
    WORD e_cs;         // Initial (relative) CS value
    WORD e_lfarlc;     // File address of relocation table
    WORD e_ovno;       // Overlay number
    WORD e_res[4];     // Reserved words
    WORD e_oemid;      // OEM identifier (for e_oeminfo)
    WORD e_oeminfo;    // OEM information; e_oemid specific
    WORD e_res2[10];   // Reserved words
    LONG e_lfanew;     // File address of new exe header
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

The e_lfanew member of IMAGE_DOS_HEADER holds the file offset of the PE header. e_magic contains the string "MZ".

2.2, The PE header is actually an IMAGE_NT_HEADERS structure, defined as follows:
typedef struct _IMAGE_NT_HEADERS {
    DWORD Signature;
    IMAGE_FILE_HEADER FileHeader;
    IMAGE_OPTIONAL_HEADER OptionalHeader;
} IMAGE_NT_HEADERS, *PIMAGE_NT_HEADERS;
IMAGE_NT_HEADERS structure members:
1. Signature: A DWORD whose value is 50h, 45h, 00h, 00h ("PE\0\0"). If the Signature field of IMAGE_NT_HEADERS equals "PE\0\0", the file is a valid PE file. Microsoft defines the constant IMAGE_NT_SIGNATURE for this purpose, alongside the other signature constants:

#define IMAGE_DOS_SIGNATURE     0x5A4D      // MZ
#define IMAGE_OS2_SIGNATURE     0x454E      // NE
#define IMAGE_OS2_SIGNATURE_LE  0x454C      // LE
#define IMAGE_VXD_SIGNATURE     0x454C      // LE
#define IMAGE_NT_SIGNATURE      0x00004550  // PE00
2. FileHeader: This field contains information about the physical layout of the PE file, such as the number of sections and the machine the file is meant to run on.
3. OptionalHeader: This field contains information about the logical layout of the PE file. Although the field is called "optional", the structure is in fact always present.
2.3, The steps for checking the validity of a PE file are summarized as follows:
1. First verify that the first WORD of the file equals IMAGE_DOS_SIGNATURE, which shows that the DOS MZ header is valid.
2. Once the DOS header is known to be valid, locate the PE header using e_lfanew.
3. Compare the first DWORD of the PE header with IMAGE_NT_SIGNATURE. If both the DOS and the PE signature match, we consider the file a valid PE file.
The validity check is illustrated with an example written in VC 6.0:
We first call the Open File common dialog box (GetOpenFileName), open the selected file and map it into memory (CreateFile, CreateFileMapping, MapViewOfFile, etc.), obtain the size of the target file, and allocate a buffer for it (m_buffer = new unsigned char[m_size];). We then read the first 2 bytes of the target file (*(unsigned short *)m_buffer) and check whether they are "MZ". If they are, we read the location of the PE header from offset 0x3c (*(unsigned int *)(m_buffer + 0x3c)) and compare the DWORD at that location with 0x00004550 ("PE\0\0"). This verifies that the file is a valid PE file.
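The check described above can be sketched in portable C. This is a minimal sketch, not the original VC 6.0 program: the function name is_valid_pe and the idea of passing the file as an in-memory buffer are assumptions made for illustration, and the bounds checks go slightly beyond the three steps listed above.

```c
#include <stddef.h>
#include <stdint.h>

#define DOS_SIGNATURE   0x5A4Du      /* "MZ", same value as IMAGE_DOS_SIGNATURE */
#define NT_SIGNATURE    0x00004550u  /* "PE\0\0", same value as IMAGE_NT_SIGNATURE */
#define E_LFANEW_OFFSET 0x3Cu        /* offset of e_lfanew in IMAGE_DOS_HEADER */

/* Read little-endian values byte by byte so the code does not depend on
   the host's endianness or alignment rules. */
static uint16_t read_u16(const unsigned char *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns 1 if buf looks like a PE file, 0 otherwise. */
int is_valid_pe(const unsigned char *buf, size_t size) {
    uint32_t pe_offset;

    if (size < 0x40) return 0;                     /* too small for the 64-byte DOS header */
    if (read_u16(buf) != DOS_SIGNATURE) return 0;  /* step 1: "MZ" */

    pe_offset = read_u32(buf + E_LFANEW_OFFSET);   /* step 2: follow e_lfanew */
    if (pe_offset > size - 4) return 0;            /* the signature must lie inside the buffer */

    return read_u32(buf + pe_offset) == NT_SIGNATURE;  /* step 3: "PE\0\0" */
}
```

Reading the fields byte by byte avoids the pointer casts used in the prose description, which only work on little-endian hosts that tolerate unaligned access.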
Part 3: File header
The file header (IMAGE_FILE_HEADER) is contained in the PE header (IMAGE_NT_HEADERS). Its structure definition:
typedef struct _IMAGE_FILE_HEADER {
    WORD  Machine;
    WORD  NumberOfSections;
    DWORD TimeDateStamp;
    DWORD PointerToSymbolTable;
    DWORD NumberOfSymbols;
    WORD  SizeOfOptionalHeader;
    WORD  Characteristics;
} IMAGE_FILE_HEADER, *PIMAGE_FILE_HEADER;
IMAGE_FILE_HEADER structure members:
1. Machine: The CPU required to run this file. For Intel platforms this value is IMAGE_FILE_MACHINE_I386 (14Ch). We tried the values 14Dh and 14Eh stated in LUEVELSMEYER's pe.txt, but Windows did not execute the file correctly. Some CPU identification codes:
Intel i386         0x14c
Intel i860         0x14d
MIPS R3000         0x162
MIPS R4000         0x166
DEC Alpha AXP      0x184
PowerPC            0x1f0  (little endian)
Motorola 68000     0x268
PA RISC            0x290  (Precision Architecture)
#define IMAGE_FILE_MACHINE_UNKNOWN    0
#define IMAGE_FILE_MACHINE_I386       0x014c  // Intel 386.
#define IMAGE_FILE_MACHINE_R3000      0x0162  // MIPS little-endian, 0x160 big-endian
#define IMAGE_FILE_MACHINE_R4000      0x0166  // MIPS little-endian
#define IMAGE_FILE_MACHINE_R10000     0x0168  // MIPS little-endian
#define IMAGE_FILE_MACHINE_WCEMIPSV2  0x0169  // MIPS little-endian WCE v2
#define IMAGE_FILE_MACHINE_ALPHA      0x0184  // Alpha_AXP
#define IMAGE_FILE_MACHINE_POWERPC    0x01F0  // IBM PowerPC little-endian
#define IMAGE_FILE_MACHINE_SH3        0x01a2  // SH3 little-endian
#define IMAGE_FILE_MACHINE_SH3E       0x01a4  // SH3E little-endian
#define IMAGE_FILE_MACHINE_SH4        0x01a6  // SH4 little-endian
#define IMAGE_FILE_MACHINE_ARM        0x01c0  // ARM little-endian
#define IMAGE_FILE_MACHINE_THUMB      0x01c2
#define IMAGE_FILE_MACHINE_IA64       0x0200  // Intel 64
#define IMAGE_FILE_MACHINE_MIPS16     0x0266  // MIPS
#define IMAGE_FILE_MACHINE_MIPSFPU    0x0366  // MIPS
#define IMAGE_FILE_MACHINE_MIPSFPU16  0x0466  // MIPS
#define IMAGE_FILE_MACHINE_ALPHA64    0x0284  // ALPHA64
#define IMAGE_FILE_MACHINE_AXP64      IMAGE_FILE_MACHINE_ALPHA64
2. NumberOfSections: The number of sections in the file. If we add or delete a section in the file, this value must be updated.
3. TimeDateStamp: The date and time the file was created. The format is the number of seconds elapsed since 4:00 PM on December 31, 1969 (Pacific time, which is 00:00 UTC on January 1, 1970). By my calculation, the maximum value 0xFFFFFFFF corresponds to about 136.19 years.
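The arithmetic behind the 136-year figure, and a way to turn a stamp into a readable date, can be sketched as follows. The function names are hypothetical, and the sketch assumes the C library's time_t counts seconds from 00:00 UTC on January 1, 1970 and is wide enough to hold the stamp, as is the case on common Windows and POSIX toolchains.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Largest possible TimeDateStamp, expressed in 365-day years. */
double max_stamp_years(void) {
    return (double)UINT32_MAX / (365.0 * 24.0 * 60.0 * 60.0);
}

/* Format a TimeDateStamp as "YYYY-MM-DD HH:MM:SS" in UTC.
   Returns the number of characters written, or 0 on failure.
   Assumption: time_t uses the 1970 epoch and can represent the stamp. */
size_t format_stamp(uint32_t stamp, char *out, size_t out_size) {
    time_t t = (time_t)stamp;
    struct tm *utc = gmtime(&t);
    if (utc == NULL) return 0;
    return strftime(out, out_size, "%Y-%m-%d %H:%M:%S", utc);
}
```

The division is 4294967295 / 31536000, which gives roughly 136.19 years.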
4. PointerToSymbolTable: The file offset of the COFF symbol table. This field is only useful for COFF debugging information.
5.NumberOfSymbols: The number of symbols in the COFF symbol table.
6. SizeOfOptionalHeader: The size of the optional header (IMAGE_OPTIONAL_HEADER) that follows this structure. It must be a valid value.
7. Characteristics: Flags describing the file. Some important flags are:
0x0001  The file contains no relocation information.
0x0002  The file is an executable image (that is, not an OBJ or LIB).
0x2000  The file is a DLL, not an EXE.
In general, if you want to traverse the section table you have to use NumberOfSections; the other fields are of little use.
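A short sketch of how the three Characteristics flags listed above might be tested. The macro and function names here are made up for illustration (the winnt.h equivalents carry an IMAGE_ prefix), and the only flags handled are the ones quoted in the text.

```c
#include <stdint.h>

/* Flag values quoted in the text above. */
#define FILE_RELOCS_STRIPPED   0x0001u  /* no relocation information */
#define FILE_EXECUTABLE_IMAGE  0x0002u  /* executable image, not an OBJ or LIB */
#define FILE_DLL               0x2000u  /* DLL rather than EXE */

/* Nonzero if the file is an executable image (EXE or DLL). */
int is_executable_image(uint16_t characteristics) {
    return (characteristics & FILE_EXECUTABLE_IMAGE) != 0;
}

/* Nonzero if the file is an executable image marked as a DLL. */
int is_dll(uint16_t characteristics) {
    return is_executable_image(characteristics) &&
           (characteristics & FILE_DLL) != 0;
}

/* Nonzero if relocation information is still present in the file. */
int has_relocations(uint16_t characteristics) {
    return (characteristics & FILE_RELOCS_STRIPPED) == 0;
}
```

Each flag is a single bit, so a file can carry several at once; the helpers test bits with a mask instead of comparing the whole word.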
Part 4: Optional header
4.1, RVA and related concepts:
RVA stands for relative virtual address. An RVA is the distance from a reference point in the virtual address space. It is similar to a file offset, except that it is relative to an address in the virtual address space rather than to the beginning of the file. For example, if a PE file is loaded at virtual address (VA) 400000h and execution starts at VA 401000h, we say that execution starts at RVA 1000h. Every RVA is relative to the starting VA of the module: virtual address (VA) 0x401464 - base address (BA) 0x400000 = RVA 0x1464. The base address is the starting position of the EXE or DLL once it is mapped into memory.
Why does the PE file format use RVAs? To reduce the burden on the PE loader. Because a module may be loaded anywhere in the virtual address space, it would be a nightmare if the PE loader had to fix up every relocatable item. If instead all relocatable items use RVAs, the PE loader need not worry about them: it simply relocates the whole module to a new starting VA. This resembles relative and absolute paths: an RVA is like a relative path, and a VA is like an absolute path.
Most addresses in a PE file are RVAs, and RVAs are only meaningful once the PE file has been loaded by the PE loader. If the file is mapped directly into memory rather than loaded by the PE loader, those RVAs cannot be used directly; they must first be converted to file offsets.
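The conversion from an RVA to a file offset can be sketched as below. This is only an illustration under stated assumptions: SectionInfo is a simplified, hypothetical stand-in for a section table entry (the notes only say that each entry holds a file offset, a virtual offset, and similar attributes), and the function names and the INVALID_OFFSET sentinel are likewise made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical view of one section table entry. */
typedef struct {
    uint32_t virtual_address;  /* RVA of the section in memory */
    uint32_t virtual_size;     /* size of the section in memory */
    uint32_t raw_offset;       /* file offset of the section data */
    uint32_t raw_size;         /* size of the section data in the file */
} SectionInfo;

#define INVALID_OFFSET 0xFFFFFFFFu

/* RVA = VA - base address of the module. */
uint32_t va_to_rva(uint32_t va, uint32_t image_base) {
    return va - image_base;
}

/* Find the section that contains rva and translate it to a file offset.
   Returns INVALID_OFFSET if no section contains the RVA, or if the RVA
   falls in a part of the section that has no data in the file. */
uint32_t rva_to_offset(uint32_t rva, const SectionInfo *sections, size_t count) {
    size_t i;
    for (i = 0; i < count; i++) {
        const SectionInfo *s = &sections[i];
        if (rva >= s->virtual_address &&
            rva - s->virtual_address < s->virtual_size) {
            uint32_t delta = rva - s->virtual_address;
            if (delta >= s->raw_size) return INVALID_OFFSET;
            return s->raw_offset + delta;
        }
    }
    return INVALID_OFFSET;
}
```

For instance, va_to_rva(0x401464, 0x400000) yields 0x1464, and with a section at RVA 0x1000 stored at file offset 0x400, rva_to_offset maps 0x1464 to 0x864.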
4.2, The optional header is the last member of IMAGE_NT_HEADERS. It contains information about the logical layout of the PE file. The structure has 31 fields in total; some are critical and others are rarely used. Its structure definition:
typedef struct _IMAGE_OPTIONAL_HEADER {
    WORD  Magic;
    BYTE  MajorLinkerVersion;
    BYTE  MinorLinkerVersion;
    DWORD SizeOfCode;
    DWORD SizeOfInitializedData;
    DWORD SizeOfUninitializedData;
    DWORD AddressOfEntryPoint;
    DWORD BaseOfCode;
    DWORD BaseOfData;
    DWORD ImageBase;
    DWORD SectionAlignment;
    DWORD FileAlignment;
    WORD  MajorOperatingSystemVersion;
    WORD  MinorOperatingSystemVersion;
    WORD  MajorImageVersion;
    WORD  MinorImageVersion;
    WORD  MajorSubsystemVersion;
    WORD  MinorSubsystemVersion;
    DWORD Win32VersionValue;
    DWORD SizeOfImage;
    DWORD SizeOfHeaders;
    DWORD CheckSum;
    WORD  Subsystem;
    WORD  DllCharacteristics;
    DWORD SizeOfStackReserve;
    DWORD SizeOfStackCommit;
    DWORD SizeOfHeapReserve;
    DWORD SizeOfHeapCommit;
    DWORD LoaderFlags;
    DWORD NumberOfRvaAndSizes;
    IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER, *PIMAGE_OPTIONAL_HEADER;

IMAGE_OPTIONAL_HEADER members:
1. Magic: Identifies the type of the image:
0x0107 (IMAGE_ROM_OPTIONAL_HDR_MAGIC): a ROM image.
0x010B (IMAGE_NT_OPTIONAL_HDR_MAGIC): a normal EXE image. Most PE files contain this value.
2. MajorLinkerVersion, MinorLinkerVersion: The version of the linker that produced this PE file, expressed in decimal rather than hex. For example, version 2.23.
3. SizeOfCode: The total size of all code sections. Most programs have only one code section, so this field usually equals the size of the .text section.
4. SizeOfInitializedData: The total size of all sections (excluding code sections) that contain initialized data. In practice, though, the value does not always seem to account for the initialized data sections.
5. SizeOfUninitializedData: The total size of all sections for which the PE loader must allocate address space in memory but which occupy no space on disk. These sections need no particular content when the program starts, hence the name uninitialized data. Uninitialized data is usually placed in the .bss section.
6. AddressOfEntryPoint: The location where execution of the PE file starts. This is an RVA, which usually falls in the .text section. The field applies to both EXEs and DLLs.
7. BaseOfCode: An RVA indicating where the code section starts. The code section usually comes after the PE headers and before the data section. This value is usually 0x1000. Borland's TLINK32 usually sets it to 0x10000, because by default TLINK32 aligns to 64K granularity whereas the Microsoft linker uses 4K.
8. BaseOfData: An RVA indicating where the data section starts. The data section is typically located after the PE headers and the code section.
9. ImageBase: The preferred load address of the PE file. For example, if the value is 400000h, the PE loader tries to load the file at address 400000h of the virtual address space. The word "preferred" means that if the address range is already occupied by another module, the PE loader picks another free address.
10. SectionAlignment: The alignment granularity of sections in memory. For example, if the value is 4096 (1000h), the starting address of every section must be a multiple of 4096. If the first section starts at 401000h and is 10 bytes long, the next section must start at 402000h, even though most of the space between 401000h and 402000h is left unused.
11. FileAlignment: The alignment granularity of sections in the file. For example, if the value is 512 (200h), the starting offset of every section must be a multiple of 512. If the first section starts at file offset 200h and is 10 bytes long, the next section must be placed at offset 400h, even though most of the space between offsets 512 and 1024 is unused and undefined. The default value is 200h.
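The rounding rule shared by SectionAlignment and FileAlignment can be sketched as a single helper. The name align_up is hypothetical, and the bit-mask trick assumes the alignment is a nonzero power of two, which holds for the 200h and 1000h values mentioned above.

```c
#include <stdint.h>

/* Round value up to the next multiple of alignment.
   Assumption: alignment is a nonzero power of two (for example 0x200 or
   0x1000); the mask below gives wrong results for other values. */
uint32_t align_up(uint32_t value, uint32_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}
```

With the examples from the text: a section at 401000h that is 10 bytes long ends at 40100Ah, and align_up(0x40100A, 0x1000) gives 402000h for the next section; in the file, align_up(0x20A, 0x200) gives 400h.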
12. MajorOperatingSystemVersion / MinorOperatingSystemVersion: The minimum operating system version required by this executable. For Win32 programs these two fields are usually set to 1.0.
13. MajorSubsystemVersion / MinorSubsystemVersion: The Win32 subsystem version. If the PE file is designed for Win32, the subsystem version must be 4.0, otherwise dialog boxes will not have the 3-D look.
14. MajorImageVersion / MinorImageVersion: User-defined fields that let you give different versions to an EXE or DLL. You can set them with the linker's /VERSION option. For example: LINK /VERSION:2.0 myobj.obj.
15. Reserved1 (Win32VersionValue in the structure above): Appears to always be 0.
16. SizeOfImage: The size of the entire PE image in memory. It is the size of the headers plus all sections after section alignment, that is, from the image base to the end of the last section. The end of the last section must be a multiple of SectionAlignment.
17. SizeOfHeaders: The size of all headers plus the section table. It equals the file size minus the total size of all sections in the file. You can use this value as the file offset of the first section of the PE file.
18. CheckSum: A CRC checksum of the file. This field is usually ignored and set to 0. However, all driver DLLs, all DLLs loaded at boot time, and server DLLs must have a valid checksum. The checksum algorithm is available in IMAGEHLP.DLL, whose source code can be found in the Win32 SDK.
19. Subsystem: Identifies which subsystem the PE file uses. For most Win32 programs there are only two values: Windows GUI and Windows CUI. WINNT.H defines the following:

#define IMAGE_SUBSYSTEM_UNKNOWN         0  // Unknown subsystem
#define IMAGE_SUBSYSTEM_NATIVE          1  // Requires no subsystem (e.g. a driver)
#define IMAGE_SUBSYSTEM_WINDOWS_GUI     2  // Runs in the Windows GUI subsystem
#define IMAGE_SUBSYSTEM_WINDOWS_CUI     3  // Runs in the Windows character-mode subsystem (console application)
#define IMAGE_SUBSYSTEM_OS2_CUI         5  // Runs in the OS/2 character-mode subsystem (OS/2 1.x application)
#define IMAGE_SUBSYSTEM_POSIX_CUI       7  // Runs in the POSIX character-mode subsystem
#define IMAGE_SUBSYSTEM_NATIVE_WINDOWS  8  // Win9x driver
#define IMAGE_SUBSYSTEM_WINDOWS_CE_GUI  9  // Runs in the Windows CE subsystem
20. DllCharacteristics: A set of flags indicating the circumstances in which the DLL initialization function (such as DllMain) is called. This value is always 0, yet the operating system still calls the DLL initialization function in all four cases. The flag values are as follows:
0x0001: when the DLL is loaded into the address space of a process
0x0002: when a thread ends
0x0004: when a thread starts
0x0008: when the DLL exits
0x2000: a WDM driver
21. SizeOfStackReserve: The amount of memory reserved for the initial thread's stack. Not all of this memory is committed, however. The default is 0x100000 (1 MB). If you call CreateThread in your program and specify a stack size of 0, the new thread gets a stack of this same size.
22. SizeOfStackCommit: The amount of memory committed up front for the initial thread's stack. Microsoft's linker defaults to 0x1000 (one page); Borland's TLINK32 uses 0x2000 (two pages).
23. SizeOfHeapReserve: The amount of virtual memory reserved for the initial process heap. A handle to this heap can be obtained with GetProcessHeap. Not all of this memory is committed.
24. SizeOfHeapCommit: The amount of memory committed to the process heap at startup. The default is 0x1000 bytes.
25. LoaderFlags: Related to debugging. Possible effects: a. Trigger a breakpoint before the process starts? b. Invoke a debugger after the process is loaded?
26. NumberOfRvaAndSizes: The number of entries in the DataDirectory array. Current tools always set this value to 16.
27. DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES]: An array of IMAGE_DATA_DIRECTORY structures. Each structure gives the RVA of an important data structure. The first element of the array holds the address and size of the export table, the second element holds the address and size of the import table, and so on. The complete list, in order:

// Directory entries
#define IMAGE_DIRECTORY_ENTRY_EXPORT        0   // Export Directory
#define IMAGE_DIRECTORY_ENTRY_IMPORT        1   // Import Directory
#define IMAGE_DIRECTORY_ENTRY_RESOURCE      2   // Resource Directory
#define IMAGE_DIRECTORY_ENTRY_EXCEPTION     3   // Exception Directory
#define IMAGE_DIRECTORY_ENTRY_SECURITY      4   // Security Directory
#define IMAGE_DIRECTORY_ENTRY_BASERELOC     5   // Base Relocation Table
#define IMAGE_DIRECTORY_ENTRY_DEBUG         6   // Debug Directory
#define IMAGE_DIRECTORY_ENTRY_COPYRIGHT     7   // Description String
#define IMAGE_DIRECTORY_ENTRY_GLOBALPTR     8   // Machine Value (MIPS GP)
#define IMAGE_DIRECTORY_ENTRY_TLS           9   // TLS Directory
#define IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG   10  // Load Configuration Directory
#define IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT  11  // Bound Import Directory in headers
#define IMAGE_DIRECTORY_ENTRY_IAT           12  // Import Address Table
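Indexing the array with these constants can be sketched as follows. The DataDirectory type mirrors the two DWORD members of IMAGE_DATA_DIRECTORY in winnt.h (VirtualAddress and Size); the shortened macro names and the helper get_directory are hypothetical, and treating an entry whose address or size is zero as absent is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define DIRECTORY_ENTRY_EXPORT      0
#define DIRECTORY_ENTRY_IMPORT      1
#define NUMBEROF_DIRECTORY_ENTRIES  16

/* Same layout as IMAGE_DATA_DIRECTORY: an RVA followed by a size. */
typedef struct {
    uint32_t VirtualAddress;
    uint32_t Size;
} DataDirectory;

/* Hypothetical helper: return a pointer to entry `index`, or NULL if the
   index is beyond NumberOfRvaAndSizes or the entry is empty. */
const DataDirectory *get_directory(const DataDirectory *dirs,
                                   uint32_t number_of_rva_and_sizes,
                                   uint32_t index) {
    if (index >= number_of_rva_and_sizes) return NULL;
    if (index >= NUMBEROF_DIRECTORY_ENTRIES) return NULL;
    if (dirs[index].VirtualAddress == 0 || dirs[index].Size == 0) return NULL;
    return &dirs[index];
}
```

Checking the index against NumberOfRvaAndSizes before touching the array matters because the header itself states how many entries are actually present.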
Offset (PE32/PE32+)  Size  Field                    Description
96/112               8     Export Table             Export Table address and size.
104/120              8     Import Table             Import Table address and size.
112/128              8     Resource Table           Resource Table address and size.
120/136              8     Exception Table          Exception Table address and size.
128/144              8     Certificate Table        Attribute Certificate Table address and size.
136/152              8     Base Relocation Table    Base Relocation Table address and size.
144/160              8     Debug                    Debug data starting address and size.
152/168              8     Architecture             Architecture-specific data address and size.
160/176              8     Global Ptr               Relative virtual address of the value to be stored in the global pointer register. The Size member of this structure must be set to 0.
168/184              8     TLS Table                Thread Local Storage (TLS) Table address and size.
176/192              8     Load Config Table        Load Configuration Table address and size.
184/200              8     Bound Import             Bound Import Table address and size.
192/208              8     IAT                      Import Address Table address and size.
200/216              8     Delay Import Descriptor  Address and size of the Delay Import Descriptor.
208/224              8     COM Runtime Header       COM Runtime Header address and size.
216/232              8     Reserved

rivershan, original written 2003.1.18