PE file structure learning notes

xiaoxiao2021-03-06  41

PE file structure

Author: Jiang Jiang
E-mail: jznsmail@163.net
Blog: http://blog.9cbs.net/jznsmail/
QQ: 457283

PE layout

PE file header (PE Header)

The PE header contains important information such as the size and location of the code and data areas, the operating system the file targets, and the initial stack size. The PE header is not at the beginning of the file. The first bytes of the file are the DOS stub: a minimal DOS program that prints a message such as "This program cannot be run in DOS mode". When the Win32 loader maps a PE file into memory, the first byte of the mapped file corresponds to the first byte of the DOS stub. The real PE header is found through a field in the DOS header at the start of the stub:

pNTHeader = dosHeader + dosHeader->e_lfanew;

e_lfanew is a relative offset, counted in bytes, that points to the real PE header. dosHeader is the base address of the image. Note: memory addresses grow upward, so the offset is added, not subtracted.

Taken as a whole, the PE header is an IMAGE_NT_HEADERS structure, which consists of one DWORD and two substructures:

DWORD Signature;
IMAGE_FILE_HEADER FileHeader;
IMAGE_OPTIONAL_HEADER OptionalHeader;

If e_lfanew points to an NE signature instead of a PE signature, the file is a Win16 NE executable. An LE signature indicates a VxD file, and an LX signature indicates an OS/2 file.
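The lookup described above can be sketched in portable C. This is only a minimal sketch: it assumes the whole file has been read into a byte buffer, and that e_lfanew sits at offset 0x3C of the DOS header (the layout of IMAGE_DOS_HEADER in Winnt.h). The names sig_kind, read_u32le and find_new_header are made up for this illustration and are not part of any Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Kinds of header that e_lfanew can point at (hypothetical enum). */
typedef enum { SIG_INVALID, SIG_PE, SIG_NE, SIG_LE, SIG_LX } sig_kind;

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t read_u32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Locate the new-style header inside an in-memory copy of the file.
 * Returns the kind of signature found; *hdr_off is written only when
 * the result is not SIG_INVALID.
 */
sig_kind find_new_header(const uint8_t *image, size_t size, size_t *hdr_off)
{
    assert(image != NULL && hdr_off != NULL);

    /* The DOS header is 64 bytes long and starts with "MZ". */
    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return SIG_INVALID;

    /* e_lfanew is the last field of the DOS header, at offset 0x3C. */
    uint32_t e_lfanew = read_u32le(image + 0x3C);
    if (e_lfanew > size || size - e_lfanew < 4)
        return SIG_INVALID;

    /* base address + e_lfanew: the offset is added, not subtracted. */
    const uint8_t *sig = image + e_lfanew;

    sig_kind kind = SIG_INVALID;
    if (memcmp(sig, "PE\0\0", 4) == 0)  kind = SIG_PE;
    else if (memcmp(sig, "NE", 2) == 0) kind = SIG_NE;
    else if (memcmp(sig, "LE", 2) == 0) kind = SIG_LE;
    else if (memcmp(sig, "LX", 2) == 0) kind = SIG_LX;

    if (kind != SIG_INVALID)
        *hdr_off = e_lfanew;
    return kind;
}
```

A caller would read the file into memory, call find_new_header, and continue parsing IMAGE_FILE_HEADER at hdr_off + 4 only when the result is SIG_PE.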

The IMAGE_FILE_HEADER structure is as follows:

WORD Machine; indicates the kind of CPU the file targets. The values can be found in Winnt.h (as defined in my header file, below):

#define IMAGE_FILE_MACHINE_UNKNOWN 0
#define IMAGE_FILE_MACHINE_I386 0x014c // Intel 386.
#define IMAGE_FILE_MACHINE_R3000 0x0162 // MIPS little-endian, 0x160 big-endian
#define IMAGE_FILE_MACHINE_R4000 0x0166 // MIPS little-endian
#define IMAGE_FILE_MACHINE_R10000 0x0168 // MIPS little-endian
#define IMAGE_FILE_MACHINE_WCEMIPSV2 0x0169 // MIPS little-endian WCE v2
#define IMAGE_FILE_MACHINE_ALPHA 0x0184 // Alpha_AXP
#define IMAGE_FILE_MACHINE_POWERPC 0x01F0 // IBM PowerPC Little-Endian
#define IMAGE_FILE_MACHINE_SH3 0x01a2 // SH3 little-endian
#define IMAGE_FILE_MACHINE_SH3E 0x01a4 // SH3E little-endian
#define IMAGE_FILE_MACHINE_SH4 0x01a6 // SH4 little-endian
#define IMAGE_FILE_MACHINE_ARM 0x01c0 // ARM Little-Endian
#define IMAGE_FILE_MACHINE_THUMB 0x01c2
#define IMAGE_FILE_MACHINE_IA64 0x0200 // Intel 64
#define IMAGE_FILE_MACHINE_MIPS16 0x0266 // MIPS
#define IMAGE_FILE_MACHINE_MIPSFPU 0x0366 // MIPS
#define IMAGE_FILE_MACHINE_MIPSFPU16 0x0466 // MIPS
#define IMAGE_FILE_MACHINE_ALPHA64 0x0284 // ALPHA64
#define IMAGE_FILE_MACHINE_AXP64 IMAGE_FILE_MACHINE_ALPHA64

WORD NumberOfSections; the number of sections in the EXE or OBJ file.

DWORD TimeDateStamp; the time at which the linker generated this file, as the number of seconds elapsed since 4:00 p.m. on December 31, 1969 (Pacific time, which is midnight UTC on January 1, 1970).

DWORD PointerToSymbolTable; the file offset of the COFF symbol table. Only useful when COFF debugging information is present.

DWORD NumberOfSymbols; the number of symbols in the COFF symbol table.

WORD SizeOfOptionalHeader; the size of the optional header. In an EXE file this is the size of IMAGE_OPTIONAL_HEADER. In an OBJ file it is 0 most of the time.

WORD Characteristics; describes the nature of the file. The more important flags are:

0x0001 the file has no relocation information
0x0002 the file is an executable image
0x2000 the file is a dynamic link library

All the flags defined by the system are listed below:

#define IMAGE_FILE_RELOCS_STRIPPED 0x0001 // Relocation info stripped from file.
#define IMAGE_FILE_EXECUTABLE_IMAGE 0x0002 // File is executable (i.e. no unresolved externel references).
#define IMAGE_FILE_LINE_NUMS_STRIPPED 0x0004 // Line nunbers stripped from file.
#define IMAGE_FILE_LOCAL_SYMS_STRIPPED 0x0008 // Local symbols stripped from file.
#define IMAGE_FILE_AGGRESIVE_WS_TRIM 0x0010 // Agressively trim working set
#define IMAGE_FILE_LARGE_ADDRESS_AWARE 0x0020 // App can handle >2gb addresses
#define IMAGE_FILE_BYTES_REVERSED_LO 0x0080 // Bytes of machine word are reversed.
#define IMAGE_FILE_32BIT_MACHINE 0x0100 // 32 bit word machine.
#define IMAGE_FILE_DEBUG_STRIPPED 0x0200 // Debugging info stripped from file in .DBG file
#define IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP 0x0400 // If Image is on removable media, copy and run from the swap file.
#define IMAGE_FILE_NET_RUN_FROM_SWAP 0x0800 // If Image is on Net, copy and run from the swap file.
#define IMAGE_FILE_SYSTEM 0x1000 // System File.
#define IMAGE_FILE_DLL 0x2000 // File is a DLL.
#define IMAGE_FILE_UP_SYSTEM_ONLY 0x4000 // File should only be run on a UP machine
#define IMAGE_FILE_BYTES_REVERSED_HI 0x8000 // Bytes of machine word are reversed.

The IMAGE_OPTIONAL_HEADER structure holds additional information beyond what is in IMAGE_FILE_HEADER.

WORD Magic; defines the state of the image. 0x0107 indicates a ROM image, 0x010b a normal EXE image.

BYTE MajorLinkerVersion;
BYTE MinorLinkerVersion; the version of the linker that produced the PE file, expressed in decimal.

The remaining fields can be found in Winnt.h.

Section table

The section table contains information about each section of the image. Sections are ordered by starting address rather than alphabetically. Each entry of the section table stores the address at which the file's raw data is mapped into memory. A section is a range of memory: any code or data required by the program or the operating system is stored in a corresponding section. The PE header is followed by an array of IMAGE_SECTION_HEADER structures, and the number of elements in that array is recorded in IMAGE_NT_HEADERS.FileHeader.NumberOfSections. An IMAGE_SECTION_HEADER holds the complete information about one section of an EXE or OBJ file.

BYTE Name[IMAGE_SIZEOF_SHORT_NAME]; an 8-byte ANSI name (not necessarily NUL-terminated) giving the section name, such as .text.

union { DWORD PhysicalAddress; DWORD VirtualSize; } Misc; in an EXE file, the virtual memory size of the code or data section, before alignment. In an OBJ file, the physical address of the section: the first section starts at 0, and the address of the next section is obtained by adding the SizeOfRawData value (the section size after alignment) to the address of the current one.

DWORD VirtualAddress; in an EXE file, the virtual address at which the loader maps the section. The true start address of the section is the image base address plus this value. For the first section this value is often set to 0x1000. It has no meaning in an OBJ file, where it is always 0.

DWORD SizeOfRawData; in an EXE file, the section size after alignment. In an OBJ file, the real section size as emitted by the compiler.

DWORD PointerToRawData; the offset, counted from the beginning of the file, at which the raw data of the section can be found.

DWORD PointerToRelocations; has no meaning in an EXE file, where it is always 0. In an OBJ file it is the offset, counted from the beginning of the file, of the section's relocation information. The relocation information of each OBJ section immediately follows that section's raw data.

DWORD PointerToLinenumbers; the file offset of the line number table. In an EXE file the line number information is placed at the end of the file. In an OBJ file the line number table of a section is placed after the section's raw data and its relocation table.

WORD NumberOfRelocations; the number of entries in the relocation table (pointed to by PointerToRelocations). Only used in OBJ files.

WORD NumberOfLinenumbers; the number of entries in the line number table (pointed to by PointerToLinenumbers).

DWORD Characteristics; a set of flags describing the attributes of the section (code or data, readable, writable, and so on). See the IMAGE_SCN_XXX_XXX definitions in Winnt.h.

Sections

.text section: contains all general program code. Besides the code generated by the compiler, .text also holds code from the runtime library.
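The section header fields described above are enough to turn an address relative to the image base (an RVA) back into a file offset. The sketch below is an illustration under stated assumptions, not code from Winnt.h: section_header is a portable stand-in with the same field order as IMAGE_SECTION_HEADER, the function names are invented, and the lookup simply ignores any part of a section that has no raw data in the file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHORT_NAME_LEN 8   /* same value as IMAGE_SIZEOF_SHORT_NAME */

/* Portable stand-in for IMAGE_SECTION_HEADER (40 bytes, same field order). */
typedef struct {
    uint8_t  Name[SHORT_NAME_LEN];
    uint32_t VirtualSize;          /* Misc.VirtualSize in Winnt.h */
    uint32_t VirtualAddress;
    uint32_t SizeOfRawData;
    uint32_t PointerToRawData;
    uint32_t PointerToRelocations;
    uint32_t PointerToLinenumbers;
    uint16_t NumberOfRelocations;
    uint16_t NumberOfLinenumbers;
    uint32_t Characteristics;
} section_header;

/* Copy the name into a 9-byte buffer, adding the NUL the file may omit. */
void section_name(const section_header *s, char out[SHORT_NAME_LEN + 1])
{
    memcpy(out, s->Name, SHORT_NAME_LEN);
    out[SHORT_NAME_LEN] = '\0';
}

/*
 * Translate an RVA into a file offset.
 * Returns 0 and stores the offset in *file_off on success,
 * or -1 if no section's raw data covers the RVA.
 */
int rva_to_file_offset(const section_header *secs, size_t count,
                       uint32_t rva, uint32_t *file_off)
{
    assert(secs != NULL && file_off != NULL);

    for (size_t i = 0; i < count; i++) {
        uint32_t start = secs[i].VirtualAddress;
        /* Only bytes that are both mapped and present in the file count. */
        uint32_t span = secs[i].VirtualSize < secs[i].SizeOfRawData
                      ? secs[i].VirtualSize : secs[i].SizeOfRawData;

        if (rva >= start && rva - start < span) {
            *file_off = secs[i].PointerToRawData + (rva - start);
            return 0;
        }
    }
    return -1;
}
```

For a section mapped at RVA 0x1000 whose raw data starts at file offset 0x400, the RVA 0x1234 translates to 0x400 + 0x234 = 0x634.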
In a PE file, when you call a function in another module (such as GetMessage in user32.dll), the CALL instruction generated by the compiler does not transfer control directly to the function in the DLL. Instead it transfers control to a JMP DWORD PTR [xxxxxxxx] instruction that is also located in .text. That JMP instruction jumps through a DWORD in .idata, and this DWORD contains the real entry address of the function.

Why are DLL calls routed this way? Because all calls to a given DLL function are funneled through one place, the loader does not need to patch every instruction that calls the DLL. It only has to store the real address of the DLL function in the DWORD in .idata.

This calling scheme has one drawback: you cannot initialize a variable with the real address of a DLL function. For example:

FARPROC pfnGetMessage = GetMessage;

This variable actually holds the address of the JMP DWORD PTR [xxxxxxxx] instruction, not the address of the function that is ultimately called.
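To make the JMP DWORD PTR [xxxxxxxx] stub concrete, here is a small sketch that recognizes such a stub in a byte buffer and extracts the address of the .idata DWORD it jumps through. It assumes 32-bit x86 code, where this instruction is encoded as the bytes FF 25 followed by a 32-bit little-endian absolute address. The function name decode_jmp_thunk is invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * If `code` starts with JMP DWORD PTR [addr] (bytes FF 25 followed by a
 * 32-bit little-endian address), store addr in *slot_va and return 1.
 * Otherwise return 0 and leave *slot_va untouched.
 * Assumption: 32-bit x86 code, where the operand is an absolute address.
 */
int decode_jmp_thunk(const uint8_t *code, size_t len, uint32_t *slot_va)
{
    assert(code != NULL && slot_va != NULL);

    /* The instruction is 6 bytes long: 2 opcode bytes + 4 address bytes. */
    if (len < 6 || code[0] != 0xFF || code[1] != 0x25)
        return 0;

    *slot_va = (uint32_t)code[2] | ((uint32_t)code[3] << 8) |
               ((uint32_t)code[4] << 16) | ((uint32_t)code[5] << 24);
    return 1;
}
```

Given a pointer such as pfnGetMessage, reading the DWORD stored at the decoded address would yield the real entry point that the loader wrote into .idata.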

For API functions declared with __declspec(dllimport), the compiler does not generate a JMP DWORD PTR [xxxxxxxx] instruction. It generates CALL DWORD PTR [xxxxxxxx] instead, calling directly through the DWORD at xxxxxxxx in .idata.

.data section: stores initialized data. This includes global variables, string constants and static variables that are given their initial values at compile time. The linker combines all the .data sections from the OBJ and LIB files into the .data section of the EXE. Local variables are placed on the stack at run time.

.bss section: stores uninitialized static and global variables. All the .bss sections from the OBJ and LIB files are combined into the .bss section of the EXE file. In the section table, the RawDataOffset of .bss is always 0, which shows that this section occupies no space in the file.

.CRT section: another initialized data section, used by the Microsoft C/C++ Runtime Library (CRT). It holds what is needed to call the constructors of static C++ classes before main or WinMain.

.rsrc section: stores the resources of the module, such as the contents of .RES files.

.idata section: stores information about the functions (and data) that the module imports from other DLLs.

.edata section: stores information about the functions the PE file exports. It is usually seen only in DLLs.

.reloc section: stores an array of base relocations. A base relocation is an adjustment to an instruction or to the value of an initialized variable. The adjustments must be applied if the loader cannot load the EXE or DLL at its preferred address.

.tls section: data defined with __declspec(thread) is not placed in .data or .bss. A copy of it goes into .tls. The full name of .tls is Thread Local Storage. Each thread can have its own set of static data, and the code that uses the data does not need to know which thread is executing. Suppose a program has several threads doing the same work. If a TLS variable is declared as:

__declspec(thread) int i = 0; // This is a global variable declaration

then each thread has its own copy of the variable i.

.rdata section: has at least 4 uses:

1. In an EXE produced by the Microsoft linker, .rdata contains the debug directory (OBJ files do not have one).
2. If you specify a DESCRIPTION in the program's .def file, the specified string appears in .rdata.
3. GUID values are placed in the .rdata of the EXE or DLL.
4. The TLS (Thread Local Storage) directory is placed here. It is supplied by the compiler's runtime library.

.drectve section: appears only in OBJ files. It contains the text of the linker command-line arguments.

Imports of a PE file

Before the PE file is loaded into memory, .idata holds the information the loader needs to determine the addresses of the imported functions and patch them in to complete the image. After loading, .idata contains pointers to the functions imported by the EXE or DLL. The .idata section starts with an array of IMAGE_IMPORT_DESCRIPTOR structures, and each DLL the PE file links to has a corresponding IMAGE_IMPORT_DESCRIPTOR.
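Walking that array can be sketched as follows. The array carries no element count: it ends with a descriptor whose fields are all zero, so counting the imported DLLs means scanning until that terminator. This is a minimal sketch under assumptions: import_descriptor is a portable stand-in with the five DWORD fields of IMAGE_IMPORT_DESCRIPTOR, the helper names are invented, and the caller is assumed to have already located the array in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable stand-in for IMAGE_IMPORT_DESCRIPTOR (five DWORDs, 20 bytes). */
typedef struct {
    uint32_t OriginalFirstThunk;  /* RVA of the name/ordinal lookup table */
    uint32_t TimeDateStamp;
    uint32_t ForwarderChain;
    uint32_t Name;                /* RVA of the DLL name string */
    uint32_t FirstThunk;          /* RVA of the table the loader fills in */
} import_descriptor;

/* An all-zero descriptor marks the end of the array. */
static int is_terminator(const import_descriptor *d)
{
    return d->OriginalFirstThunk == 0 && d->TimeDateStamp == 0 &&
           d->ForwarderChain == 0 && d->Name == 0 && d->FirstThunk == 0;
}

/*
 * Count the descriptors (one per imported DLL) that precede the
 * terminator. `max` bounds the scan so that a damaged file cannot make
 * the loop run past the buffer; if no terminator is found, `max` is
 * returned.
 */
size_t count_imported_dlls(const import_descriptor *descs, size_t max)
{
    assert(descs != NULL);

    size_t n = 0;
    while (n < max && !is_terminator(&descs[n]))
        n++;
    return n;
}
```

For each counted descriptor, the Name field is an RVA to the DLL's name, which would be resolved to a string before printing it.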

When reposting, please cite the original address: https://www.9cbs.com/read-56584.html
