5. Section Table
The section table is an array of structures that immediately follows the PE header. The number of elements in the array is given by the NumberOfSections field of the file header (IMAGE_FILE_HEADER). Each element is an IMAGE_SECTION_HEADER structure (40 bytes), defined as follows:
typedef struct _IMAGE_SECTION_HEADER {
    BYTE Name[IMAGE_SIZEOF_SHORT_NAME];
    union {
        DWORD PhysicalAddress;
        DWORD VirtualSize;
    } Misc;
    DWORD VirtualAddress;
    DWORD SizeOfRawData;
    DWORD PointerToRawData;
    DWORD PointerToRelocations;
    DWORD PointerToLinenumbers;
    WORD NumberOfRelocations;
    WORD NumberOfLinenumbers;
    DWORD Characteristics;
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;
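The 40-byte size can be checked with a small sketch that restates the structure using fixed-width types, so that it compiles without the Windows headers. SectionHeader, SHORT_NAME_LEN and section_name are names invented for this illustration; they are not part of any SDK. The helper also shows one safe way to print a section name, which may occupy all 8 bytes of the Name field without a terminating NULL.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SHORT_NAME_LEN 8  /* value of IMAGE_SIZEOF_SHORT_NAME */

/* Same field layout as IMAGE_SECTION_HEADER (hypothetical stand-in type). */
typedef struct {
    uint8_t  Name[SHORT_NAME_LEN];
    union {
        uint32_t PhysicalAddress;
        uint32_t VirtualSize;
    } Misc;
    uint32_t VirtualAddress;
    uint32_t SizeOfRawData;
    uint32_t PointerToRawData;
    uint32_t PointerToRelocations;
    uint32_t PointerToLinenumbers;
    uint16_t NumberOfRelocations;
    uint16_t NumberOfLinenumbers;
    uint32_t Characteristics;
} SectionHeader;

/* The structure must be exactly 40 bytes, as stated in the text. */
_Static_assert(sizeof(SectionHeader) == 40, "section header must be 40 bytes");

/* Copy the section name into a 9-byte buffer and terminate it there,
   because an 8-character name carries no NULL of its own. */
void section_name(const SectionHeader *s, char out[SHORT_NAME_LEN + 1])
{
    memcpy(out, s->Name, SHORT_NAME_LEN);
    out[SHORT_NAME_LEN] = '\0';
}
```

Passing Name directly to a string function would read past the field whenever the name uses all 8 bytes, which is why the copy goes through a larger buffer.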
Meaning of the IMAGE_SECTION_HEADER members:
1. Name: The section name, at most IMAGE_SIZEOF_SHORT_NAME (8) bytes long. The name is only a label; any name may be chosen, and it may even be left empty. Name is not an ASCIIZ string, so it does not have to end with NULL.
2. PhysicalAddress: Specifies the file address. It shares the Misc union with VirtualSize.
3. VirtualSize: The meaning of this field depends on the file type. For an EXE it is the total size of the section's code and data once loaded into memory, that is, the size before it is rounded up to the nearest file alignment granularity. SizeOfRawData below is the rounded size. This field is meaningless for OBJ files.
4. VirtualAddress: The RVA (relative virtual address) of the section. The PE loader reads this value when it maps the section into memory, so if the field is 1000h and the PE file is loaded at 400000h, the section is mapped at 401000h. Microsoft tools set this field to 1000h for the first section. For OBJ files this field is meaningless and always 0.
5. SizeOfRawData: The size of the section after rounding up to the file alignment. The PE loader reads this field to learn how many bytes of the section must be mapped into memory. Suppose the file alignment is 0x200: if the VirtualSize field above says the section is 0x388 bytes long, this field is 0x400, meaning the section occupies 0x400 bytes. In OBJ files this is the real section size produced by the compiler or assembler.
6. PointerToRawData: The file offset of the section's data. The PE loader uses this value to find the section's data in the file. If you map a PE file yourself (instead of letting the operating system loader do it), you must locate the section with this value, not with the RVA in VirtualAddress.
7. PointerToRelocations: In OBJ files, this is the file offset of the section's relocation information. The relocation information of each OBJ section follows that section's raw data. In EXE files this field (and the next one) is meaningless and always 0. When the linker generates an EXE it resolves most fixups itself; only base address relocations and the addresses of imported functions are left to be resolved at load time. These two kinds of information are kept in sections of their own (the base relocation section and the import section), so an EXE does not need relocation data after each section.
8. PointerToLinenumbers: The file offset of the line number table. The line number table relates source code line numbers to the addresses that the corresponding code is mapped to in memory. In EXE files the line number information is placed at the end of the file. If there are no COFF line numbers, the field is 0.
9. NumberOfRelocations: The number of entries in the relocation table pointed to by PointerToRelocations. This field is used only in OBJ files and is 0 in EXE files.
10. NumberOfLinenumbers: The number of entries in the line number table pointed to by PointerToLinenumbers.
11. Characteristics: Flags that describe the section's attributes, such as whether it contains executable code, initialized data or uninitialized data, and whether it is writable, readable, and so on. Some of the flags:
IMAGE_SCN_TYPE_REG: Reserved.
IMAGE_SCN_TYPE_DSECT: Reserved.
IMAGE_SCN_TYPE_NOLOAD: Reserved.
IMAGE_SCN_TYPE_GROUP: Reserved.
IMAGE_SCN_TYPE_NO_PAD: Reserved.
IMAGE_SCN_TYPE_COPY: Reserved.
IMAGE_SCN_CNT_CODE: Section contains executable code.
IMAGE_SCN_CNT_INITIALIZED_DATA: Section contains initialized data.
IMAGE_SCN_CNT_UNINITIALIZED_DATA: Section contains uninitialized data.
IMAGE_SCN_LNK_OTHER: Reserved.
IMAGE_SCN_LNK_INFO: Reserved.
IMAGE_SCN_TYPE_OVER: Reserved.
IMAGE_SCN_LNK_COMDAT: Section contains COMDAT data.
IMAGE_SCN_MEM_FARDATA: Reserved.
IMAGE_SCN_MEM_PURGEABLE: Reserved.
IMAGE_SCN_MEM_16BIT: Reserved.
IMAGE_SCN_MEM_LOCKED: Reserved.
IMAGE_SCN_MEM_PRELOAD: Reserved.
IMAGE_SCN_ALIGN_1BYTES: Align data on a 1-byte boundary.
IMAGE_SCN_ALIGN_2BYTES: Align data on a 2-byte boundary.
IMAGE_SCN_ALIGN_4BYTES: Align data on a 4-byte boundary.
IMAGE_SCN_ALIGN_8BYTES: Align data on an 8-byte boundary.
IMAGE_SCN_ALIGN_16BYTES: Align data on a 16-byte boundary.
IMAGE_SCN_ALIGN_32BYTES: Align data on a 32-byte boundary.
IMAGE_SCN_ALIGN_64BYTES: Align data on a 64-byte boundary.
IMAGE_SCN_LNK_NRELOC_OVFL: Section contains extended relocations.
IMAGE_SCN_MEM_DISCARDABLE: Section can be discarded as needed.
IMAGE_SCN_MEM_NOT_CACHED: Section cannot be cached.
IMAGE_SCN_MEM_NOT_PAGED: Section cannot be paged.
IMAGE_SCN_MEM_SHARED: Section can be shared in memory.
IMAGE_SCN_MEM_EXECUTE: Section can be executed as code.
IMAGE_SCN_MEM_READ: Section can be read.
IMAGE_SCN_MEM_WRITE: Section can be written to.
Steps to traverse the section table:
1. Check that the file is a valid PE file.
2. Locate the starting address of the PE header.
3. Get the number of sections from the NumberOfSections field of the file header.
4. Locate the section table. It starts at the address of the PE header plus the size of the PE header structure, because the section table immediately follows the PE header. If you are not using file mapping, you can use SetFilePointer to move the file pointer to the section table. Note that SizeOfHeaders (a member of IMAGE_OPTIONAL_HEADER) is the combined size of all headers including the section table, so it marks where the headers end rather than where the section table begins.
5. Process each IMAGE_SECTION_HEADER structure.
6. Import Table
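Most of the fields described in this part are RVAs. A tool that reads a PE file from disk, rather than an image already mapped by the loader, first has to turn each RVA into a file offset by walking the section table from part 5 and using PointerToRawData, as explained there. The following is a minimal sketch of that lookup, together with the alignment rounding behind SizeOfRawData. SectionInfo, align_up and rva_to_offset are hypothetical names, and the sketch assumes the section headers have already been read into an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the section-header fields needed for the lookup (hypothetical type). */
typedef struct {
    uint32_t VirtualSize;
    uint32_t VirtualAddress;
    uint32_t SizeOfRawData;
    uint32_t PointerToRawData;
} SectionInfo;

/* Round value up to a multiple of alignment, which must be a power of two. */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Translate an RVA into a file offset by finding the section whose raw
   data covers it. Returns 1 and stores the offset on success, 0 if no
   section covers the RVA. */
int rva_to_offset(const SectionInfo *sections, size_t count,
                  uint32_t rva, uint32_t *offset)
{
    for (size_t i = 0; i < count; i++) {
        const SectionInfo *s = &sections[i];
        if (rva >= s->VirtualAddress &&
            rva - s->VirtualAddress < s->SizeOfRawData) {
            *offset = s->PointerToRawData + (rva - s->VirtualAddress);
            return 1;
        }
    }
    return 0;
}
```

With a file alignment of 0x200, align_up(0x388, 0x200) gives 0x400, the SizeOfRawData value from the example in part 5.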
6.1, Import functions:
An import function is a function that a module calls but that does not reside in the calling module, hence the name "import". The import function actually lives in one or more DLLs. The calling module keeps only some information about it, namely the function name and the name of the DLL in which it resides.
Before the PE file is loaded into memory, the information stored in the file is what the loader will use to determine the function addresses and patch them into the image. After loading, .idata contains pointers to the functions that the EXE/DLL imports.
6.2, Data Directory:
Data Directory is an array of IMAGE_DATA_DIRECTORY structures with 16 members in total. It holds the location and size of the important data structures in the PE file; each member describes one such data structure.
Every member of Data Directory is of type IMAGE_DATA_DIRECTORY, which is defined as follows:
typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD VirtualAddress;
    DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;
IMAGE_DATA_DIRECTORY members:
1. VirtualAddress: The relative virtual address (RVA) of the data structure. For example, if the entry describes the import symbols, the field holds the RVA of the IMAGE_IMPORT_DESCRIPTOR array.
2. Size: The size in bytes of the data structure that VirtualAddress refers to.
6.3, General method for finding important data structures in a PE file:
1. Go from the DOS header to the PE header.
2. Read the address of the Data Directory from the optional header.
3. Multiply the size of IMAGE_DATA_DIRECTORY by the index of the member you want. For example, to find the location information of the import symbols, multiply the size of the IMAGE_DATA_DIRECTORY structure (8 bytes) by 1.
4. Add the result to the Data Directory address. This gives the IMAGE_DATA_DIRECTORY entry that describes the data structure you are looking for.
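Steps 3 and 4 amount to a single multiplication and addition. The sketch below assumes the Data Directory has already been located inside a byte buffer (directory_start is its offset in that buffer) and that the buffer holds little-endian data, as PE files do. DataDirectory, read_u32, directory_entry_offset and read_directory_entry are names made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define DATA_DIRECTORY_ENTRY_SIZE 8u  /* two DWORDs: VirtualAddress, Size */
#define DIRECTORY_ENTRY_EXPORT    0u  /* first member of the data directory */
#define DIRECTORY_ENTRY_IMPORT    1u  /* second member of the data directory */
#define DIRECTORY_ENTRY_COUNT     16u

/* Decoded copy of one IMAGE_DATA_DIRECTORY entry (hypothetical type). */
typedef struct {
    uint32_t VirtualAddress;
    uint32_t Size;
} DataDirectory;

/* Read a little-endian DWORD from a byte buffer. */
uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Steps 3 and 4: entry offset = directory start + index * entry size. */
uint32_t directory_entry_offset(uint32_t directory_start, uint32_t index)
{
    assert(index < DIRECTORY_ENTRY_COUNT);
    return directory_start + index * DATA_DIRECTORY_ENTRY_SIZE;
}

/* Decode the entry at the given index. */
DataDirectory read_directory_entry(const uint8_t *image,
                                   uint32_t directory_start, uint32_t index)
{
    const uint8_t *p = image + directory_entry_offset(directory_start, index);
    DataDirectory d = { read_u32(p), read_u32(p + 4) };
    return d;
}
```

For the import symbols the index is 1, so the entry lies 8 bytes past the start of the Data Directory.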
6.4, Import table:
The VirtualAddress of the second member of the Data Directory array holds the address of the import table. The import table is an array of IMAGE_IMPORT_DESCRIPTOR structures. Each structure holds information about one DLL from which the PE file imports functions. The array ends with a member whose fields are all 0.
IMAGE_IMPORT_DESCRIPTOR is defined as follows:
typedef struct _IMAGE_IMPORT_DESCRIPTOR {
    union {
        DWORD Characteristics;     // 0 for terminating null import descriptor
        DWORD OriginalFirstThunk;  // RVA to original unbound IAT (PIMAGE_THUNK_DATA)
    };
    DWORD TimeDateStamp;           // 0 if not bound,
                                   // -1 if bound, and real date/time stamp
                                   //    in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
                                   // O.W. date/time stamp of DLL bound to (old BIND)
    DWORD ForwarderChain;          // -1 if no forwarders
    DWORD Name;
    DWORD FirstThunk;              // RVA to IAT (if bound this IAT has actual addresses)
} IMAGE_IMPORT_DESCRIPTOR;
IMAGE_IMPORT_DESCRIPTOR members:
1. The first member is a union. The union merely gives OriginalFirstThunk a second name, so it can also be called Characteristics. The member holds the RVA of an array of IMAGE_THUNK_DATA structures.
2. TimeDateStamp: The time at which the file was generated. This field is usually 0. Microsoft's BIND program can write here the time stamp of the DLL that this IMAGE_IMPORT_DESCRIPTOR corresponds to.
3. ForwarderChain: This field concerns forwarding, in which a function in one DLL passes the call on to another DLL. For example, in Windows NT, kernel32.dll forwards some of its exported functions to ntdll.dll. The application may think it is calling kernel32.dll when what it actually calls is in ntdll.dll. The field holds an index into the FirstThunk array; the function at that index is a forwarded function.
4. Name: The RVA of the DLL name, which is an ASCIIZ string.
5. FirstThunk: Very similar to OriginalFirstThunk. It also holds the RVA of an array of IMAGE_THUNK_DATA structures (a different array, of course).
In an IMAGE_IMPORT_DESCRIPTOR, the most important parts are the name of the imported DLL and the two IMAGE_THUNK_DATA arrays. Each IMAGE_THUNK_DATA corresponds to one import function. In an EXE the two arrays (pointed to by Characteristics and FirstThunk respectively) run in parallel, and each ends with a NULL entry.
Why are two parallel arrays needed? The first array (pointed to by Characteristics) is never modified; it is sometimes called the hint-name table. The second array (pointed to by FirstThunk) is overwritten by the loader. The loader examines each IMAGE_THUNK_DATA, finds the address of the function it refers to, and writes that address into the IMAGE_THUNK_DATA DWORD. Because this array ends up holding the addresses of the import functions, it is called the Import Address Table (IAT). The IAT is a writable area, and API hooking takes advantage of this. After the PE loader has finished, the IMAGE_THUNK_DATA entries reached through FirstThunk have been overwritten, while those reached through Characteristics have not. So if the name of an import function is still needed, it can be found through the IMAGE_THUNK_DATA array pointed to by Characteristics.
6.4, IMAGE_THUNK_DATA:
IMAGE_THUNK_DATA is a union the size of a DWORD. Usually it is interpreted as a pointer to an IMAGE_IMPORT_BY_NAME structure. Note that IMAGE_THUNK_DATA contains a pointer to an IMAGE_IMPORT_BY_NAME structure, not the structure itself.
IMAGE_THUNK_DATA structure definition:
typedef struct _IMAGE_THUNK_DATA32 {
    union {
        PBYTE ForwarderString;
        PDWORD Function;
        DWORD Ordinal;
        PIMAGE_IMPORT_BY_NAME AddressOfData;
    } u1;
} IMAGE_THUNK_DATA32;
The meaning of an IMAGE_THUNK_DATA is settled only after the PE file has been loaded. The Win32 loader uses the initial content of the IMAGE_THUNK_DATA (either a function name or a function ordinal) to find the address of the import function, and then overwrites the IMAGE_THUNK_DATA with that address.
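The reason for keeping two parallel arrays can be seen in a small simulation of the overwrite step. This is only a model and not how the Windows loader is implemented: function names are stored as plain C strings instead of RVAs to IMAGE_IMPORT_BY_NAME, and ExportedSymbol, resolve_symbol and bind_imports are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one exported function of an already loaded DLL. */
typedef struct {
    const char *name;
    uint32_t    address;
} ExportedSymbol;

/* Look a name up among the DLL's exports; returns 0 if it is missing. */
uint32_t resolve_symbol(const ExportedSymbol *exports, size_t count,
                        const char *name)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(exports[i].name, name) == 0)
            return exports[i].address;
    return 0;
}

/* names[] plays the role of the array reached through OriginalFirstThunk
   and is only read. iat[] plays the role of the array reached through
   FirstThunk and is overwritten with addresses. Both end with a NULL/0
   entry. Returns the number of imports processed. */
size_t bind_imports(const char *const *names, uint32_t *iat,
                    const ExportedSymbol *exports, size_t count)
{
    size_t n = 0;
    for (; names[n] != NULL; n++)
        iat[n] = resolve_symbol(exports, count, names[n]);
    iat[n] = 0;  /* keep the terminating entry */
    return n;
}
```

After bind_imports returns, iat holds addresses only, while names is untouched and can still be used to list what was imported.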
6.5, IMAGE_IMPORT_BY_NAME:
IMAGE_IMPORT_BY_NAME structure definition:
typedef struct _IMAGE_IMPORT_BY_NAME {
    WORD Hint;
    BYTE Name[1];
} IMAGE_IMPORT_BY_NAME, *PIMAGE_IMPORT_BY_NAME;
1. Hint: The index of this function in the export table of the DLL in which it resides. The PE loader uses it to look the function up quickly in the DLL's export table. The value is optional; some linkers set it to 0.
2. Name: The name of the import function, an ASCIIZ string. Note that although Name is declared with a size of one byte, it is really a variable-size field; there is simply no better way to express a variable-size field in a structure. The structure is provided so that the fields can be referred to by descriptive names.
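Because Name is variable-sized, an entry is usually decoded from raw bytes rather than through the structure. The following sketch reads one entry from a buffer, assuming little-endian data and a NULL-terminated name; ImportByName and read_import_by_name are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decoded IMAGE_IMPORT_BY_NAME entry; name points into the caller's buffer. */
typedef struct {
    uint16_t    hint;
    const char *name;
} ImportByName;

/* Decode the entry that starts at offset inside buf (size bytes long).
   Returns 1 on success, 0 if the entry would run past the end of the
   buffer or the name is not terminated inside it. */
int read_import_by_name(const uint8_t *buf, size_t size, size_t offset,
                        ImportByName *out)
{
    if (offset > size || size - offset < 3)  /* Hint plus at least a NULL */
        return 0;
    const uint8_t *p = buf + offset;
    if (memchr(p + 2, '\0', size - offset - 2) == NULL)
        return 0;
    out->hint = (uint16_t)(p[0] | (p[1] << 8));
    out->name = (const char *)(p + 2);
    return 1;
}
```

The bounds check matters for damaged or hostile files, where the name may not be terminated before the data ends.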
Some functions are exported by ordinal only, which means they cannot be called by name but only by their position. In that case the caller module has no IMAGE_IMPORT_BY_NAME structure for the function. Instead, the low word of the function's IMAGE_THUNK_DATA value holds the function's ordinal, and the most significant bit (MSB) is set to 1. For example, if a function is exported by ordinal only and its ordinal is 1234h, the IMAGE_THUNK_DATA value for the function is 80001234h. Microsoft provides a handy constant for testing the MSB of the DWORD, IMAGE_ORDINAL_FLAG32, whose value is 80000000h.
6.6, Listing all import functions of a PE file:
1. Verify that the file is a valid PE file.
2. Go from the DOS header to the PE header.
3. Get the address of the data directory in the OptionalHeader.
4. Go to the second member of the data directory and extract its VirtualAddress value.
5. Use that value to locate the first IMAGE_IMPORT_DESCRIPTOR structure.
6. Check the OriginalFirstThunk value. If it is not 0, follow the RVA in OriginalFirstThunk to the RVA array. If OriginalFirstThunk is 0, use the FirstThunk value instead. Some linkers set OriginalFirstThunk to 0 when generating a PE file, which is probably a bug. To be safe, always check the OriginalFirstThunk value first.
7. For each array element, test the element value against IMAGE_ORDINAL_FLAG32. If the most significant bit of the element value is 1, the function is imported by ordinal, and the ordinal is extracted from the low word of the value.
8. If the most significant bit of the element value is 0, the value is the RVA of an IMAGE_IMPORT_BY_NAME structure; skip the Hint, and what follows is the function name.
9. Move to the next array element and extract its function name, and continue until the end of the array (it ends with NULL). All import functions of one DLL have now been traversed, and the next DLL can be processed.
10. Move to the next IMAGE_IMPORT_DESCRIPTOR and process it, repeating until the end of the array (the IMAGE_IMPORT_DESCRIPTOR array ends with an element whose fields are all 0).
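Steps 7 to 9 can be sketched as follows for one thunk array. ORDINAL_FLAG32 repeats the value of IMAGE_ORDINAL_FLAG32 so that the sketch builds without the Windows headers; thunk_is_ordinal, thunk_ordinal and count_imports are hypothetical helper names, and the array is assumed to be already in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ORDINAL_FLAG32 0x80000000u  /* value of IMAGE_ORDINAL_FLAG32 */

/* Step 7: the function is imported by ordinal when the MSB is set. */
int thunk_is_ordinal(uint32_t thunk)
{
    return (thunk & ORDINAL_FLAG32) != 0;
}

/* The ordinal is stored in the low word of the thunk value. */
uint16_t thunk_ordinal(uint32_t thunk)
{
    return (uint16_t)(thunk & 0xFFFFu);
}

/* Step 9: walk a zero-terminated thunk array and count how many entries
   are imported by ordinal and how many by name. Returns the total. */
size_t count_imports(const uint32_t *thunks,
                     size_t *by_ordinal, size_t *by_name)
{
    size_t total = 0;
    *by_ordinal = 0;
    *by_name = 0;
    for (; thunks[total] != 0; total++) {
        if (thunk_is_ordinal(thunks[total]))
            (*by_ordinal)++;
        else
            (*by_name)++;  /* value is the RVA of an IMAGE_IMPORT_BY_NAME */
    }
    return total;
}
```

The test uses a bitwise AND rather than an equality comparison, because an ordinal import carries the flag and the ordinal in the same DWORD.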
6.7, Bound Import:
When the PE loader loads a PE file, it checks the import table and maps the associated DLLs into the process's address space. It then walks the IMAGE_THUNK_DATA array and replaces each IMAGE_THUNK_DATA value with the real address of the import function. This step takes a lot of time. If the function addresses are fixed up in advance, the PE loader does not have to fix up the IMAGE_THUNK_DATA values every time the PE file is loaded. Bound import is the product of this idea. Microsoft ships tools such as bind.exe with its Visual Studio compilers; bind.exe inspects the import table of a PE file and replaces the IMAGE_THUNK_DATA values with the real addresses of the import functions. When the file is loaded, the PE loader must check that these addresses are still valid. If the DLL version differs from the information stored in the PE file, or the DLLs have to be relocated, the loader treats the precomputed addresses as invalid and must walk the array pointed to by OriginalFirstThunk to obtain the new addresses of the import functions.
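The validity check can be pictured as a comparison of two pairs of values. The sketch below is an illustration of the rule stated above, not the loader's actual code: it uses the DLL time stamp as the version marker and assumes that relocation shows up as a load address different from the one assumed at bind time. BindCheck and bound_addresses_usable are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical summary of the values compared for one bound DLL. */
typedef struct {
    uint32_t bound_time_stamp;   /* DLL time stamp recorded at bind time */
    uint32_t actual_time_stamp;  /* time stamp of the DLL actually found */
    uint32_t assumed_base;       /* load address assumed at bind time */
    uint32_t actual_base;        /* address at which the DLL was mapped */
} BindCheck;

/* Returns 1 if the precomputed addresses can be used as they are,
   0 if the loader has to fall back to the OriginalFirstThunk array. */
int bound_addresses_usable(const BindCheck *c)
{
    return c->bound_time_stamp == c->actual_time_stamp &&
           c->assumed_base == c->actual_base;
}
```

When the function returns 0, the import addresses have to be resolved again from the unmodified array, which is why that array must be preserved in a bound file.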
7. Export Table
When the PE loader runs a program, it loads the associated DLLs into the address space of the process. It then uses the import function information of the main program to look up the real function addresses in those DLLs and patch the main program. What the PE loader searches for in the DLLs are their export functions. A PE file keeps the information about its export functions in the export table.
A DLL/EXE can export a function to other DLLs/EXEs in two ways: by name, or by ordinal only. For example, suppose a DLL wants to export a function named "GetSysConfig". If it is exported by name, other DLLs/EXEs that want to call the function must use its name, GetSysConfig. The other way is to export by ordinal. An ordinal is a 16-bit number that uniquely identifies a function within the DLL that exports it. In the example above, the DLL could export the function by ordinal instead, say 16. Other DLLs/EXEs that want to call the function must then pass that ordinal as the parameter to GetProcAddress. This is what is meant by exporting by ordinal only.
7.1 The export table is the first member of the data directory and is described by an IMAGE_EXPORT_DIRECTORY structure. Structure definition:
typedef struct _IMAGE_EXPORT_DIRECTORY {
    DWORD Characteristics;
    DWORD TimeDateStamp;
    WORD MajorVersion;
    WORD MinorVersion;
    DWORD Name;
    DWORD Base;
    DWORD NumberOfFunctions;
    DWORD NumberOfNames;
    DWORD AddressOfFunctions;     // RVA from base of image
    DWORD AddressOfNames;         // RVA from base of image
    DWORD AddressOfNameOrdinals;  // RVA from base of image
} IMAGE_EXPORT_DIRECTORY, *PIMAGE_EXPORT_DIRECTORY;
IMAGE_EXPORT_DIRECTORY members:
1. Characteristics: This field is not used and is always 0.
2. TimeDateStamp: The time at which the file was generated.
3. MajorVersion / MinorVersion: No practical use; 0.
4. Name: An RVA pointing to an ASCIIZ string (the DLL name, such as mydll.dll). This is the real name of the module. The field is needed because the file name may be changed; in that case the PE loader uses this internal name.
5. Base: The starting ordinal. Subtracting it from an ordinal gives the index into the function address array.
6. NumberOfFunctions: The total number of functions/symbols exported by the module.
7. NumberOfNames: The number of functions/symbols exported by name. This value is not the total number of functions/symbols exported by the module, which is given by NumberOfFunctions above. The field can be 0, which means the module exports by ordinal only. If the module exports no functions/symbols at all, the RVA of the export table in the data directory is 0.
8. AddressOfFunctions: The module keeps an array of RVAs pointing to all of its exported functions/symbols, and this field is the RVA of that array. In short, the addresses of all exported functions in the module are stored in one array, and this field points to the start of that array.
9. AddressOfNames: Similar to the previous field. The module keeps an array of RVAs pointing to all function names, and this field is the RVA of that array.
10. AddressOfNameOrdinals: An RVA pointing to an array of 16-bit values that hold the ordinals (indexes into the function address array) of the functions named in the AddressOfNames array above.
The export table is designed to make the PE loader's work easy. First, the module must record the addresses of all exported functions for the PE loader to query. It keeps them in the array pointed to by the AddressOfFunctions field, and the number of array elements is stored in the NumberOfFunctions field. So if the module exports 40 functions, the array pointed to by AddressOfFunctions has 40 elements and NumberOfFunctions is 40. If some functions are exported by name, the module must keep that information in the file as well. The RVAs of the names are stored in an array for the PE loader to query; AddressOfNames points to that array, and NumberOfNames holds the number of names. Now consider how the PE loader works: it knows a function name and wants the address of that function. So far the module has two arrays, the name array and the address array, with nothing linking them, so something is needed to tie each function name to its address. The PE specification uses an index into the address array as the link: when the PE loader finds a matching name in the name array, it also obtains the index of the corresponding element in the address array. These indexes are stored in a third array, pointed to by the AddressOfNameOrdinals field. Because this array links names to addresses, it must have as many elements as there are names. Each name has exactly one associated address, but the reverse does not hold: one address can have several names, which lets us give the same address aliases. For the link to work, the name array and the index array must be kept in parallel: the first element of the index array holds the index for the first name, and so on.
7.2 Given the name of an export function, its address can be obtained as follows:
1. Locate the PE header.
2. Read the virtual address of the export table from the data directory.
3. Go to the export table and get the number of names (NumberOfNames).
4. Walk the arrays pointed to by AddressOfNames and AddressOfNameOrdinals in parallel, looking for the name. If a matching name is found in the array pointed to by AddressOfNames, extract the index value from the array pointed to by AddressOfNameOrdinals. For example, if the RVA of the matching name is stored in the 77th element of the AddressOfNames array, the 77th element of the AddressOfNameOrdinals array is extracted as the index value. If all NumberOfNames elements have been examined without a match, the name is not in this module.
5. Use the value extracted from the AddressOfNameOrdinals array as an index into the AddressOfFunctions array. That is, if the value is 5, read element 5 of the AddressOfFunctions array; that value is the RVA of the function.
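The steps above can be sketched on a simplified in-memory model of the export table. In a real file the names array holds RVAs of strings; here it holds C string pointers so the example stays self-contained. ExportTable and export_rva_by_name are hypothetical names, and the index taken from the ordinal array is treated as zero-based.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified model of an export table (hypothetical type). */
typedef struct {
    uint32_t           Base;
    uint32_t           NumberOfFunctions;
    uint32_t           NumberOfNames;
    const uint32_t    *functions;      /* array behind AddressOfFunctions */
    const char *const *names;          /* array behind AddressOfNames */
    const uint16_t    *name_ordinals;  /* array behind AddressOfNameOrdinals */
} ExportTable;

/* Walk names and name_ordinals in parallel. Returns the RVA of the
   function, or 0 if the name is not exported or the index is bad. */
uint32_t export_rva_by_name(const ExportTable *t, const char *name)
{
    for (uint32_t i = 0; i < t->NumberOfNames; i++) {
        if (strcmp(t->names[i], name) == 0) {
            uint16_t index = t->name_ordinals[i];
            if (index >= t->NumberOfFunctions)
                return 0;
            return t->functions[index];
        }
    }
    return 0;
}
```

Two names can carry the same index in name_ordinals, which is how one address gets several names.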
7.3 Given only the ordinal of a function, its address can be obtained as follows:
1. Locate the PE header.
2. Read the virtual address of the export table from the data directory.
3. Go to the export table and get the Base value.
4. Subtract the Base value from the ordinal to get the index into the AddressOfFunctions array.
5. Compare the index with NumberOfFunctions. If it is greater than or equal to NumberOfFunctions, the ordinal is invalid.
6. Otherwise, use the index to read the function's RVA from the AddressOfFunctions array.
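Steps 4 to 6 can be sketched in a few lines, assuming the AddressOfFunctions array is already available in memory. export_rva_by_ordinal is a hypothetical name, and returning 0 for an invalid ordinal is a convention chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the RVA stored for the given ordinal, or 0 if the ordinal
   falls outside the range covered by the function address array. */
uint32_t export_rva_by_ordinal(const uint32_t *functions,
                               uint32_t number_of_functions,
                               uint32_t base, uint32_t ordinal)
{
    if (ordinal < base)                 /* would wrap around below zero */
        return 0;
    uint32_t index = ordinal - base;    /* step 4 */
    if (index >= number_of_functions)   /* step 5 */
        return 0;
    return functions[index];            /* step 6 */
}
```

The extra check for ordinal < base guards the unsigned subtraction, which would otherwise produce a very large index.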