JIURL PE Format Study Notes (3): Import Functions in a PE File
Author: JIURL  Home: http://jiurl.yeah.net/  Date: 2003-4-24
This part covers the import data of a PE file. We describe each structure involved in importing functions, then use an example to show how the imported functions and their related structures are laid out in a PE file and how to find them there.

1. Finding the import data in the file

1.1 Get the position of the PE header in the file.
The e_lfanew member of the DOS header structure gives the position of the PE header in the file.

1.2 Get the number of sections.
Once the position of the PE header is known, the positions of its FileHeader and OptionalHeader members are known too. The NumberOfSections member of FileHeader gives the number of sections in the file, which is also the number of elements in the section table array.

1.3 Get the position of the section table in the file.
The section table starts at the position of the PE header plus the size of the PE header structure. That size is the size of Signature, plus the size of FileHeader, plus the value of SizeOfOptionalHeader in FileHeader. SizeOfOptionalHeader is simply the size of the optional header, which normally does not vary (E0h for a 32-bit image), so the size of the whole PE header is effectively fixed. Even so, it is safer to compute it as the size of Signature plus the size of FileHeader plus SizeOfOptionalHeader than to hard-code it.

1.4 Get the position of the import data in the file.
Step 1.2 gave us the number of sections and step 1.3 the position of the section table. Now we locate the import data. Take the second element of the DataDirectory array in the OptionalHeader of the PE header; this is the import entry. Each element of DataDirectory[] is an IMAGE_DATA_DIRECTORY structure, defined as follows.
typedef struct _IMAGE_DATA_DIRECTORY {
    DWORD VirtualAddress;
    DWORD Size;
} IMAGE_DATA_DIRECTORY, *PIMAGE_DATA_DIRECTORY;

Read the VirtualAddress member of the second element of DataDirectory. This value is the RVA of the import data in memory. If this RVA is 0, the PE file has no import data.

Then walk the section table array, using the number of sections: visit every entry from 0 to (number of sections - 1). In memory, each section covers the RVA range that starts at the entry's VirtualAddress (inclusive) and ends at VirtualAddress + Misc.VirtualSize (exclusive). Check the RVA of the import data against each range; the entry whose range contains it is the section that holds the import data.

In that entry, PointerToRawData is the position of the section in the file and VirtualAddress is the RVA of the section in memory. Subtracting the section's RVA from the RVA of the import data gives the offset of the import data within the section. Adding that offset to the section's position in the file gives the position of the import data in the file. That is:

    DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress - SectionTable[i].VirtualAddress + SectionTable[i].PointerToRawData

This is the file position at which the import data starts.

2. The import data in a PE file
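Every position worked out in this part relies on the conversion from step 1.4, so a minimal sketch of it comes first. It is an illustration only, not code from the original notes: the names Section and rva_to_file_offset are hypothetical, and the one-entry section table holds sample values (including an assumed VirtualSize of 1000h) rather than values read from a real file.

```python
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    """Only the three section-table fields that step 1.4 uses."""
    virtual_address: int      # VirtualAddress
    virtual_size: int         # Misc.VirtualSize
    pointer_to_raw_data: int  # PointerToRawData


def rva_to_file_offset(rva: int, sections: Sequence[Section]) -> Optional[int]:
    """Return the file position for rva, or None if no section contains it."""
    for sec in sections:
        start = sec.virtual_address
        end = start + sec.virtual_size  # the end of the range is excluded
        if start <= rva < end:
            # offset inside the section + position of the section in the file
            return rva - start + sec.pointer_to_raw_data
    return None


# Sample table (assumed values): one section at RVA 1000h, 1000h bytes long,
# stored at position 600h in the file.
sections = [Section(0x1000, 0x1000, 0x600)]
print(hex(rva_to_file_offset(0x12DC, sections)))
```

Returning None for an RVA that no section covers keeps the "not found" case separate from a legitimate position of 0.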
To call a function exported by another PE file, what do we need? First we need to know which file the function is in, for example the function NtRaiseHardError in the PE file ntdll.dll. So we need a file name. To find the entry address of the function we also need either its name or its ordinal. Either one is enough to find the entry address (if you don't know why, see JIURL PE Format Study Notes (2): Export Functions in a PE File). So we also need a function name or an ordinal, one of the two. The import data of a PE file contains exactly these things.

We can also picture what happens when a PE file is executed: the loader loads every file named in the import data into memory, obtains the entry address of each imported function from its name or ordinal, and stores it for use while the program runs. An executable usually uses functions exported by several PE files (usually DLLs), so the import data has to describe several DLLs. (Not only DLLs can export functions, but almost every PE file that does is a DLL, so from here on we simply say DLL.)

We already have the file position at which the import data starts. The import data begins with an array of IMAGE_IMPORT_DESCRIPTOR structures. The last element of this array is all zero and marks the end of the array. Each of the other elements holds the information for one DLL. Right after the IMAGE_IMPORT_DESCRIPTOR array come several DWORD arrays, one after another. Each element of such an array holds either the RVA of a function name structure or an ordinal, and the last element of each array is zero, marking its end. After these arrays come the DLL name strings and the import function name structures. The IMAGE_IMPORT_DESCRIPTOR structure is defined in winnt.h as follows.
typedef struct _IMAGE_IMPORT_DESCRIPTOR {
    union {
        DWORD Characteristics;     // 0 for terminating null import descriptor
        DWORD OriginalFirstThunk;  // RVA to original unbound IAT (PIMAGE_THUNK_DATA)
    };
    DWORD TimeDateStamp;           // 0 if not bound,
                                   // -1 if bound, and real date/time stamp
                                   //     in IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT (new BIND)
                                   // O.W. date/time stamp of DLL bound to (Old BIND)
    DWORD ForwarderChain;          // -1 if no forwarders
    DWORD Name;
    DWORD FirstThunk;              // RVA to IAT (if bound this IAT has actual addresses)
} IMAGE_IMPORT_DESCRIPTOR;

This structure is 20 bytes long and has 5 fields. The meaning of each field is as follows.

OriginalFirstThunk: (the name Characteristics used in winnt.h is not a good fit.) This field actually holds an RVA that points to a DWORD array, which can be called the import lookup table.
Each element of that array holds either the RVA of a function name structure or the ordinal of a function.
TimeDateStamp: 0 means the file has not been bound; any other value means it has. Binding is described below.
ForwarderChain: -1 if there are no forwarders (see the comment in the definition above).
Name: an RVA that points to a null-terminated string, the file name of the DLL this structure describes.
FirstThunk: an RVA that points to a DWORD array, which can be called the import address table. If the file is bound, each element of this array is the entry address of an imported function.

Import lookup table: the DWORD array that OriginalFirstThunk points to. Each element is a DWORD. When its highest bit is 1, the low 31 bits hold an ordinal (in practice an ordinal fits in the low 16 bits). When its highest bit is 0, the element is an RVA that points to an import function name structure. The last element of the array is zero and marks the end of the array.

Import function name structure: defined in winnt.h as follows.

typedef struct _IMAGE_IMPORT_BY_NAME {
    WORD Hint;
    BYTE Name[1];
} IMAGE_IMPORT_BY_NAME, *PIMAGE_IMPORT_BY_NAME;

This structure is simple and has two members. The first is a WORD, 2 bytes long, holding a hint: the position at which the loader expects to find the function in the DLL's export name table. The second is an ASCII string, the name of the imported function. To keep the structure word-aligned, one extra \0 may follow the \0 that ends the string. For example, in 1B 01 4E 74 54 65 72 6D 69 6E 61 74 65 50 72 6F 63 65 73 73 00 00, without the last 00 the length would be 21 bytes, which is not word-aligned, so one 00 is added as padding.

Import address table: the DWORD array that FirstThunk points to. Each element is a DWORD. If the file is bound (this is decided by TimeDateStamp; 0 means not bound), each element is the entry address of an imported function.
If the file is not bound, then when the PE file is executed the loader loads each DLL, obtains the entry address of every imported function, and writes it into the matching element of the import address table. (This is my own guess; I hope I have guessed right.) The last element of the array is zero and marks the end of the array.

Binding: as the description above shows, without binding the loader has to look up the entry address of every imported function each time the PE file is executed. Binding is an optimization for this: the entry addresses are stored in the import address table ahead of time. The loader still loads the required DLLs either way; without binding it also has to do the lookups just described. In short, once loading is finished, the required DLLs (found by file name) are in memory and each element of the import address table is the entry address of an imported function.

Let's look at an example, which should make all of this clear. Our example is csrss.exe from Win2k. To guard against version differences, this PE file is included with this article. In the listing, the members of a structure are separated by /, each line is one structure, and the comments are in parentheses. You can follow along by opening the included csrss.exe in a hex editor.
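Before the example, the two decoding rules described above (one for an import lookup table element, one for the import function name structure) can be tried out in a few lines. This is only a sketch: the function names decode_lookup_entry and parse_import_by_name are hypothetical, the padding rule assumes the structure starts at an even position, and the sample bytes are the ones quoted in the text.

```python
import struct

ORDINAL_FLAG = 0x80000000  # highest bit of a 32-bit lookup table element


def decode_lookup_entry(entry: int):
    """Classify one DWORD taken from the import lookup table."""
    if entry == 0:
        return ("end", None)                     # zero element ends the array
    if entry & ORDINAL_FLAG:
        return ("ordinal", entry & 0x7FFFFFFF)   # low 31 bits, as in the text
    return ("name_rva", entry)                   # RVA of a name structure


def parse_import_by_name(data: bytes, offset: int = 0):
    """Read WORD Hint followed by a NUL-terminated ASCII name.

    Returns (hint, name, size). size counts the terminator and, when
    needed, the one padding byte that restores word alignment.
    """
    (hint,) = struct.unpack_from("<H", data, offset)
    end = data.index(b"\x00", offset + 2)
    name = data[offset + 2:end].decode("ascii")
    size = end + 1 - offset
    size += size % 2  # assumes offset itself is even
    return hint, name, size


raw = bytes.fromhex(
    "1B 01 4E 74 54 65 72 6D 69 6E 61 74 65 50 72 6F 63 65 73 73 00 00"
)
print(parse_import_by_name(raw))
print(decode_lookup_entry(0x00001344), decode_lookup_entry(0x80000010))
```

The hint is unpacked with "<H" because the file stores it least significant byte first.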
Using the method from the beginning of this article, we find that the import data starts at position 000008DCh in the file. Let's work out the file positions that OriginalFirstThunk, Name and FirstThunk of the first IMAGE_IMPORT_DESCRIPTOR refer to. The section that holds the import data (located through the second DataDirectory entry) starts at RVA 1000h in memory and at position 600h in the file. Name is an RVA with the value 0000135E, as the structure below shows (if you don't see why it is 0000135E rather than 5E130000, see the introduction to big-endian and little-endian in JIURL PE Format Study Notes (1)). The offset of the name from the start of the section is 135Eh - 1000h, and its position in the file is the section's position in the file plus that offset: 135Eh - 1000h + 600h = 95Eh. In the same way:

    OriginalFirstThunk: 1318 - 1000 + 600 = 918
    FirstThunk: 1000 - 1000 + 600 = 600

000008DC: 18 13 00 00 / FF FF FF FF / FF FF FF FF / 5E 13 00 00 / 00 10 00 00 (an IMAGE_IMPORT_DESCRIPTOR structure; each one represents a DLL. There are two IMAGE_IMPORT_DESCRIPTOR structures, so the functions this PE file imports come from two DLLs. The third one is all zero and marks the end.)
000008F0: 20 13 00 00 / FF FF FF FF / FF FF FF FF / C2 13 00 00 / 08 10 00 00 (an IMAGE_IMPORT_DESCRIPTOR structure)
00000904: 00 00 00 00 / 00 00 00 00 / 00 00 00 00 / 00 00 00 00 / 00 00 00 00 (all zero, marks the end of the IMAGE_IMPORT_DESCRIPTOR array)
00000918: 44 13 00 00 (position in the file: 1344 - 1000 + 600 = 944; points to an import function name structure)
0000091C: 00 00 00 00 (zero, end of one import lookup table)
00000920: 84 13 00 00 (position in the file: 1384 - 1000 + 600 = 984; points to an import function name structure)
00000924: 98 13 00 00 (1398 - 1000 + 600 = 998)
00000928: 6A 13 00 00 (136A - 1000 + 600 = 96A)
0000092C: AE 13 00 00 (13AE - 1000 + 600 = 9AE)
00000930: CC 13 00 00 (13CC - 1000 + 600 = 9CC)
00000934: DC 13 00 00 (13DC - 1000 + 600 = 9DC)
00000938: EE 13 00 00 (13EE - 1000 + 600 = 9EE)
0000093C: 0E 14 00 00 (140E - 1000 + 600 = A0E)
00000940: 00 00 00 00 (zero, end of one import lookup table)
00000944: 18 00 / 43 73 72 53 65 72 76 65 72 49 6E 69 74 69 61 6C 69 7A 61 74 69 6F 6E 00 (import function name structure IMAGE_IMPORT_BY_NAME; Hint is 18, Name is "CsrServerInitialization")
0000095E: 43 53 52 53 52 56 2E 64 6C 6C 00 00 (the Name of the first IMAGE_IMPORT_DESCRIPTOR points here: "CSRSRV.dll")
0000096A: 00 01 / 4E 74 53 65 74 49 6E 66 6F 72 6D 61 74 69 6F 6E 50 72 6F 63 65 73 73 00 ("NtSetInformationProcess")
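The part of the dump listed up to this point can be checked mechanically. The sketch below is an illustration, not the author's tool: the bytes are transcribed by hand from the lines above on the assumption that every field is a full 4-byte DWORD, and the helper names (at, cstring, descriptors, lookup_table, import_by_name) are hypothetical. Anything that lies past position 984h is outside the transcribed bytes and is reported as missing.

```python
import struct

BASE = 0x8DC                              # file position of the first byte
SECTION_RVA, SECTION_RAW = 0x1000, 0x600  # values used in the example

dump = bytes.fromhex(
    "18 13 00 00 FF FF FF FF FF FF FF FF 5E 13 00 00 00 10 00 00"  # 8DC
    "20 13 00 00 FF FF FF FF FF FF FF FF C2 13 00 00 08 10 00 00"  # 8F0
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"  # 904
    "44 13 00 00 00 00 00 00"                                      # 918
    "84 13 00 00 98 13 00 00 6A 13 00 00 AE 13 00 00"              # 920
    "CC 13 00 00 DC 13 00 00 EE 13 00 00 0E 14 00 00 00 00 00 00"  # 930
    "18 00 43 73 72 53 65 72 76 65 72 49 6E 69 74 69 61 6C 69 7A"  # 944
    "61 74 69 6F 6E 00"
    "43 53 52 53 52 56 2E 64 6C 6C 00 00"                          # 95E
    "00 01 4E 74 53 65 74 49 6E 66 6F 72 6D 61 74 69 6F 6E 50 72"  # 96A
    "6F 63 65 73 73 00"
)


def at(rva):
    """Index into dump of the byte an RVA refers to."""
    return rva - SECTION_RVA + SECTION_RAW - BASE


def cstring(pos):
    """NUL-terminated ASCII string starting at dump[pos]."""
    return dump[pos:dump.index(b"\x00", pos)].decode("ascii")


def descriptors():
    """Yield 5-DWORD descriptors until the all-zero terminator."""
    pos = 0
    while True:
        fields = struct.unpack_from("<5I", dump, pos)
        if not any(fields):
            return
        yield fields
        pos += 20


def lookup_table(rva):
    """Yield the elements of one lookup table until its zero terminator."""
    pos = at(rva)
    while True:
        (entry,) = struct.unpack_from("<I", dump, pos)
        if entry == 0:
            return
        yield entry
        pos += 4


def import_by_name(rva):
    """(hint, name) at rva, or None if it lies past the transcribed bytes."""
    pos = at(rva)
    if pos + 2 >= len(dump):
        return None
    (hint,) = struct.unpack_from("<H", dump, pos)
    return hint, cstring(pos + 2)


for original_first_thunk, _, _, name_rva, first_thunk in descriptors():
    in_dump = at(name_rva) < len(dump)
    print(cstring(at(name_rva)) if in_dump else "(name not in dump)",
          hex(first_thunk))
    for entry in lookup_table(original_first_thunk):
        print("   ", hex(entry), import_by_name(entry))
```

None of the lookup table elements in this part of the dump has its highest bit set, so the sketch treats every element as the RVA of a name structure.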