PE Tutorial 6: Import Table
In this tutorial we will learn about the import table. A warning first: for readers who are not familiar with the import table, this is a long and difficult lesson. You may need to read it several times, and it is best to analyze the related structures under a debugger as you go. Keep at it!
Download example.
Theory:
First of all, you should know what an import function is. An import function is a function that is called by a module but does not reside in the caller module, hence the name "import". The import functions actually reside in one or more DLLs. Only information about the functions is kept in the caller module: the function names and the names of the DLLs in which they reside. Now, how can we find out where that information is stored in the PE file? We must turn to the data directory for the answer. As a refresher, here is the PE header:
    IMAGE_NT_HEADERS STRUCT
      Signature dd ?
      FileHeader IMAGE_FILE_HEADER <>
      OptionalHeader IMAGE_OPTIONAL_HEADER <>
    IMAGE_NT_HEADERS ENDS

The last member of the optional header is the data directory:

    IMAGE_OPTIONAL_HEADER32 STRUCT
      ....
      LoaderFlags dd ?
      NumberOfRvaAndSizes dd ?
      DataDirectory IMAGE_DATA_DIRECTORY 16 dup(<>)
    IMAGE_OPTIONAL_HEADER32 ENDS
The data directory is an array of IMAGE_DATA_DIRECTORY structures, 16 members in total. If you remember that the section table can be seen as the root directory of the sections in a PE file, you can likewise think of the data directory as the root directory of the logical components stored inside those sections. To be precise, the data directory contains the locations and sizes of the important data structures in the PE file. Each member holds information about one important data structure.
    Member  Info inside
    0       Export symbols
    1       Import symbols
    2       Resources
    3       Exception
    4       Security
    5       Base relocation
    6       Debug
    7       Copyright string
    8       Unknown
    9       Thread local storage (TLS)
    10      Load configuration
    11      Bound Import
    12      Import Address Table
    13      Delay Import
    14      COM descriptor
I am familiar with only some of these members (they were highlighted in gold in the original table). Now that you know what each member of the data directory contains, we can look at them in more detail. Every member of the data directory is a structure of type IMAGE_DATA_DIRECTORY, which is defined as follows:
    IMAGE_DATA_DIRECTORY STRUCT
      VirtualAddress dd ?
      isize dd ?
    IMAGE_DATA_DIRECTORY ENDS
VirtualAddress is actually the relative virtual address (RVA) of the data structure. For example, if the entry describes the import symbols, this field contains the RVA of the IMAGE_IMPORT_DESCRIPTOR array. isize contains the size in bytes of the data structure that VirtualAddress refers to.
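As a rough illustration of how such an entry could be read, here is a minimal sketch in Python rather than the tutorial's MASM. It assumes a 32-bit (PE32) file held in a byte string. The function name `data_directory_entry` and both constants are hypothetical names chosen for this example; the sketch relies on `e_lfanew` sitting at file offset 3Ch and on DataDirectory starting 96 bytes into a PE32 optional header.

```python
import struct

IMAGE_DIRECTORY_ENTRY_IMPORT = 1   # index of "Import symbols" in the table above
PE32_DATA_DIRECTORY_OFFSET = 96    # assumption: 32-bit optional header layout

def data_directory_entry(image, index):
    """Return (VirtualAddress, isize) for one data directory member."""
    if image[:2] != b"MZ":
        raise ValueError("missing DOS signature")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Skip Signature (4 bytes) and IMAGE_FILE_HEADER (20 bytes).
    optional_header = e_lfanew + 4 + 20
    # Each IMAGE_DATA_DIRECTORY is 8 bytes: VirtualAddress, then isize.
    entry = optional_header + PE32_DATA_DIRECTORY_OFFSET + index * 8
    return struct.unpack_from("<II", image, entry)
```

The sketch does not check NumberOfRvaAndSizes, so it is only suitable for files that carry all 16 entries.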
Here is the general method for finding an important data structure in a PE file:
1. From the DOS header, go to the PE header.
2. Obtain the address of the data directory in the optional header.
3. Multiply the size of IMAGE_DATA_DIRECTORY by the index of the member you want. For example, to find the location of the import symbols, multiply the size of IMAGE_DATA_DIRECTORY (8 bytes) by 1.
4. Add the result to the address of the data directory. You now have the IMAGE_DATA_DIRECTORY structure that describes the data structure you are looking for.
Now we can start discussing the import table itself. The VirtualAddress of the second member of the data directory contains the address of the import table. The import table is actually an array of IMAGE_IMPORT_DESCRIPTOR structures. Each structure holds information about one DLL from which the PE file imports functions. For example, if the PE file imports functions from 10 different DLLs, the array has 10 members. The array is terminated by a member filled with zeros. Let us examine the structure in detail:
    IMAGE_IMPORT_DESCRIPTOR STRUCT
      union
        Characteristics dd ?
        OriginalFirstThunk dd ?
      ends
      TimeDateStamp dd ?
      ForwarderChain dd ?
      Name1 dd ?
      FirstThunk dd ?
    IMAGE_IMPORT_DESCRIPTOR ENDS
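To make the layout concrete, the following Python sketch (not the tutorial's MASM) reads five dwords per descriptor and stops at the all-zero terminator. It assumes the caller already has the array in a byte string and knows its offset. `DESCRIPTOR`, `FIELDS` and `read_import_descriptors` are hypothetical names used only for this illustration.

```python
import struct

# Five little-endian dwords, in the order of the structure above.
DESCRIPTOR = struct.Struct("<5I")
FIELDS = ("OriginalFirstThunk", "TimeDateStamp", "ForwarderChain",
          "Name1", "FirstThunk")

def read_import_descriptors(data, offset):
    """Collect descriptors from `offset` up to the member filled with zeros."""
    descriptors = []
    while True:
        values = DESCRIPTOR.unpack_from(data, offset)
        if not any(values):          # all five fields are 0: end of the array
            return descriptors
        descriptors.append(dict(zip(FIELDS, values)))
        offset += DESCRIPTOR.size    # 20 bytes per descriptor
```

If the terminator is missing, `unpack_from` raises `struct.error` at the end of the buffer instead of reading past it.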
The first member of this structure is a union. In fact, the union only provides an alias for OriginalFirstThunk, so you can also call it "Characteristics". This member contains the RVA of an array of IMAGE_THUNK_DATA structures. What is an IMAGE_THUNK_DATA? It is a union of dword size. Usually we interpret it as a pointer to an IMAGE_IMPORT_BY_NAME structure. Note that an IMAGE_THUNK_DATA contains a pointer to an IMAGE_IMPORT_BY_NAME structure, not the structure itself. Look at it this way: there are several IMAGE_IMPORT_BY_NAME structures. We collect the RVAs of those structures (the IMAGE_THUNK_DATAs) into an array, terminate the array with 0, and then put the RVA of the array into OriginalFirstThunk. Each IMAGE_IMPORT_BY_NAME structure holds information about one import function. What does an IMAGE_IMPORT_BY_NAME structure look like?
    IMAGE_IMPORT_BY_NAME STRUCT
      Hint dw ?
      Name1 db ?
    IMAGE_IMPORT_BY_NAME ENDS
Hint contains the index of the function in the export table of the DLL in which it resides. The PE loader uses this field to look up the function in the DLL's export table quickly. The value is not essential, and some linkers set it to 0. Name1 contains the name of the import function. The name is an ASCIIZ string. Note that although Name1 is declared as a single byte here, it is really a variable-sized field; there is simply no way to represent a variable-sized field in a structure. The structure is provided so that you can refer to the data with descriptive names.
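As a small illustration of that variable-sized layout, here is a Python sketch (not the tutorial's MASM) that reads the two-byte Hint and then scans for the zero byte that ends Name1. The function name `read_import_by_name` is hypothetical.

```python
import struct

def read_import_by_name(data, offset):
    """Return (hint, name) for the IMAGE_IMPORT_BY_NAME record at `offset`."""
    (hint,) = struct.unpack_from("<H", data, offset)   # Hint is a 16-bit word
    start = offset + 2                                 # Name1 follows the hint
    end = data.index(b"\0", start)                     # ASCIIZ: stop at the zero byte
    return hint, data[start:end].decode("ascii")
```

For example, a record built from the word 1Ah followed by the bytes of "ExitProcess" and a zero byte yields the hint 26 and the name "ExitProcess".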
TimeDateStamp and ForwarderChain are advanced topics; we will discuss them once you have a firm grasp of the other members.
Name1 contains the RVA of the DLL name, in other words a pointer to the name of the DLL. The name is an ASCIIZ string.
FirstThunk is very similar to OriginalFirstThunk: it also contains the RVA of an array of IMAGE_THUNK_DATA structures (a different array, of course). If you are still confused, look at it this way: there are several IMAGE_IMPORT_BY_NAME structures. You create two arrays and fill both of them with the RVAs of those IMAGE_IMPORT_BY_NAME structures, so that the two arrays contain exactly the same values (they are exact duplicates). You then assign the RVA of the first array to OriginalFirstThunk and the RVA of the second array to FirstThunk, and everything falls into place.

    OriginalFirstThunk         IMAGE_IMPORT_BY_NAME         FirstThunk
            |                                                    |
    IMAGE_THUNK_DATA    --->    Function 1    <---    IMAGE_THUNK_DATA
    IMAGE_THUNK_DATA    --->    Function 2    <---    IMAGE_THUNK_DATA
    IMAGE_THUNK_DATA    --->    Function 3    <---    IMAGE_THUNK_DATA
    IMAGE_THUNK_DATA    --->    Function 4    <---    IMAGE_THUNK_DATA
    ...                 --->    ...           <---    ...
    IMAGE_THUNK_DATA    --->    Function n    <---    IMAGE_THUNK_DATA
Now you should understand what I mean. Don't be confused by the name IMAGE_THUNK_DATA: it is just the RVA of an IMAGE_IMPORT_BY_NAME structure. If you mentally replace IMAGE_THUNK_DATA with RVA, things become easier to follow. The number of elements in the arrays pointed to by OriginalFirstThunk and FirstThunk depends on the number of functions the PE file imports from the DLL. For example, if the PE file imports 10 functions from kernel32.dll, the Name1 field of the IMAGE_IMPORT_DESCRIPTOR structure contains the RVA of the string "kernel32.dll", and each IMAGE_THUNK_DATA array has 10 elements (followed by the terminating 0).
The next question is: why do we need two identical arrays? To answer it, we need to know that when the PE file is loaded into memory, the PE loader looks at the IMAGE_THUNK_DATAs and IMAGE_IMPORT_BY_NAMEs to determine the addresses of the import functions. It then replaces the IMAGE_THUNK_DATAs in the array pointed to by FirstThunk with the real addresses of the functions. Therefore, when the PE file is ready to run, the picture above has changed to:

    OriginalFirstThunk         IMAGE_IMPORT_BY_NAME         FirstThunk
            |                                                    |
    IMAGE_THUNK_DATA    --->    Function 1            Address of Function 1
    IMAGE_THUNK_DATA    --->    Function 2            Address of Function 2
    IMAGE_THUNK_DATA    --->    Function 3            Address of Function 3
    IMAGE_THUNK_DATA    --->    Function 4            Address of Function 4
    ...                 --->    ...                   ...
    IMAGE_THUNK_DATA    --->    Function n            Address of Function n
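The replacement step can be imitated with a toy model. The Python sketch below (not the tutorial's MASM, and not how the real loader is implemented) walks the OriginalFirstThunk array and overwrites the matching FirstThunk slots. The function `bind_first_thunk`, its parameters and the `resolve` callback are hypothetical; the offsets are plain buffer offsets.

```python
import struct

def bind_first_thunk(image, oft_offset, ft_offset, resolve):
    """Overwrite each FirstThunk slot with resolve(matching OriginalFirstThunk value).

    `image` must be a bytearray; `resolve` maps a thunk value to an address.
    """
    while True:
        (thunk,) = struct.unpack_from("<I", image, oft_offset)
        if thunk == 0:                      # zero dword terminates the array
            break
        struct.pack_into("<I", image, ft_offset, resolve(thunk))
        oft_offset += 4
        ft_offset += 4

# Two identical arrays before "loading", then made-up addresses for the demo.
image = bytearray(32)
struct.pack_into("<3I", image, 0, 0x100, 0x110, 0)    # OriginalFirstThunk array
struct.pack_into("<3I", image, 16, 0x100, 0x110, 0)   # FirstThunk array
addresses = {0x100: 0x77E01000, 0x110: 0x77E02000}    # hypothetical addresses
bind_first_thunk(image, 0, 16, addresses.__getitem__)
```

After the call, the first array still holds the two RVAs while the second holds the two addresses, which is the situation shown in the second figure.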
The array of RVAs pointed to by OriginalFirstThunk never changes, so if the names of the import functions are needed later, the PE loader can still find them. Of course, this simple scheme has a complication. Some functions are exported by ordinal only, which means you cannot call them by name: you can only call them by their position. In this case there is no IMAGE_IMPORT_BY_NAME structure for the function in the caller module. Instead, the low word of the IMAGE_THUNK_DATA for that function contains the ordinal, and the most significant bit (MSB) of the IMAGE_THUNK_DATA is set to 1. For example, if a function is exported by ordinal only and its ordinal is 1234h, the IMAGE_THUNK_DATA for that function is 80001234h. Microsoft provides a handy constant for testing the MSB of a dword: IMAGE_ORDINAL_FLAG32, whose value is 80000000h.
Suppose we want to list all the import functions of a PE file. We can follow these steps:
1. Verify that the file is a valid PE.
2. From the DOS header, go to the PE header.
3. Obtain the address of the data directory in the optional header.
4. Go to the second member of the data directory and extract its VirtualAddress value.
5. Use that value to go to the first IMAGE_IMPORT_DESCRIPTOR structure.
6. Check the value of OriginalFirstThunk. If it is not 0, follow the RVA in OriginalFirstThunk to the RVA array. If OriginalFirstThunk is 0, use the value in FirstThunk instead. Some linkers generate PE files with 0 in OriginalFirstThunk, which is considered a bug. To be on the safe side, we check OriginalFirstThunk first.
7. For each element of the array, test the value against IMAGE_ORDINAL_FLAG32. If the most significant bit of the value is 1, the function is imported by ordinal and the ordinal can be extracted from the low word of the value.
8. If the most significant bit of the value is 0, use the value as an RVA into the IMAGE_IMPORT_BY_NAME structure, skip Hint, and you are at the function name.
9. Move on to the next array element and extract names until the end of the array is reached (it is terminated by a null element). We have now listed the functions imported from one DLL and can proceed to the next DLL.
10. Go to the next IMAGE_IMPORT_DESCRIPTOR and process it in the same way. Repeat until the end of the array is reached (the IMAGE_IMPORT_DESCRIPTOR array is terminated by an element with zeros in all its fields).
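Steps 7 to 9 can be sketched as follows, in Python rather than the tutorial's MASM. The generator `walk_thunks` is a hypothetical helper. It only classifies the entries of a 32-bit thunk array and leaves the lookup of the name behind each RVA to the caller.

```python
import struct

IMAGE_ORDINAL_FLAG32 = 0x80000000

def walk_thunks(data, offset):
    """Yield ("ordinal", n) or ("name_rva", rva) until the zero terminator."""
    while True:
        (thunk,) = struct.unpack_from("<I", data, offset)
        if thunk == 0:                          # end of the array
            return
        if thunk & IMAGE_ORDINAL_FLAG32:        # MSB set: imported by ordinal
            yield ("ordinal", thunk & 0xFFFF)   # the ordinal is the low word
        else:                                   # MSB clear: RVA of IMAGE_IMPORT_BY_NAME
            yield ("name_rva", thunk)
        offset += 4
```

Note that the test is a bitwise AND with IMAGE_ORDINAL_FLAG32, not an equality comparison: 80001234h has the flag set but is not equal to 80000000h.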
Example:
This example opens a PE file, displays the values of each IMAGE_IMPORT_DESCRIPTOR structure, and lists the names of all import functions in an edit control.
    .386
    .model flat,stdcall
    option casemap:none
    include \masm32\include\windows.inc
    include \masm32\include\kernel32.inc
    include \masm32\include\comdlg32.inc
    include \masm32\include\user32.inc
    includelib \masm32\lib\user32.lib
    includelib \masm32\lib\kernel32.lib
    includelib \masm32\lib\comdlg32.lib

    IDD_MAINDLG equ 101
    IDC_EDIT equ 1000
    IDM_OPEN equ 40001
    IDM_EXIT equ 40003

    DlgProc proto :DWORD,:DWORD,:DWORD,:DWORD
    ShowImportFunctions proto :DWORD
    ShowTheFunctions proto :DWORD,:DWORD
    AppendText proto :DWORD,:DWORD

    SEH struct
      PrevLink dd ?        ; the address of the previous seh structure
      CurrentHandler dd ?  ; the address of the new exception handler
      SafeOffset dd ?      ; the offset where it's safe to continue execution
      PrevEsp dd ?         ; the old value in esp
      PrevEbp dd ?         ; the old value in ebp
    SEH ends

    .data
    AppName db "PE tutorial no.6",0
    ofn OPENFILENAME <>
    FilterString db "Executable Files (*.exe, *.dll)",0,"*.exe;*.dll",0
                 db "All Files",0,"*.*",0,0
    FileOpenError db "Cannot open the file for reading",0
    FileOpenMappingError db "Cannot open the file for memory mapping",0
    FileMappingError db "Cannot map the file into memory",0
    NotValidPE db "This file is not a valid PE",0
    CRLF db 0Dh,0Ah,0
    ImportDescriptor db 0Dh,0Ah,"================[ IMAGE_IMPORT_DESCRIPTOR ]=============",0
    IDTemplate db "OriginalFirstThunk = %lX",0Dh,0Ah
               db "TimeDateStamp = %lX",0Dh,0Ah
               db "ForwarderChain = %lX",0Dh,0Ah
               db "Name = %s",0Dh,0Ah
               db "FirstThunk = %lX",0
    NameHeader db 0Dh,0Ah,"Hint Function",0Dh,0Ah
               db "-----------------------------------------",0
    NameTemplate db "%u %s",0
    OrdinalTemplate db "%u (ord.)",0

    .data?
    buffer db 512 dup(?)
    hFile dd ?
    hMapping dd ?
    pMapping dd ?
    ValidPE dd ?

    .code
    start:
      invoke GetModuleHandle,NULL
      invoke DialogBoxParam, eax, IDD_MAINDLG, NULL, addr DlgProc, 0
      invoke ExitProcess, 0

    DlgProc proc hDlg:DWORD, uMsg:DWORD, wParam:DWORD, lParam:DWORD
      .if uMsg==WM_INITDIALOG
        invoke SendDlgItemMessage,hDlg,IDC_EDIT,EM_SETLIMITTEXT,0,0
      .elseif uMsg==WM_CLOSE
        invoke EndDialog,hDlg,0
      .elseif uMsg==WM_COMMAND
        .if lParam==0
          mov eax,wParam
          .if ax==IDM_OPEN
            invoke ShowImportFunctions,hDlg
          .else ; IDM_EXIT
            invoke SendMessage,hDlg,WM_CLOSE,0,0
          .endif
        .endif
      .else
        mov eax,FALSE
        ret
      .endif
      mov eax,TRUE
      ret
    DlgProc endp

    ; Exception handler: resume at SafeOffset and flag the file as invalid.
    SEHHandler proc uses edx pExcept:DWORD, pFrame:DWORD, pContext:DWORD, pDispatch:DWORD
      mov edx,pFrame
      assume edx:ptr SEH
      mov eax,pContext
      assume eax:ptr CONTEXT
      push [edx].SafeOffset
      pop [eax].regEip
      push [edx].PrevEsp
      pop [eax].regEsp
      push [edx].PrevEbp
      pop [eax].regEbp
      mov ValidPE, FALSE
      mov eax,ExceptionContinueExecution
      ret
    SEHHandler endp

    ShowImportFunctions proc uses edi hDlg:DWORD
      LOCAL seh:SEH
      mov ofn.lStructSize,SIZEOF ofn
      mov ofn.lpstrFilter, OFFSET FilterString
      mov ofn.lpstrFile, OFFSET buffer
      mov ofn.nMaxFile,512
      mov ofn.Flags, OFN_FILEMUSTEXIST or OFN_PATHMUSTEXIST or OFN_LONGNAMES or OFN_EXPLORER or OFN_HIDEREADONLY
      invoke GetOpenFileName, ADDR ofn
      .if eax==TRUE
        invoke CreateFile, addr buffer, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL
        .if eax!=INVALID_HANDLE_VALUE
          mov hFile, eax
          invoke CreateFileMapping, hFile, NULL, PAGE_READONLY,0,0,0
          .if eax!=NULL
            mov hMapping, eax
            invoke MapViewOfFile,hMapping,FILE_MAP_READ,0,0,0
            .if eax!=NULL
              mov pMapping,eax
              ; install the SEH frame before touching the mapped file
              assume fs:nothing
              push fs:[0]
              pop seh.PrevLink
              mov seh.CurrentHandler,offset SEHHandler
              mov seh.SafeOffset,offset FinalExit
              lea eax,seh
              mov fs:[0], eax
              mov seh.PrevEsp,esp
              mov seh.PrevEbp,ebp
              mov edi, pMapping
              assume edi:ptr IMAGE_DOS_HEADER
              .if [edi].e_magic==IMAGE_DOS_SIGNATURE
                add edi, [edi].e_lfanew
                assume edi:ptr IMAGE_NT_HEADERS
                .if [edi].Signature==IMAGE_NT_SIGNATURE
                  mov ValidPE, TRUE
                .else
                  mov ValidPE, FALSE
                .endif
              .else
                mov ValidPE,FALSE
              .endif
    FinalExit:
              push seh.PrevLink
              pop fs:[0]
              .if ValidPE==TRUE
                invoke ShowTheFunctions, hDlg, edi
              .else
                invoke MessageBox,0, addr NotValidPE, addr AppName, MB_OK+MB_ICONERROR
              .endif
              invoke UnmapViewOfFile, pMapping
            .else
              invoke MessageBox, 0, addr FileMappingError, addr AppName, MB_OK+MB_ICONERROR
            .endif
            invoke CloseHandle,hMapping
          .else
            invoke MessageBox, 0, addr FileOpenMappingError, addr AppName, MB_OK+MB_ICONERROR
          .endif
          invoke CloseHandle, hFile
        .else
          invoke MessageBox, 0, addr FileOpenError, addr AppName, MB_OK+MB_ICONERROR
        .endif
      .endif
      ret
    ShowImportFunctions endp

    AppendText proc hDlg:DWORD,pText:DWORD
      invoke SendDlgItemMessage,hDlg,IDC_EDIT,EM_REPLACESEL,0,pText
      invoke SendDlgItemMessage,hDlg,IDC_EDIT,EM_REPLACESEL,0,addr CRLF
      invoke SendDlgItemMessage,hDlg,IDC_EDIT,EM_SETSEL,-1,0
      ret
    AppendText endp

    RVAToOffset PROC uses edi esi edx ecx pFileMap:DWORD,RVA:DWORD
      mov esi,pFileMap
      assume esi:ptr IMAGE_DOS_HEADER
      add esi,[esi].e_lfanew
      assume esi:ptr IMAGE_NT_HEADERS
      mov edi,RVA ; edi == RVA
      mov edx,esi
      add edx,sizeof IMAGE_NT_HEADERS
      mov cx,[esi].FileHeader.NumberOfSections
      movzx ecx,cx
      assume edx:ptr IMAGE_SECTION_HEADER
      .while ecx>0 ; check all sections
        .if edi>=[edx].VirtualAddress
          mov eax,[edx].VirtualAddress
          add eax,[edx].SizeOfRawData
          .if edi<eax ; the address is in this section
            mov eax,[edx].VirtualAddress
            sub edi,eax
            mov eax,[edx].PointerToRawData
            add eax,edi ; eax == file offset
            ret
          .endif
        .endif
        add edx,sizeof IMAGE_SECTION_HEADER
        dec ecx
      .endw
      assume edx:nothing
      assume esi:nothing
      mov eax,edi
      ret
    RVAToOffset endp

    ShowTheFunctions proc uses esi ecx ebx hDlg:DWORD, pNTHdr:DWORD
      LOCAL temp[512]:BYTE
      invoke SetDlgItemText,hDlg,IDC_EDIT,0
      invoke AppendText,hDlg,addr buffer
      mov edi,pNTHdr
      assume edi:ptr IMAGE_NT_HEADERS
      mov edi, [edi].OptionalHeader.DataDirectory[sizeof IMAGE_DATA_DIRECTORY].VirtualAddress
      invoke RVAToOffset,pMapping,edi
      mov edi,eax
      add edi,pMapping
      assume edi:ptr IMAGE_IMPORT_DESCRIPTOR
      .while !([edi].OriginalFirstThunk==0 && [edi].TimeDateStamp==0 && [edi].ForwarderChain==0 && [edi].Name1==0 && [edi].FirstThunk==0)
        invoke AppendText,hDlg,addr ImportDescriptor
        invoke RVAToOffset,pMapping, [edi].Name1
        mov edx,eax
        add edx,pMapping
        invoke wsprintf, addr temp, addr IDTemplate, [edi].OriginalFirstThunk, [edi].TimeDateStamp, [edi].ForwarderChain, edx, [edi].FirstThunk
        invoke AppendText,hDlg,addr temp
        .if [edi].OriginalFirstThunk==0
          mov esi,[edi].FirstThunk
        .else
          mov esi,[edi].OriginalFirstThunk
        .endif
        invoke RVAToOffset,pMapping,esi
        add eax,pMapping
        mov esi,eax
        invoke AppendText,hDlg,addr NameHeader
        .while dword ptr [esi]!=0
          test dword ptr [esi],IMAGE_ORDINAL_FLAG32
          jnz ImportByOrdinal
          invoke RVAToOffset,pMapping,dword ptr [esi]
          mov edx,eax
          add edx,pMapping
          assume edx:ptr IMAGE_IMPORT_BY_NAME
          mov cx, [edx].Hint
          movzx ecx,cx
          invoke wsprintf,addr temp,addr NameTemplate,ecx,addr [edx].Name1
          jmp ShowTheText
    ImportByOrdinal:
          mov edx,dword ptr [esi]
          and edx,0FFFFh
          invoke wsprintf,addr temp,addr OrdinalTemplate,edx
    ShowTheText:
          invoke AppendText,hDlg,addr temp
          add esi,4
        .endw
        add edi,sizeof IMAGE_IMPORT_DESCRIPTOR
      .endw
      ret
    ShowTheFunctions endp
    end start

Analysis:
When the user chooses Open from the menu, the program displays the open file dialog box. It verifies that the selected file is a valid PE and then calls ShowTheFunctions.

    ShowTheFunctions proc uses esi ecx ebx hDlg:DWORD, pNTHdr:DWORD
      LOCAL temp[512]:BYTE

Reserve 512 bytes of stack space for string operations.

    invoke SetDlgItemText,hDlg,IDC_EDIT,0

Clear the contents of the edit control.

    invoke AppendText,hDlg,addr buffer

Insert the name of the PE file into the edit control. AppendText appends text by sending an EM_REPLACESEL message to the edit control. It then sends an EM_SETSEL message with wParam = -1 and lParam = 0 to the edit control, which moves the cursor to the end of the text.
    mov edi,pNTHdr
    assume edi:ptr IMAGE_NT_HEADERS
    mov edi, [edi].OptionalHeader.DataDirectory[sizeof IMAGE_DATA_DIRECTORY].VirtualAddress

Obtain the RVA of the import symbols. edi initially points to the PE header; we use it to reach the second member of the data directory array and read its VirtualAddress value.

    invoke RVAToOffset,pMapping,edi
    mov edi,eax
    add edi,pMapping

This part may be a little difficult for newcomers to PE programming. Most of the addresses in a PE file are RVAs, and RVAs are meaningful only when the PE file has been loaded into memory by the PE loader. In this example we map the file into memory ourselves instead of going through the PE loader, so we cannot use those RVAs directly. They must be converted into file offsets, and that is the job of the RVAToOffset function. I won't analyze it in detail here. Suffice it to say that it compares the given RVA with the starting and ending RVAs of each section in the PE file (which also checks that the RVA is valid), and then uses the PointerToRawData field of the IMAGE_SECTION_HEADER structure of the section containing the RVA to convert the RVA into a file offset. The function takes two parameters: the pointer to the memory-mapped file and the RVA to convert. It returns the file offset in eax. In the snippet above, we add the pointer to the memory-mapped file to the file offset to obtain a virtual address. A bit complicated, isn't it? :)

    assume edi:ptr IMAGE_IMPORT_DESCRIPTOR
    .while !([edi].OriginalFirstThunk==0 && [edi].TimeDateStamp==0 && [edi].ForwarderChain==0 && [edi].Name1==0 && [edi].FirstThunk==0)

edi now points to the first IMAGE_IMPORT_DESCRIPTOR structure. We walk the array until we find a structure with zeros in all its members, which marks the end of the array.

    invoke AppendText,hDlg,addr ImportDescriptor
    invoke RVAToOffset,pMapping, [edi].Name1
    mov edx,eax
    add edx,pMapping

We want to display the values of the current IMAGE_IMPORT_DESCRIPTOR structure. Name1 differs from the other members in that it contains the RVA of the name of the DLL, so it must first be converted into a virtual address.

    invoke wsprintf, addr temp, addr IDTemplate, [edi].OriginalFirstThunk, [edi].TimeDateStamp, [edi].ForwarderChain, edx, [edi].FirstThunk
    invoke AppendText,hDlg,addr temp

Display the values of the current IMAGE_IMPORT_DESCRIPTOR structure.

    .if [edi].OriginalFirstThunk==0
      mov esi,[edi].FirstThunk
    .else
      mov esi,[edi].OriginalFirstThunk
    .endif

Next we prepare to walk the IMAGE_THUNK_DATA array. Normally we would use the array pointed to by OriginalFirstThunk. However, some linkers incorrectly put 0 in OriginalFirstThunk, so we check whether its value is 0. If it is, we use the array pointed to by FirstThunk instead.

    invoke RVAToOffset,pMapping,esi
    add eax,pMapping
    mov esi,eax

Again, the value in OriginalFirstThunk/FirstThunk is an RVA. It must be converted into a virtual address.

    invoke AppendText,hDlg,addr NameHeader
    .while dword ptr [esi]!=0

Now we are ready to walk the array of IMAGE_THUNK_DATAs to find the names of the functions imported from this DLL, until we reach the element that is all zeros.

    test dword ptr [esi],IMAGE_ORDINAL_FLAG32
    jnz ImportByOrdinal

The first thing to do is test the IMAGE_THUNK_DATA against IMAGE_ORDINAL_FLAG32, which checks whether its most significant bit is 1. If it is, the function is imported by ordinal, so there is no IMAGE_IMPORT_BY_NAME structure to process. We extract the ordinal from the low word of the IMAGE_THUNK_DATA and move on to the next IMAGE_THUNK_DATA dword.

    invoke RVAToOffset,pMapping,dword ptr [esi]
    mov edx,eax
    add edx,pMapping
    assume edx:ptr IMAGE_IMPORT_BY_NAME

If the MSB of the IMAGE_THUNK_DATA is 0, it contains the RVA of an IMAGE_IMPORT_BY_NAME structure. We need to convert it into a virtual address first.

    mov cx, [edx].Hint
    movzx ecx,cx
    invoke wsprintf,addr temp,addr NameTemplate,ecx,addr [edx].Name1
    jmp ShowTheText

Hint is a word-sized field, so we must extend it to a dword before passing it to wsprintf. We then display both the hint and the function name in the edit control.

    ImportByOrdinal:
    mov edx,dword ptr [esi]
    and edx,0FFFFh
    invoke wsprintf,addr temp,addr OrdinalTemplate,edx

If the function is imported by ordinal only, we zero out the high word and display the ordinal.

    ShowTheText:
    invoke AppendText,hDlg,addr temp
    add esi,4

After inserting the function name or ordinal into the edit control, we move on to the next IMAGE_THUNK_DATA.

    .endw
    add edi,sizeof IMAGE_IMPORT_DESCRIPTOR

When all the dwords in the current IMAGE_THUNK_DATA array have been processed, we move on to the next IMAGE_IMPORT_DESCRIPTOR to process the import functions of the other DLLs.

Appendix:
Let's talk a little about bound imports. When the PE loader loads a PE file, it examines the import table and maps the required DLLs into the process address space. It then walks the IMAGE_THUNK_DATA array and replaces the IMAGE_THUNK_DATAs with the real addresses of the import functions. This step takes time. If the programmer could somehow predict the function addresses correctly in advance, the PE loader would not have to fix up the IMAGE_THUNK_DATAs each time the PE file is loaded. Bound import is the product of that idea. For convenience, Microsoft ships a tool named bind.exe with its compilers, such as Visual Studio. It examines the import table of a PE file and replaces the IMAGE_THUNK_DATAs with the real addresses of the import functions. When the file is loaded, the PE loader must check whether those addresses are valid. If the DLL versions do not match the information stored in the PE file, or if the DLLs need to be relocated, the loader knows that the precomputed addresses are invalid and must walk the array pointed to by OriginalFirstThunk to calculate the new addresses of the import functions. Bound import does not play an important role in this lesson, because we default to OriginalFirstThunk. For more information, see LUEVELSMEYER's pe.txt.
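Returning to the RVAToOffset routine from the analysis: its logic can be sketched in Python (not the tutorial's MASM) as follows. The sketch assumes the section headers have already been reduced to (VirtualAddress, SizeOfRawData, PointerToRawData) tuples; the function name `rva_to_offset` and that tuple layout are hypothetical choices for this illustration.

```python
def rva_to_offset(sections, rva):
    """Convert an RVA into a file offset using the section table.

    `sections` is an iterable of
    (VirtualAddress, SizeOfRawData, PointerToRawData) tuples.
    """
    for virtual_address, size_of_raw_data, pointer_to_raw_data in sections:
        # Is the RVA inside this section's raw data?
        if virtual_address <= rva < virtual_address + size_of_raw_data:
            return rva - virtual_address + pointer_to_raw_data
    # Not inside any section: return the RVA unchanged, as the MASM routine does.
    return rva
```

With a section at RVA 2000h whose raw data starts at file offset 600h, the RVA 2010h maps to the file offset 610h. Adding the base address of the mapped file to that offset gives the pointer used in the listing.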