These are some notes I made while learning the PE format. Summary of the analysis:
(* 1 *): Write the program. It has two files, a.cpp and foo.cpp. The content of a.cpp is:
extern void foo();
void main() { foo(); }
The content of foo.cpp is:
#include "stdio.h"
void foo() { printf("I am foo!"); }
The compiler generates a.obj, foo.obj and a.exe.
(* 2 *): Copy the three files above to the ../visualstdio/vc98/bin directory and dump a.obj with the following command:
dumpbin /all a.obj > aobj.txt
(* 3 *): Open the dump of a.obj and find the code section. It looks like this:
SECTION HEADER #3
   .text name
       0 physical address
       0 virtual address
      2E size of raw data
     355 file pointer to raw data        // Attention!
     383 file pointer to relocation table
     397 file pointer to line numbers
       2 number of relocations
       3 number of line numbers
60501020 flags
         Code
         Communal; sym= _main
         16 byte align
         Execute Read
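The flags value 60501020 is a bit mask, and the names dumpbin prints under it can be recovered from it. The following is a minimal sketch, assuming the section characteristic bits defined by the PE/COFF format; the constant and function names (kCntCode, sectionAlignment, describeFlags) are made up for this note and are not part of any SDK header.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Section characteristic bits from the PE/COFF format (names are local to this sketch).
constexpr std::uint32_t kCntCode    = 0x00000020; // section contains code
constexpr std::uint32_t kLnkComdat  = 0x00001000; // COMDAT, shown as "Communal" in the dump
constexpr std::uint32_t kAlignMask  = 0x00F00000; // alignment field (object files only)
constexpr std::uint32_t kMemExecute = 0x20000000; // section can be executed
constexpr std::uint32_t kMemRead    = 0x40000000; // section can be read

// Alignment in bytes encoded in the flags, or 0 when the field is empty.
// The field stores n, which stands for 2^(n-1) bytes.
std::uint32_t sectionAlignment(std::uint32_t flags) {
    std::uint32_t n = (flags & kAlignMask) >> 20;
    return n == 0 ? 0 : (1u << (n - 1));
}

// Return the flag names in the order used by the dump above.
std::vector<std::string> describeFlags(std::uint32_t flags) {
    std::vector<std::string> out;
    if (flags & kCntCode)    out.push_back("Code");
    if (flags & kLnkComdat)  out.push_back("Communal");
    if (std::uint32_t a = sectionAlignment(flags))
        out.push_back(std::to_string(a) + " byte align");
    if (flags & kMemExecute) out.push_back("Execute");
    if (flags & kMemRead)    out.push_back("Read");
    return out;
}
```

Calling describeFlags(0x60501020) should give the same five entries that dumpbin lists for section #3. Only the bits that occur in this dump are handled; other characteristic bits are ignored by the sketch.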
RAW DATA #3
00000000: 55 8B EC 83 EC 40 53 56 57 8D 7D C0 B9 10 00 00  U....@SVW.}.....
00000010: 00 B8 CC CC CC CC F3 AB E8 00 00 00 00 5F 5E 5B  ............._^[
00000020: 83 C4 40 3B EC E8 00 00 00 00 8B E5 5D C3        ..@;........].
(* 4 *): Disassemble the code section of a.obj. Open URSoft W32Dasm (I use version 8.93) and select "All Files" in the open dialog. The tool is mainly meant for file formats such as PE, LE and NE, so for an OBJ file the user has to specify the offset at which disassembly starts. That offset is the value marked "Attention" above, that is, the file offset of the code section. In the prompt dialog that appears when opening the OBJ file, enter 00000355 (Start Disassembly from offset 00000355 hex). There is no need to select "Check for 16 Bit Disassembly". The disassembled code section is as follows:
:00000000 55          push ebp
:00000001 8BEC        mov ebp, esp
:00000003 83EC40      sub esp, 00000040
:00000006 53          push ebx
:00000007 56          push esi
:00000008 57          push edi
:00000009 8D7DC0      lea edi, dword ptr [ebp-40]
:0000000C B910000000  mov ecx, 00000010
:00000011 B8CCCCCCCC  mov eax, CCCCCCCC
:00000016 F3          repz
:00000017 AB          stosd
:00000018 E800000000  call 0000001D        // Attention!
:0000001D 5F          pop edi
:0000001E 5E          pop esi
:0000001F 5B          pop ebx
:00000020 83C440      add esp, 00000040
:00000023 3BEC        cmp ebp, esp
:00000025 E800000000  call 0000002A
:0000002A 8BE5        mov esp, ebp
:0000002C 5D          pop ebp
:0000002D C3          ret
Note: the 0xE8 is the CALL instruction opcode. The next DWORD should contain the offset to the foo function (relative to the CALL instruction). It's pretty clear that foo probably isn't zero bytes away from the CALL instruction. Simply put, this code would not work as expected if you were to execute it. The code is broken and needs to be fixed up. In the above example of a call to function foo, there will be a REL32 fixup record, and it will have the offset of the DWORD that the linker needs to overwrite with the appropriate value.
(* 5 *): View the relocations:
RELOCATIONS #3
                                   Symbol  Symbol
Offset    Type        Applied To   Index   Name
--------  ----------  ----------   ------  ------
00000019  REL32         00000000       12  ?foo@@YAXXZ (void __cdecl foo(void))
00000026  REL32         00000000       13  __chkesp
The first fixup record says that the linker needs to calculate the relative offset to function foo and write that value to offset 0x19 in the section, which is the DWORD that follows the E8 opcode at offset 0x18.
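What the linker does with these two records can be sketched in a few lines. This is only an illustration under stated assumptions, not the linker's actual code: the struct Rel32Fixup and the functions applyRel32 and aObjText are hypothetical names, the section is assumed to be placed at 00401000, and the target addresses 00401030 (foo) and 00401070 (__chkesp) are the ones that appear in the a.exe listing in step 6.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One REL32 record after symbol resolution (hypothetical layout for this sketch).
struct Rel32Fixup {
    std::uint32_t offset;        // offset of the DWORD to patch inside the section
    std::uint32_t targetAddress; // final address of the referenced symbol
};

// Overwrite each fixup site with: target + old value - address of the next instruction.
// The old value is the "Applied To" column of the dump, which is 0 here.
void applyRel32(std::vector<std::uint8_t>& section,
                std::uint32_t sectionAddress,
                const std::vector<Rel32Fixup>& fixups) {
    for (const Rel32Fixup& f : fixups) {
        if (f.offset + 4u > section.size()) continue; // skip out-of-range records
        std::uint32_t addend = 0;
        for (int i = 0; i < 4; ++i)
            addend |= static_cast<std::uint32_t>(section[f.offset + i]) << (8 * i);
        std::uint32_t next = sectionAddress + f.offset + 4; // byte after the DWORD
        std::uint32_t rel = f.targetAddress + addend - next; // wraps for backward calls
        for (int i = 0; i < 4; ++i) // store least significant byte first
            section[f.offset + i] = static_cast<std::uint8_t>(rel >> (8 * i));
    }
}

// The 0x2E bytes of RAW DATA #3 from a.obj, before any fixup.
std::vector<std::uint8_t> aObjText() {
    return {
        0x55, 0x8B, 0xEC, 0x83, 0xEC, 0x40, 0x53, 0x56,
        0x57, 0x8D, 0x7D, 0xC0, 0xB9, 0x10, 0x00, 0x00,
        0x00, 0xB8, 0xCC, 0xCC, 0xCC, 0xCC, 0xF3, 0xAB,
        0xE8, 0x00, 0x00, 0x00, 0x00, 0x5F, 0x5E, 0x5B,
        0x83, 0xC4, 0x40, 0x3B, 0xEC, 0xE8, 0x00, 0x00,
        0x00, 0x00, 0x8B, 0xE5, 0x5D, 0xC3
    };
}
```

Applying the records {0x19, 0x00401030} and {0x26, 0x00401070} to aObjText() with a section address of 0x00401000 should turn the two empty DWORDs into 13 00 00 00 and 46 00 00 00, the same bytes the a.exe listing shows after each E8.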
(* 6 *): The real code section of a.exe:
:00401000 55          push ebp
:00401001 8BEC        mov ebp, esp
:00401003 83EC40      sub esp, 00000040
:00401006 53          push ebx
:00401007 56          push esi
:00401008 57          push edi
:00401009 8D7DC0      lea edi, dword ptr [ebp-40]
:0040100C B910000000  mov ecx, 00000010
:00401011 B8CCCCCCCC  mov eax, CCCCCCCC
:00401016 F3          repz
:00401017 AB          stosd
:00401018 E813000000  call 00401030
:0040101D 5F          pop edi
:0040101E 5E          pop esi
:0040101F 5B          pop ebx
:00401020 83C440      add esp, 00000040
:00401023 3BEC        cmp ebp, esp
:00401025 E846000000  call 00401070
:0040102A 8BE5        mov esp, ebp
:0040102C 5D          pop ebp
:0040102D C3          ret
:0040102E CC          int 03
:0040102F CC          int 03
// Nothing is omitted in between.
* Referenced by a CALL at Address:
|:00401018
|
:00401030 55          push ebp
:00401031 8BEC        mov ebp, esp
:00401033 83EC40      sub esp, 00000040
:00401036 53          push ebx
:00401037 56          push esi
:00401038 57          push edi
:00401039 8D7DC0      lea edi, dword ptr [ebp-40]
:0040103C B910000000  mov ecx, 00000010
:00401041 B8CCCCCCCC  mov eax, CCCCCCCC
:00401046 F3          repz
:00401047 AB          stosd
:00401048 68ECC04000  push 0040C0EC
:0040104D E85E000000  call 004010B0
:00401052 83C404      add esp, 00000004
:00401055 5F          pop edi
:00401056 5E          pop esi
:00401057 5B          pop ebx
:00401058 83C440      add esp, 00000040
:0040105B 3BEC        cmp ebp, esp
:0040105D E80E000000  call 00401070
:00401062 8BE5        mov esp, ebp
:00401064 5D          pop ebp
:00401065 C3          ret
(* 7 *): Look at the contents of foo.obj. (fooobj.txt gives the file offset of its code section as 0x000003BF, so disassemble from that offset with W32Dasm.)
:00000000 55          push ebp
:00000001 8BEC        mov ebp, esp
:00000003 83EC40      sub esp, 00000040
:00000006 53          push ebx
:00000007 56          push esi
:00000008 57          push edi
:00000009 8D7DC0      lea edi, dword ptr [ebp-40]
:0000000C B910000000  mov ecx, 00000010
:00000011 B8CCCCCCCC  mov eax, CCCCCCCC
:00000016 F3          repz
:00000017 AB          stosd
:00000018 6800000000  push 00000000
:0000001D E800000000  call 00000022
:00000022 83C404      add esp, 00000004
:00000025 5F          pop edi
:00000026 5E          pop esi
:00000027 5B          pop ebx
:00000028 83C440      add esp, 00000040
:0000002B 3BEC        cmp ebp, esp
:0000002D E800000000  call 00000032
:00000032 8BE5        mov esp, ebp
:00000034 5D          pop ebp
:00000035 C3          ret
In essence: when the linker combines the compilation units (.obj), it adjusts the data in a.obj and foo.obj that need to be fixed up, such as the location of the foo function referenced in a.obj, namely:
:00000018 E800000000  call 0000001D        // Attention!
RAW DATA #3
00000000: 55 8B EC 83 EC 40 53 56 57 8D 7D C0 B9 10 00 00  U....@SVW.}.....
00000010: 00 B8 CC CC CC CC F3 AB E8 00 00 00 00 5F 5E 5B  ............._^[
00000020: 83 C4 40 3B EC E8 00 00 00 00 8B E5 5D C3        ..@;........].
This goes together with the relocation record:
RELOCATIONS #3
                                   Symbol  Symbol
Offset    Type        Applied To   Index   Name
--------  ----------  ----------   ------  ------
00000019  REL32         00000000       12  ?foo@@YAXXZ (void __cdecl foo(void))
At link time the linker merges the code sections and places the code of foo.obj after the code section of a.obj, as follows:
:00401000 55          push ebp
....
:00401018 E813000000  call 00401030
....
:0040102D C3          ret
:0040102E CC          int 03
:0040102F CC          int 03
// Nothing is omitted in between.
* Referenced by a CALL at Address:
|:00401018
|
:00401030 55          push ebp
....
:00401065 C3          ret
The 00400000 part of the address in call 00401030 is the preferred load base of the image. The 13000000 in E813000000 is the offset value, which is really 00000013. This is a peculiarity of Intel processors: numerical data is stored in reverse byte order compared to character data. For example, when opcode A1 (MOV EAX) takes the 32-bit value 56A700FE as its operand (for A1 this operand is the memory address to read from), the opcode is followed by the bytes FE 00 A7 56:
A1 FE 00 A7 56
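The byte order described above can be checked with a small sketch. The helper names toLittleEndian and fromLittleEndian are made up for this note; the code uses shifts, so it gives the same result regardless of the byte order of the machine it runs on.

```cpp
#include <array>
#include <cstdint>

// Split a 32-bit value into the four bytes in the order x86 stores them:
// least significant byte first.
std::array<std::uint8_t, 4> toLittleEndian(std::uint32_t value) {
    return {{ static_cast<std::uint8_t>(value),
              static_cast<std::uint8_t>(value >> 8),
              static_cast<std::uint8_t>(value >> 16),
              static_cast<std::uint8_t>(value >> 24) }};
}

// Rebuild the 32-bit value from four bytes read in file or memory order.
std::uint32_t fromLittleEndian(const std::array<std::uint8_t, 4>& b) {
    return  static_cast<std::uint32_t>(b[0])
         | (static_cast<std::uint32_t>(b[1]) << 8)
         | (static_cast<std::uint32_t>(b[2]) << 16)
         | (static_cast<std::uint32_t>(b[3]) << 24);
}
```

With these helpers, toLittleEndian(0x00000013) should give the bytes 13 00 00 00 seen after E8, and toLittleEndian(0x56A700FE) the bytes FE 00 A7 56 seen after A1.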
The call at 00401018 jumps to 00401030, and this is encoded as E813000000. Calculating by hand: the CALL instruction itself occupies 5 bytes (1 for the opcode E8, the other 4 for the offset value), and 0040101D - 00401018 = 5, so the offset is actually counted from 0040101D, the address of the next instruction. Therefore 00401030 - 0040101D = 13 (hex), and the generated CALL instruction is E813000000. By means of software: I checked this with OPcoder, written by Cool McCool, and it worked perfectly.
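The hand calculation can also be run in the other direction, from the bytes of a CALL back to its target, which is a quick way to check every E8 in the a.exe listing. This is a minimal sketch; rel32For and callTarget are hypothetical helper names, and only the 5-byte E8 form of CALL is handled.

```cpp
#include <array>
#include <cstdint>

constexpr std::uint8_t kCallRel32 = 0xE8;   // opcode of the 5-byte relative CALL
constexpr std::uint32_t kCallLength = 5;    // 1 opcode byte + 4 offset bytes

// Offset stored in the instruction: target minus the address of the next instruction.
std::uint32_t rel32For(std::uint32_t callAddress, std::uint32_t target) {
    return target - (callAddress + kCallLength);
}

// Target of a CALL located at callAddress, or 0 if the bytes are not an E8 call.
std::uint32_t callTarget(std::uint32_t callAddress,
                         const std::array<std::uint8_t, 5>& insn) {
    if (insn[0] != kCallRel32) return 0;
    std::uint32_t rel =  static_cast<std::uint32_t>(insn[1])
                      | (static_cast<std::uint32_t>(insn[2]) << 8)
                      | (static_cast<std::uint32_t>(insn[3]) << 16)
                      | (static_cast<std::uint32_t>(insn[4]) << 24);
    return callAddress + kCallLength + rel; // unsigned wrap handles backward calls
}
```

For instance, rel32For(0x00401018, 0x00401030) should return 0x13, and callTarget(0x0040104D, {{0xE8, 0x5E, 0x00, 0x00, 0x00}}) should return 0x004010B0, matching the listing.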