ELF: From the Programmer's Perspective
Original: "ELF: From The Programmer's Perspective"
Author: Hongjiu Lu
NYNEX Science & Technology, Inc.
500 Westchester Avenue
White Plains, NY 10604, USA
Translation: alert7 <alert7@xfocus.org>
Home: http://www.xfocus.org
Time: 2001-9-10

★ Summary

This document discusses the Linux ELF binary format from the programmer's perspective. It introduces some techniques for controlling ELF files at run time, shows how to use the dynamic linker and how to load ELF objects dynamically, and demonstrates how to build shared C and C++ libraries with the GNU C/C++ compiler and some other tools under Linux.

★ 1 Preface

The Executable and Linking Format (ELF) is a binary format originally developed and published by UNIX System Laboratories (USL). It is the default binary format for executable files on SVR4 and Solaris 2.x. ELF is more powerful and more flexible than the a.out and COFF formats. Combined with the appropriate tools, ELF lets programmers control the flow of a program at run time.

★ 2 ELF types

There are three main ELF file types:

Executable file: contains code and data and can be run as a program. For example:

    # file dltest
    dltest: ELF 32-bit LSB executable, Intel 80386, version 1, dynamically linked (uses shared libs), not stripped

Relocatable file: contains code and data suitable for linking with other relocatable and shared object files. For example:

    # file libfoo.o
    libfoo.o: ELF 32-bit LSB relocatable, Intel 80386, version 1, not stripped

Shared object file (also called a shared library): contains code and data that are used at link time by the link editor ld and at run time by the dynamic linker. The dynamic linker may be called ld.so.1, libc.so.1 or ld-linux.so.1. For example:

    # file libfoo.so
    libfoo.so: ELF 32-bit LSB shared object, Intel 80386, version 1, not stripped

ELF sections are very useful. With the right tools and techniques, programmers can manipulate executable files with great flexibility.

★ 3 .init and .fini sections

On an ELF system, a program consists of one executable file, optionally together with some shared object files.
To execute such a program, the system uses those files to create a process image in memory. A process image consists of segments that contain executable instructions, data, and so on. For an ELF file to be loaded into memory, it must have a program header table (an array of structures that describe the segments and other information needed to prepare the program for execution).

A segment may contain several sections. These sections are what matter most from the programmer's perspective.

Each executable or shared object file generally contains a section header table, which is an array of structures describing the sections in the ELF file. The ELF documentation defines several special sections. The following two are particularly useful to programs:

.fini
    This section holds the process termination code. When a program exits normally, the system arranges to execute the code in this section.

.init
    This section holds the executable instructions that make up the process initialization code. When a program starts to run, the system arranges to execute the code in this section before calling the main program entry point (called main in C).

The .init and .fini sections exist for a special purpose. If a function is placed in the .init section, the system executes it before the main function. Likewise, if a function is placed in the .fini section, it is executed after the main function returns. C++ compilers use this feature to implement global constructors and destructors.

When an ELF executable is run, the system loads all the shared object files before transferring control to the executable. With properly constructed .init and .fini sections, constructors and destructors are called in the right order.
★ 3.1 Global constructors and destructors in C++

Global constructors and destructors in C++ have to be handled very carefully to meet the language specification. Constructors must be called before the main function, and destructors must be called after the main function returns. For example, in addition to the two usual auxiliary startup files crti.o and crtn.o, the GNU C/C++ compiler gcc provides two more auxiliary startup files, crtbegin.o and crtend.o. Together with the two sections .ctors and .dtors, they let C++ global constructors and destructors run in the right order at run time.

.ctors
    This section holds an array of pointers to the global constructors of the program.

.dtors
    This section holds an array of pointers to the global destructors of the program.

crtbegin.o
    It has four sections:

    1 .ctors section
      The local symbol __CTOR_LIST__ points to the head of the array of global constructor pointers. In crtbegin.o this array has only one dummy element.

      [Translator's note:
      # objdump -s -j .ctors /usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66/crtbegin.o

      /usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66/crtbegin.o: file format elf32-i386

      Contents of section .ctors:
       0000 ffffffff                             ....
      The dummy element mentioned here should be this ffffffff.]

    2 .dtors section
      The local symbol __DTOR_LIST__ points to the head of the array of global destructor pointers. In crtbegin.o this array has only one dummy element.

    3 .text section
      It contains only the __do_global_dtors_aux function, which walks the __DTOR_LIST__ list and calls each destructor in it.
      The function is as follows:

      Disassembly of section .text:

      00000000 <__do_global_dtors_aux>:
         0: 55                      push   %ebp
         1: 89 e5                   mov    %esp,%ebp
         3: 83 3d 04 00 00 00 00    cmpl   $0x0,0x4
         a: 75 38                   jne    44 <__do_global_dtors_aux+0x44>
         c: eb 0f                   jmp    1d <__do_global_dtors_aux+0x1d>
         e: 89 f6                   mov    %esi,%esi
        10: 8d 50 04                lea    0x4(%eax),%edx
        13: 89 15 00 00 00 00       mov    %edx,0x0
        19: 8b 00                   mov    (%eax),%eax
        1b: ff d0                   call   *%eax
        1d: a1 00 00 00 00          mov    0x0,%eax
        22: 83 38 00                cmpl   $0x0,(%eax)
        25: 75 e9                   jne    10 <__do_global_dtors_aux+0x10>
        27: b8 00 00 00 00          mov    $0x0,%eax
        2c: 85 c0                   test   %eax,%eax
        2e: 74 0a                   je     3a <__do_global_dtors_aux+0x3a>
        30: 68 00 00 00 00          push   $0x0
        35: e8 fc ff ff ff          call   36 <__do_global_dtors_aux+0x36>
        3a: c7 05 04 00 00 00 01    movl   $0x1,0x4
        41: 00 00 00
        44: c9                      leave
        45: c3                      ret
        46: 89 f6                   mov    %esi,%esi

    4 .fini section
      It contains only a call to __do_global_dtors_aux. Remember that it is just a function call with no return, because the .fini section of crtbegin.o is only part of the body of a function. It looks like this:

      Disassembly of section .fini:

      00000000 <.fini>:
         0: e8 fc ff ff ff          call   1 <.fini+0x1>

crtend.o
    It also has four sections:

    1 .ctors section
      The local symbol __CTOR_END__ points to the tail of the array of global constructor pointers.

    2 .dtors section
      The local symbol __DTOR_END__ points to the tail of the array of global destructor pointers.

    3 .text section
      It contains only the __do_global_ctors_aux function, which walks the __CTOR_LIST__ list and calls each constructor in it.
      The function is as follows:

      00000000 <__do_global_ctors_aux>:
         0: 55                      push   %ebp
         1: 89 e5                   mov    %esp,%ebp
         3: 53                      push   %ebx
         4: bb fc ff ff ff          mov    $0xfffffffc,%ebx
         9: 83 3d fc ff ff ff ff    cmpl   $0xffffffff,0xfffffffc
        10: 74 0c                   je     1e <__do_global_ctors_aux+0x1e>
        12: 8b 03                   mov    (%ebx),%eax
        14: ff d0                   call   *%eax
        16: 83 c3 fc                add    $0xfffffffc,%ebx
        19: 83 3b ff                cmpl   $0xffffffff,(%ebx)
        1c: 75 f4                   jne    12 <__do_global_ctors_aux+0x12>
        1e: 8b 5d fc                mov    0xfffffffc(%ebp),%ebx
        21: c9                      leave
        22: c3                      ret
        23: 90                      nop

    4 .init section
      It contains only a call to __do_global_ctors_aux. Remember that it is just a function call with no return, because the .init section of crtend.o is only part of the body of a function. It looks like this:

      Disassembly of section .init:

      00000000 <.init>:
         0: e8 fc ff ff ff          call   1 <.init+0x1>

crti.o
    It has only the function label _init in the .init section and the function label _fini in the .fini section.

crtn.o
    It has only the return instructions in the .init and .fini sections.

      Disassembly of section .init:

      00000000 <.init>:
         0: 8b 5d fc                mov    0xfffffffc(%ebp),%ebx
         3: c9                      leave
         4: c3                      ret

      Disassembly of section .fini:

      00000000 <.fini>:
         0: 8b 5d fc                mov    0xfffffffc(%ebp),%ebx
         3: c9                      leave
         4: c3                      ret

When compiling a source file into a relocatable file, gcc hangs each global constructor on __CTOR_LIST__ (by putting a pointer to the constructor in the .ctors section). It also hangs each global destructor on __DTOR_LIST__ (by putting a pointer to the destructor in the .dtors section).

At link time, gcc places crtbegin.o before all the relocatable files and crtend.o after all of them. In addition, crti.o is placed before crtbegin.o and crtn.o after crtend.o.

When generating an executable, the link editor ld concatenates the .ctors and .dtors sections of all the relocatable files to form the __CTOR_LIST__ and __DTOR_LIST__ lists respectively. The .init section is made up of the _init pieces from all the relocatable files, and the .fini section of the _fini pieces.
At run time, the system executes the _init function before the main function and the _fini function after the main function returns.

★ 4 ELF dynamic linking and loading

★ 4.1 Dynamic linking

When C source code is compiled into an executable file on a UNIX system, the C compiler driver generally invokes the preprocessor, the compiler, the assembler and the link editor.

The C compiler driver first passes the C source code to the C preprocessor, which handles macros and directives and outputs pure C code.

The C compiler translates the preprocessed C code into machine-dependent assembly code.

The assembler translates the resulting assembly code into machine instructions for the target. These machine instructions are stored in object files in a specified binary format. Here we use the ELF format.

In the final stage, the link editor links all the object files together with the startup code and the library functions referenced by the program. There are two ways to use a library:

-- static library
   A collection of object files that contain library routines and data. With this method, at link time the link editor produces a self-contained object file that holds copies of the functions and data referenced by the program.

-- shared library
   A shared object file that contains functions and data. A program linked this way stores only the name of the shared library and some information about the symbols the program references. At run time, the dynamic linker (called the program interpreter in ELF) maps the shared library image into the virtual address space of the process and resolves by name the symbols referenced in the shared library. This process is called dynamic linking.

Programmers do not need to know which shared libraries are involved in dynamic linking. Everything is transparent to them.
★ 4.2 Dynamic loading

Dynamic loading is the process of mapping a shared library into the address space of a process, looking up the address of a function in the library, calling that function, and unloading the shared library when it is no longer needed. It is implemented as an interface to the services of the dynamic linker. Under ELF, the programming interface is usually defined in <dlfcn.h>:

    void *dlopen (const char *filename, int flag);
    char *dlerror (void);
    void *dlsym (void *handle, const char *symbol);
    int dlclose (void *handle);

These functions are contained in libdl.so. Here is an example showing how dynamic loading works. The main program loads the shared library at run time. It can specify which shared library to use and which function to call, and it can also access data in the shared library.

    [alert7@redhat62 dl]# cat dltest.c
    #include <ctype.h>
    #include <dlfcn.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    typedef void (*func_t) (const char *);

    /* Called back by the shared libraries: print s in upper case. */
    void
    dltest (const char *s)
    {
        printf ("From dltest: ");
        for (; *s; s++) {
            putchar (toupper (*s));
        }
        putchar ('\n');
    }

    int
    main (int argc, char **argv)
    {
        void *handle;
        func_t fptr;
        char *libname = "./libfoo.so";
        char **name = NULL;
        char *funcname = "foo";
        char *param = "Dynamic Loading Test";
        int ch;
        int mode = RTLD_LAZY;

        while ((ch = getopt (argc, argv, "a:b:f:l:")) != -1) {
            switch (ch) {
            case 'a':               /* argument */
                param = optarg;
                break;
            case 'b':               /* how to bind */
                switch (*optarg) {
                case 'l':           /* lazy */
                    mode = RTLD_LAZY;
                    break;
                case 'n':           /* now */
                    mode = RTLD_NOW;
                    break;
                }
                break;
            case 'l':               /* which shared library */
                libname = optarg;
                break;
            case 'f':               /* which function */
                funcname = optarg;
            }
        }

        handle = dlopen (libname, mode);
        if (handle == NULL) {
            fprintf (stderr, "%s: dlopen: '%s'\n", libname, dlerror ());
            exit (1);
        }

        fptr = (func_t) dlsym (handle, funcname);
        if (fptr == NULL) {
            fprintf (stderr, "%s: dlsym: '%s'\n", funcname, dlerror ());
            exit (1);
        }

        name = (char **) dlsym (handle, "libname");
        if (name == NULL) {
            fprintf (stderr, "%s: dlsym: 'libname'\n", dlerror ());
            exit (1);
        }

        printf ("Call '%s' in '%s':\n", funcname, *name);

        /* call that function with 'param' */
        (*fptr) (param);

        dlclose (handle);
        return 0;
    }

There are two shared libraries here, libfoo.so and libbar.so. Each defines the same global string variable libname, and each has its own function, foo and bar respectively. Both are available to the program through dlsym.

    [alert7@redhat62 dl]# cat libbar.c
    #include <stdio.h>

    extern void dltest (const char *);
    const char *const libname = "libbar.so";

    void
    bar (const char *s)
    {
        dltest ("Called from libbar.");
        printf ("libbar: %s\n", s);
    }

    [alert7@redhat62 dl]# cat libfoo.c
    #include <stdio.h>

    extern void dltest (const char *s);
    const char *const libname = "libfoo.so";

    /* Print s backwards. */
    void
    foo (const char *s)
    {
        const char *saved = s;

        dltest ("Called from libfoo");
        printf ("libfoo: ");
        for (; *s; s++);
        for (s--; s >= saved; s--) {
            putchar (*s);
        }
        putchar ('\n');
    }

It is convenient to build the shared libraries and the main program with a Makefile. Note that libbar.so and libfoo.so also call the dltest function in the main program.
    [alert7@redhat62 dl]# cat Makefile
    CC = gcc
    LDFLAGS = -rdynamic
    SHLDFLAGS =
    RM = rm

    all: dltest

    libfoo.o: libfoo.c
            $(CC) -c -fPIC $<

    libfoo.so: libfoo.o
            $(CC) $(SHLDFLAGS) -shared -o $@ $^

    libbar.o: libbar.c
            $(CC) -c -fPIC $<

    libbar.so: libbar.o
            $(CC) $(SHLDFLAGS) -shared -o $@ $^

    dltest: dltest.o libbar.so libfoo.so
            $(CC) $(LDFLAGS) -o $@ dltest.o -ldl

    clean:
            $(RM) *.o *.so dltest

Running it:

    [alert7@redhat62 dl]# export ELF_LD_LIBRARY_PATH=.
    [alert7@redhat62 dl]# ./dltest
    Call 'foo' in 'libfoo.so':
    From dltest: CALLED FROM LIBFOO
    libfoo: tseT gnidaoL cimanyD
    [alert7@redhat62 dl]# ./dltest -f bar
    bar: dlsym: './libfoo.so: undefined symbol: bar'
    [alert7@redhat62 dl]# ./dltest -f bar -l ./libbar.so
    Call 'bar' in 'libbar.so':
    From dltest: CALLED FROM LIBBAR.
    libbar: Dynamic Loading Test

The first function called in the dynamic loading process is dlopen, which makes a shared library available to the running process. dlopen returns a handle that is used by subsequent dlsym and dlclose calls. Passing NULL as the file name to dlopen has a special meaning: it makes the symbols exported by the program and by the shared libraries currently loaded into memory available through dlsym.

After a shared library has been loaded into the address space of the process, dlsym can be used to obtain the address of a symbol exported by the library. The functions and data in the library can then be accessed through the address dlsym returns.

When a shared library is no longer needed, dlclose can be called to unload it. If the shared library was loaded at startup or through other dlopen calls, it is not removed from the address space of the calling process.

dlclose returns 0 if it succeeds. dlopen and dlsym return NULL on error. To obtain diagnostic information, call dlerror.
★ 5 The GNU GCC compiler on Linux with ELF support

Thanks to Eric Youngdale (eric@aib.com), Ian Lance Taylor (ian@cygnus.com) and the many others who quietly contributed to ELF support in gcc, we can easily create ELF executables and shared libraries with gcc and the GNU binary utilities.

★ 5.1 Shared C library

Building a shared library under ELF is much easier than under other formats, but it needs support from the compiler, the assembler and the link editor. First, position-independent code has to be generated. To do this, gcc needs the -fPIC option:

    [alert7@redhat62 dl]# gcc -fPIC -O -c libbar.c

The resulting libbar.o is suitable for building a shared library. Add the -shared option:

    [alert7@redhat62 dl]# gcc -shared -o libbar.so libbar.o

The libbar.so we have built can now be used by the link editor and by the dynamic linker. As long as they are compiled with -fPIC, many relocatable files can be added to a shared library. To link baz.o with the shared library:

    # gcc -O -c baz.c
    # gcc -o baz baz.o -L. -lbar

Once libbar.so has been installed where the dynamic linker can find it, running baz maps libbar.so into the address space of the baz process. A single copy of libbar.so in memory is shared by all the executables that are linked with it or that load it dynamically at run time.

★ 5.2 Shared C++ library

The main difficulty with shared C++ libraries is how to handle constructors and destructors. Under SunOS, building and using a shared C library is nearly as easy as under ELF, but a shared C++ library cannot be built under SunOS because of the special needs of constructors and destructors. The .init and .fini sections in ELF provide a perfect solution.

When building a shared C++ library, we use special versions of crtbegin.o and crtend.o that have been compiled with -fPIC.
For the link editor, building a shared C++ library is almost the same as building an ordinary executable. Global constructors and destructors are handled by the .init and .fini sections (discussed in section 3.1 above). But when a shared library is mapped into the address space of a process, the dynamic linker executes the _init function before transferring control to the program, and arranges for the _fini function to be executed when the shared library is no longer needed.

The -shared link option tells gcc to place the necessary auxiliary files in the right order and to generate a shared library. The -v option shows which files are passed to the link editor.

    [alert7@redhat62 dl]# gcc -v -shared -o libbar.so libbar.o
    Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66/specs
    gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)
    /usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66/collect2 -m elf_i386 -shared -o libbar.so /usr/lib/crti.o /usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66/crtbeginS.o -L/usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66 -L/usr/i386-redhat-linux/lib libbar.o -lgcc -lc --version-script /usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66/libgcc.map -lgcc /usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66/crtendS.o /usr/lib/crtn.o

crtbeginS.o and crtendS.o are the two special versions compiled with -fPIC. Creating a shared library with -shared is important because those auxiliary files also provide other services, which we discuss in section 5.3.

★ 5.3 GCC extensions

gcc has many extensions. Some of them are particularly useful for ELF. One is __attribute__. With __attribute__, a function can be put on the __CTOR_LIST__ or __DTOR_LIST__ list.
For example:

    [alert7@redhat62 dl]# cat miss.c
    #include <stdio.h>
    #include <stdlib.h>

    static void foo (void) __attribute__ ((constructor));
    static void bar (void) __attribute__ ((destructor));

    int
    main (int argc, char *argv[])
    {
        printf ("foo == %p\n", (void *) foo);
        printf ("bar == %p\n", (void *) bar);
        exit (EXIT_SUCCESS);
    }

    /* Runs before main. */
    void
    foo (void)
    {
        printf ("hi dear njlily!\n");
    }

    /* Runs after main returns or exit is called. */
    void
    bar (void)
    {
        printf ("missing u! goodbye!\n");
    }

    [alert7@redhat62 dl]# gcc -o miss miss.c
    [alert7@redhat62 dl]# ./miss
    hi dear njlily!
    foo == 0x8048434
    bar == 0x8048448
    missing u! goodbye!

Let's check whether they were added to .ctors and .dtors.

    [alert7@redhat62 dl]# objdump -s -j .ctors miss

    miss: file format elf32-i386

    Contents of section .ctors:
     8049504 ffffffff 34840408 00000000           ....4.......

    [alert7@redhat62 dl]# objdump -s -j .dtors miss

    miss: file format elf32-i386

    Contents of section .dtors:
     8049510 ffffffff 48840408 00000000           ....H.......

The addresses of foo and bar have been placed in .ctors and .dtors. They show up as 34840408 only because x86 stores the least significant byte first (little-endian).

__attribute__ ((constructor)) causes the function foo to be called automatically before main is entered. __attribute__ ((destructor)) causes the function bar to be called automatically after main returns or exit is called. foo and bar must take no arguments and must be functions of static void type. Under ELF, this feature works well in both ordinary executables and shared libraries.

We can also create our own section. Here I create a section called alert7.

    [alert7@redhat62 dl]# cat test.c
    #include <stdio.h>
    #include <stdlib.h>

    static void foo (void) __attribute__ ((section ("alert7")));
    static void bar (void) __attribute__ ((section ("alert7")));

    int
    main (int argc, char *argv[])
    {
        foo ();
        printf ("foo == %p\n", (void *) foo);
        printf ("bar == %p\n", (void *) bar);
        bar ();
        exit (EXIT_SUCCESS);
    }

    void
    foo (void)
    {
        printf ("hi dear njlily!\n");
    }

    void
    bar (void)
    {
        printf ("missing u! goodbye!\n");
    }

    [alert7@redhat62 dl]# gcc -o test test.c
    [alert7@redhat62 dl]# ./test
    hi dear njlily!
    foo == 0x804847c
    bar == 0x8048490
    missing u! goodbye!

    [alert7@redhat62 dl]# objdump -x test
    ....
    Sections:
    Idx Name          Size      VMA       LMA       File off  Algn
      0 .interp       00000013  080480f4  080480f4  000000f4  2**0
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
    ...
     12 alert7        00000026  0804847c  0804847c  0000047c  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
    ...

    [alert7@redhat62 dl]# objdump -D test
    Disassembly of section alert7:

    0804847c <foo>:
     804847c: 55                    push   %ebp
     804847d: 89 e5                 mov    %esp,%ebp
     804847f: 68 de 84 04 08        push   $0x80484de
     8048484: e8 a3 fe ff ff        call   804832c <_init+0x70>
     8048489: 83 c4 04              add    $0x4,%esp
     804848c: c9                    leave
     804848d: c3                    ret
     804848e: 89 f6                 mov    %esi,%esi

    08048490 <bar>:
     8048490: 55                    push   %ebp
     8048491: 89 e5                 mov    %esp,%ebp
     8048493: 68 ef 84 04 08        push   $0x80484ef
     8048498: e8 8f fe ff ff        call   804832c <_init+0x70>
     804849d: 83 c4 04              add    $0x4,%esp
     80484a0: c9                    leave
     80484a1: c3                    ret

Here I created my own alert7 section and put the two functions foo and bar in it. Ordinarily, functions are placed in the .text section.

★ 5.3.1 Initialization functions in the C library

Another gcc feature is __attribute__ ((section ("sectionname"))). With it, a function or a data structure can be placed in any section.

    static void
    foo (int argc, char **argv, char **envp)
        __attribute__ ((section ("_libc_foo")));

    static void
    foo (int argc, char **argv, char **envp)
    {
    }

    static void
    bar (int argc, char **argv, char **envp)
    {
    }

    static void *__libc_subinit_bar__
        __attribute__ ((section ("_libc_subinit"))) = &(bar);

Here we put foo in the _libc_foo section and __libc_subinit_bar__ in the _libc_subinit section. In the Linux C library, _libc_subinit is a special section that contains an array of function pointers with the following prototype:

    void (*) (int argc, char **argv, char **envp);

Here argc, argv and envp have the same meaning as in main.
The functions in this section are called before the main function is entered. This is useful for initializing some global variables in the Linux C library.

[Translator's note: Does the _libc_subinit section really have this special function? I could not get it to work. If anyone succeeds, or thinks I have misunderstood something, please remember to mail me :)
The test program is as follows:

    #include <stdio.h>
    #include <stdlib.h>

    static void
    foo (int argc, char **argv, char **envp)
    {
        printf ("hi dear njlily!\n");
    }

    int
    main (int argc, char *argv[])
    {
        printf ("foo == %p\n", (void *) foo);
        exit (EXIT_SUCCESS);
    }

    static void *__libc_subinit_bar__
        __attribute__ ((section ("_libc_subinit"))) = &(foo);

    [alert7@redhat62 dl]# gcc -o test1 test1.c
    [alert7@redhat62 dl]# ./test1
    foo == 0x8048400

:( objdump shows that a _libc_subinit section has been created and that the first four bytes of the section are the address of foo, 0x8048400.]

★ 5.4 Using gcc and GNU ld

The following command-line options are especially useful when creating ELF files with gcc and GNU ld. -shared tells gcc to generate a shared library, which can be linked with other object files at link time to form an executable, and which can also be mapped into the address space of an executable at run time. Using -shared is the preferred way to build a shared ELF library.

Another useful command-line option is -Wl,ldoption, which passes ldoption as an option to the link editor. If ldoption contains commas, it is split into multiple options at the commas.

The -static option generates an executable linked with static libraries. When -static is not given, the link editor first tries the shared library and falls back to the static library if no shared version is available.

Some other command-line options are especially useful for ELF:

-dynamic-linker file
    Set the name of the dynamic linker.
    The default dynamic linker is either /usr/lib/libc.so.1 or /usr/lib/libdl.so.1.

-export-dynamic
    Tell the link editor to make all the symbols in the executable available to the dynamic linker. This is especially useful when a dynamically loaded shared library references symbols in the executable that would normally not be available at run time.

-lfile
    Add file to the list of files to link. This option may be used any number of times. ld searches its path list for libfile.so (for example, for the library libbar.so use -lbar), or for libfile.a if -static is used. In the former case, the shared library name libfile.so is recorded in the resulting executable or shared library. When the resulting executable or shared library is loaded into memory, the dynamic linker also maps the recorded shared library into the address space of the process. In the latter case, the necessary functions and data are copied into the executable, which increases the size of the code.

-m emulation
    Emulate the emulation link editor. The -V option lists all the available emulations.

-M | -Map mapfile
    Print a link map to the standard output or to the file mapfile. The link map contains diagnostic information about where symbols are mapped by ld, as well as information about global common storage allocation.

-rpath directory
    Add a directory to the run-time library search path. All -rpath arguments are concatenated and passed to the dynamic linker. They are used to locate shared libraries at run time.

-soname name
    When creating a shared library, record the specified name in it. When an executable linked with this shared library is run, the dynamic linker tries to map the shared library with the recorded name rather than the file name that was given to the link editor.

-static
    Tell the link editor not to link with any shared libraries.

-verbose
    Tell the link editor to print the name of every file it tries to open.
The Linux gcc beta version uses the -dynamic-linker file option to set the dynamic linker to /lib/ld-linux.so.1. This option lets ELF and a.out shared libraries coexist well.

There is something else of interest.

    [alert7@redhat62 dl]# gcc -shared -o libbar.so libbar.o -lfoo

If libfoo.so is used to build the shared library, something interesting happens. When libbar.so is mapped into the address space of a process, the dynamic linker also maps libfoo.so into memory. This feature is useful when libbar.so needs libfoo.so. In fact, programs that use libbar.so do not need -lfoo when they are linked. If the archive version libfoo.a is used instead, it is searched only when symbols in libfoo.a are referenced by libbar.o. If libbar.so should contain the members of libfoo.a even when they are not referenced by libbar.o, the .o files have to be added to libbar.so explicitly:

    # rm -rf /tmp/foo
    # mkdir /tmp/foo
    # (cd /tmp/foo/; ar -x .../libfoo.a)
    # gcc -shared -o libbar.so libbar.o /tmp/foo/*.o
    # rm -rf /tmp/foo

The .o files in libfoo.a must be compiled with -fPIC, or at least be compatible with PIC (position-independent code).

When

    static void *__libc_subinit_bar__
        __attribute__ ((section ("_libc_subinit"))) = &(bar);

is used to put a symbol into a section that is not defined by the link editor (here _libc_subinit), the link editor gathers all the symbols in the _libc_subinit section together and creates two symbols, __start__libc_subinit and __stop__libc_subinit, which can be used as C identifiers.

Caveat: it is entirely possible that the link editor does not search the files that contain the _libc_subinit section (when nothing in them is needed by the program). It is up to the programmer to make sure that the _libc_subinit section is seen by the link editor. One solution is to put a dummy symbol in the _libc_subinit section and define it in a file that makes a reference to the _libc_subinit section.

★ 5.5 ELF under Linux

The ELF implementation under Linux has some unique features that are useful to Linux users.
Some of Linux's own extensions are very similar to the Solaris ELF implementation.

★ 5.5.1 ELF macros

elf_alias (name1, name2)
    Define an alias name2 for the symbol name1. It should be used in the file where the symbol name1 is defined.

weak_alias (name1, name2)
    Define a weak alias name2 for the symbol name1. The link editor resolves references to name2 to name1 only if name2 is not defined anywhere else. It should also be used in the file where the symbol name1 is defined.

elf_set_element (set, symbol)
    Force symbol to become an element of set. A new section is created for each set.

symbol_set_declare (set)
    Declare set for use in this module. It actually declares two symbols:
    1 a start symbol for set
          extern void *const __start_set
    2 an end symbol for set
          extern void *const __stop_set

symbol_set_first_element (set)
    Return a pointer (void *const *) to the first element of set.

symbol_set_end_p (set, ptr)
    Return true if ptr (a void *const *) has been incremented past the last element of set.

Using these macros, programmers can build a list from different source files.

★ 5.5.2 Library locations and search paths

Under Linux, most system libraries are installed in the /usr/lib directory. Only a few essential shared libraries are installed in the /lib directory, for example libc.so, libcurses.so, libm.so and libtermcap.so (the exact files vary somewhat between versions). Those files are required to start the Linux system before the other partitions are mounted. The default search path of the link editor is /lib, /usr/lib, /usr/local/lib, /usr/i486-linux/lib.

The environment variable LD_LIBRARY_PATH also holds a list of directories separated by colons (:). The dynamic linker checks this variable and uses the directories it names to find shared libraries.
For example, setting LD_LIBRARY_PATH to /usr/X11R6/lib:/usr/local/lib: tells the dynamic linker to search, in addition to the default directories, first /usr/X11R6/lib, then /usr/local/lib, and then the current directory.

A new environment variable, ELF_LD_LIBRARY_PATH, plays a role similar to LD_LIBRARY_PATH. Because LD_LIBRARY_PATH is also used by the old a.out DLL Linux shared libraries, it is best to set ELF_LD_LIBRARY_PATH for the ELF dynamic linker under Linux. This avoids unnecessary warnings from the DLL linker.

Another feature is the /etc/ld.so.conf file, which contains a list of directories. For example:

    /usr/X11R6/lib
    /usr/lib
    /usr/kerberos/lib
    /usr/i486-linux-libc5/lib
    /usr/lib/gconv
    /usr/lib/qt-2.1.0/lib
    /usr/lib/qt-1.45/lib

The program ldconfig searches all the directories listed in /etc/ld.so.conf and stores the shared libraries it finds in /etc/ld.so.cache. If a shared library has been moved out of the default directories, the Linux ELF dynamic linker finds it through /etc/ld.so.cache.

★ 5.5.3 Shared library versions

On an ELF system, two shared libraries are interchangeable if they share the same subset of the application binary interface (ABI) and the application uses only that subset (provided, of course, that both libraries offer the same functionality). When a library is changed, all programs linked against the previous version keep working with the new shared library as long as the new ABI is 100% compatible with the previous one. To support this, the library foo must be maintained with care:

1. The shared library should be built as follows:

    [alert7@redhat62 dl]# gcc -shared -Wl,-soname,libfoo.so.major \
        -o libfoo.so.major.minor.patch-level libfoo.o

   The dynamic linker will then try to locate and map libfoo.so.major, regardless of the actual file name libfoo.so.major.minor.patch-level.

2. A symbolic link should point to the correct shared library:

    [alert7@redhat62 dl]# ln -s libfoo.so.major.minor.patch-level \
        libfoo.so.major

3. When the ABI changes in a way that is incompatible with the previous version, the major version number should be incremented.

When searching for shared libraries, the Linux linker uses the latest one, that is, the one with the highest major, minor and patch level version numbers.

★ 5.5.4 Mixed linking of shared and static libraries

By default the linker uses the shared library if one is available. The -Bdynamic and -Bstatic options give finer control: they decide whether the shared or the static version of a library is used. Both options are passed to the linker, as follows:

    # gcc -o main main.o -Wl,-Bstatic \
        -lfoo -Wl,-Bdynamic -lbar

This links libfoo statically and libbar dynamically. The command

    # gcc -o main main.o -Wl,-Bstatic

tells the linker to use the static version of all libraries (libc and so on).

★ 5.5.5 Loading additional shared libraries

On an ELF system, to execute an ELF file the kernel hands control to the dynamic linker, ld-linux.so.1 (the dynamic linker on Linux is ld-linux.so.1, but the version varies; on a default Red Hat 6.2 it is /lib/ld-linux.so.2). The absolute path /lib/ld-linux.so.1 is stored in the binary. If the dynamic linker does not exist, no ELF executable can run. The dynamic linker performs the following steps to turn the program into a process image:

1. Analyze the dynamic information section of the executable and determine which libraries are needed.
2. Locate and map those shared libraries, and analyze their dynamic information sections to determine whether additional shared libraries are needed.
3. Perform relocations for the executable and the shared libraries it needs.
4. Call any initialization functions provided by the shared libraries, and arrange for the cleanup functions they provide to run when the shared libraries are unloaded.
5. Transfer control to the program.
6. Provide a delayed function binding service to the application.
7. Provide a dynamic loading service to the application.

The environment variable LD_PRELOAD can be set to a shared library file name or to a list of file names separated by ":". The dynamic linker maps the shared libraries named in LD_PRELOAD into the process address space before any other required shared library. For example:

    # LD_PRELOAD=./mylibc.so myprog

Here ./mylibc.so is mapped first into the address space of myprog. Because the dynamic linker always uses the first occurrence of a symbol it finds while searching, we can use LD_PRELOAD to override functions in the standard shared libraries. This is useful for programmers: a single function can be tried out or debugged without rebuilding the whole shared library. We can create our own shared library like this:

    # gcc -c -fPIC -O3 print.c
    # gcc -shared print.o -o print.so.1.0

★ 5.5.6 Linux dynamic loading

_dlinfo is a function of the dynamic linking interface library. It lists all the shared libraries mapped into the executable and every shared library opened with dlopen. Its output looks like this:

    List of loaded modules
    00000000 50006163 50006200 Exe 1
    50007000 5000620c 50006200 Lib 1 /lib/elf/libdl.so.1
    5000a000 500062c8 50006200 Lib 2 /lib/elf/libc.so.4
    50000000 50006000 000000 Int 1 /lib/elf/ld-linux.so.1
    500aa000 08006f00 08005ff0 Mod 1 ./libfoo.so

    Modules for application (50006200):
    50006163
    5000620c /lib/elf/libdl.so.1
    500062c8 /lib/elf/libc.so.4
    50006000 /lib/ld-linux.so.1
    Modules for handle 8005ff0
    08006f00 ./libfoo.so
    500062c8 /lib/elf/libc.so.4
    50006163
    5000620c /lib/elf/libdl.so.1
    500062c8 /lib/elf/libc.so.4
    50006000 /lib/elf/ld-linux.so.1

This output can be used to examine dynamic linking and dynamic loading.

GCC configured for Linux ELF passes -export-dynamic to the linker when the -rdynamic option is used. This is strongly recommended whenever dynamic loading is used.
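_dlinfo may not be available on a given system. As a stand-in for experimenting with the list of loaded modules, the sketch below uses dl_iterate_phdr(), a glibc extension declared in <link.h>. This assumes Linux with glibc; the output format is our own and does not match the _dlinfo listing. The names print_module() and list_loaded_modules() are made up for this example.

```c
#define _GNU_SOURCE            /* dl_iterate_phdr() is a GNU extension */
#include <assert.h>
#include <link.h>
#include <stdio.h>

/* Callback invoked once per loaded object: print its load base address
   and its name, and count it.  The main program is normally reported
   with an empty name. */
static int print_module(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = data;
    (void)size;                /* structure size, unused in this sketch */
    assert(count != NULL);
    printf("%08lx %s\n", (unsigned long)info->dlpi_addr,
           (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
               ? info->dlpi_name : "(main program)");
    (*count)++;
    return 0;                  /* zero tells glibc to keep iterating */
}

/* Print every object currently mapped into the process and return
   how many objects were reported. */
int list_loaded_modules(void)
{
    int count = 0;
    dl_iterate_phdr(print_module, &count);
    return count;
}
```

Calling list_loaded_modules() once at program start and again after a dlopen() call shows the newly mapped library appear in the list.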
The recommendation to use -rdynamic with dynamic loading is why LDFLAGS = -rdynamic appears in our Makefile example. For now this option works only under Linux, but on other platforms -Wl,-export-dynamic can be used to pass -export-dynamic to the GNU linker. Detailed descriptions can be found in the GNU compiler and link editor documentation, [3] and [4].

★ 6 Position-independent code (PIC) in assembly language

When -fPIC is given to gcc, gcc generates PIC assembly code from the C source. Sometimes, however, we need to write PIC code in assembly language ourselves.

Under ELF, PIC is implemented with a base register: in PIC, every symbol reference goes through the base register. Assembly code written for PIC must therefore preserve the base register. Because the code is position independent, the target addresses of control transfer instructions must be displaced or computed under PIC. On x86 the base register is ebx. Here we introduce two ways to write PIC-safe assembly code on x86. These techniques are also used in the Linux C library.

★ 6.1 Inline assembly in C

GCC supports inline assembly statements, which let programmers use assembly language inside C. This is useful when writing the Linux system call interface and wherever machine-dependent instructions are needed.

A system call under Linux is made through int $0x80. A call with three arguments typically looks like this:

    #include <sys/syscall.h>

    extern int errno;

    int read (int fd, void *buf, size_t count)
    {
        long ret;

        /* eax = system call number, ebx/ecx/edx = arguments */
        __asm__ __volatile__ ("int $0x80"
            : "=a" (ret)
            : "0" (SYS_read), "b" ((long) fd),
              "c" ((long) buf), "d" ((long) count)
            : "bx");
        if (ret >= 0)
        {
            return (int) ret;
        }
        /* a negative result is the negated error number */
        errno = -ret;
        return -1;
    }

The assembly code above puts the system call number SYS_read in eax, fd in ebx, buf in ecx and count in edx, and returns the value left in eax by int $0x80 in ret. Without -fPIC this definition works fine.
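For comparison, the same call can be made through the C library's generic syscall() wrapper, which hides the register setup and follows the same error convention (return -1 and set errno). This is a minimal sketch assuming Linux with glibc; read_via_syscall() and pipe_roundtrip() are names invented for this example.

```c
#define _GNU_SOURCE            /* make sure syscall() is declared */
#include <assert.h>
#include <errno.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* The same request as the hand-written read(), but issued through the
   C library's syscall() wrapper.  On failure syscall() returns -1 and
   stores the error number in errno. */
ssize_t read_via_syscall(int fd, void *buf, size_t count)
{
    return (ssize_t)syscall(SYS_read, fd, buf, count);
}

/* Demo helper (hypothetical): write len bytes of msg into a pipe and
   read them back with read_via_syscall().  Returns the number of bytes
   read, or -1 on any failure. */
ssize_t pipe_roundtrip(const char *msg, size_t len, char *out, size_t outsz)
{
    int fds[2];
    ssize_t n = -1;

    assert(msg != NULL && out != NULL);
    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], msg, len) == (ssize_t)len)
        n = read_via_syscall(fds[0], out, outsz);
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Because the wrapper lives in the C library, the caller does not have to care about which register holds which argument, or about preserving ebx.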
With -fPIC, gcc ought to check whether ebx is changed and to save and restore ebx around the assembly code. Unfortunately, it does not. To support PIC we have to write that part of the assembly code ourselves:

    #include <sys/syscall.h>

    extern int errno;

    int read (int fd, void *buf, size_t count)
    {
        long ret;

        /* fd arrives in esi; ebx is saved, loaded and restored by hand */
        __asm__ __volatile__ ("pushl %%ebx\n\t"
                              "movl %%esi,%%ebx\n\t"
                              "int $0x80\n\t"
                              "popl %%ebx"
            : "=a" (ret)
            : "0" (SYS_read), "S" ((long) fd),
              "c" ((long) buf), "d" ((long) count)
            : "bx");
        if (ret >= 0)
        {
            return (int) ret;
        }
        errno = -ret;
        return -1;
    }

Here fd is first placed in esi. Then ebx is saved, esi is moved into ebx, and ebx is restored after int $0x80. This guarantees that ebx is unchanged (except during the int $0x80 call itself). The same principle applies to any other inline assembly: whenever ebx may be changed, remember to save and restore it.

★ 6.2 Programming in assembly language

If we need to pass five arguments to a system call, inline assembly code does not work under PIC, because the x86 does not have enough registers. We have to write the routine directly in assembly language.

The ordinary assembly code for syscall(int syscall_number, ...) is as follows:

            .file   "syscall.s"
            .text
            .global syscall
            .global errno
            .align  16
    syscall:
            pushl   %ebp
            movl    %esp,%ebp
            pushl   %edi
            pushl   %esi
            pushl   %ebx
            movl    8(%ebp),%eax
            movl    12(%ebp),%ebx
            movl    16(%ebp),%ecx
            movl    20(%ebp),%edx
            movl    24(%ebp),%esi
            movl    28(%ebp),%edi
            int     $0x80
            test    %eax,%eax
            jge     .LLexit
            negl    %eax
            movl    %eax,errno
            movl    $-1,%eax
    .LLexit:
            popl    %ebx
            popl    %esi
            popl    %edi
            movl    %ebp,%esp
            popl    %ebp
            ret
            .type   syscall,@function
    .L_syscall_end:
            .size   syscall,.L_syscall_end - syscall

In PIC, in addition to saving the base register ebx, we must access every global variable through the GOT (global offset table).
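The idea of reaching a global variable through the GOT can be modelled in plain C. The sketch below is only a toy model: a real GOT is laid out by the linker and filled in by the dynamic linker, whereas here it is an ordinary array of addresses. All names (toy_got, SLOT_ERRNO, my_errno, set_errno_via_got) are invented for illustration. The two commented statements correspond to the pair of instructions a PIC routine uses to store into errno: load the variable's address from the table, then store through that address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot layout of the toy table. */
enum { SLOT_ERRNO, SLOT_COUNT };

static int my_errno;                          /* stand-in for errno */
static void *toy_got[SLOT_COUNT] = { &my_errno };

/* Apply the system call error convention, writing the error number
   through the table instead of naming the variable directly. */
int set_errno_via_got(void **got, long ret)
{
    assert(got != NULL && got[SLOT_ERRNO] != NULL);
    if (ret >= 0)
        return (int)ret;
    int *errno_addr = got[SLOT_ERRNO];  /* like: movl errno@GOT(%ebx),%eax */
    *errno_addr = (int)-ret;            /* like: movl %edx,(%eax)          */
    return -1;
}
```

The code never mentions the address of my_errno itself; it only needs the address of the table, which is the role the base register ebx plays in real PIC code.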
The PIC version of the syscall() code is as follows:

            .file   "syscall.s"
            .text
            .global syscall
            .global errno
            .align  16
    syscall:
            pushl   %ebp
            movl    %esp,%ebp
            pushl   %edi
            pushl   %esi
            pushl   %ebx
            call    .LL4
    .LL4:
            popl    %ebx
            addl    $_GLOBAL_OFFSET_TABLE_+[.-.LL4],%ebx
            pushl   %ebx
            movl    8(%ebp),%eax
            movl    12(%ebp),%ebx
            movl    16(%ebp),%ecx
            movl    20(%ebp),%edx
            movl    24(%ebp),%esi
            movl    28(%ebp),%edi
            int     $0x80
            popl    %ebx
            movl    %eax,%edx
            test    %edx,%edx
            jge     .LLexit
            negl    %edx
            movl    errno@GOT(%ebx),%eax
            movl    %edx,(%eax)
            movl    $-1,%eax
    .LLexit:
            popl    %ebx
            popl    %esi
            popl    %edi
            movl    %ebp,%esp
            popl    %ebp
            ret
            .type   syscall,@function
    .L_syscall_end:
            .size   syscall,.L_syscall_end - syscall

If you want PIC assembly code but do not know how to write it, you can write the routine in C and compile it as follows:

    # gcc -O -fPIC -S foo.c

This tells gcc to generate the assembly output foo.s, which can then be modified as needed.

★ 7 Conclusion

From the discussion above we can conclude that ELF is a very flexible binary format that provides very useful functionality. The specification imposes few restrictions on programs and programmers. It makes creating shared libraries easy and makes it simpler to combine dynamic loading with shared libraries. Under ELF, global constructors and destructors in C++ are handled in shared libraries the same way as in static libraries.

[Translator's note: The translation of the article ends here, but a few things in it look questionable. For example, the _libc_subinit section does not seem to behave the way the article describes, the -dynamic-linker option cannot be used on a default Red Hat 6.2 system, and the _dlinfo function of the dynamic linking interface library also seems problematic under Linux, among other issues. Discussion is welcome.
mailto: alert7@21cn.com
        Alert7@xfocus.org ]

References:

1. Operating System API Reference: UNIX SVR4.2, UNIX Press, 1992.
2. SunOS 5.3 Linker and Libraries Manual, SunSoft, 1993.
3. Richard M. Stallman, Using and Porting GNU CC for Version 2.6, Free Software Foundation, September 1994.
4. Steve Chamberlain and Roland Pesch, Using ld: The GNU Linker, ld version 2, Cygnus Support, January 1994.