Loading, Resolution, and Case Analysis of ELF Dynamic Linking on Linux for the Intel Platform (1): Loading


Contents:

I. History
II. Code example
III. Analysis of the _dl_open loading process
Appendix A: Dynamic link section types and descriptions
Appendix B: Dynamic link library program header types and descriptions
References
About the author

Reproduced from the IBM developerWorks China website. Author: Wang Ruichuan (Jeppeterone@163.com), October 2003.

Dynamic linking is a topic that comes up often. Yet few articles set out to explain this important mechanism of software operation; most only cover programming with dynamic link libraries. This series explores the subject at the level of the dynamic linker's source code. As the title says, it covers the dynamic linking of Linux ELF files on the Intel platform. One reason is that this is where the most material is available; the other is that this case matters more than dynamic linking elsewhere (it is Intel's world now, after all). With this example in hand, ELF dynamic linking on other platforms is much the same, and after reading these articles you should be able to read that code too.

I plan to write three parts. The first part mainly analyzes loading, which involves dl_open. Because that function covers too much, only _dl_map_object and _dl_init are discussed here: _dl_map_object maps the dynamic link file into the memory space using the information in the ELF file, and _dl_init performs a special initialization that exists to support object-oriented features.

The second part analyzes function resolution and unloading, and has more content. It starts with two functions called from dl_open, _dl_map_object_deps and _dl_relocate_object, which are placed there because they relate directly to function resolution. Next comes _dl_runtime_resolve, which resolves functions dynamically while the program runs. In essence it is not much code, but it is the most intricate (it is the core of my three articles). Last is the implementation of dl_close, together with the tail-end work of error handling in _dl_signal_cerror and _dl_catch_error.

The third part gives an analysis and application of injectso, an example that applies dynamic linking and that can be used when debugging programs. It gives us a more concrete feel for the dynamic linking covered earlier, it can serve as a dynamic patching tool during development, and I may well use the technique in later articles.

I. History

Dynamic linking has a long history. Traced back, the earliest ideas date from the 1950s: put some common code in one place in memory and call it from other addresses. This later developed into loading overlays (bringing into memory the code needed at different times in a program's life), which was in the 1960s. But this can only be considered the embryonic period. What comes close to dynamic linking as we now know it appeared after the UNIX operating system, because UNIX is designed as a set of modules that together implement a complex operating system.

But none of these is dynamic linking in the modern sense, which should have two features:

1. Dynamic loading: a module is mapped into the virtual memory space of the running program only when it is needed. For example, suppose a module uses the myget function in mylib.so at run time. Until some function in mylib.so is called, that module is not loaded into your program (that is, it is not memory-mapped). This is implemented in the kernel using the page-fault mechanism (I may come back to this in another article).

2. Dynamic resolution: only when a function is about to be called is its start address in the virtual memory space resolved and written to the dedicated storage location in the calling module. Continuing the example: you have already called myget, so mylib.so must already be mapped into the program's virtual memory; if you then call the myput function in mylib.so, its address is resolved only at the moment of that call.

(Note: "program" here means a process in the general sense, and a "module" may be the binary code of your program, or another shared library that your program depends on, likewise in ELF format.)
This is rather like the way current operating systems handle memory. Virtual address space is mapped only when you ask for a region of memory, rather than mapping everything in advance, and physical memory is allocated only when you read from that region; this resembles the first feature. Copy-on-write (COW) happens only when you write to the region; this resembles the second. The benefit is that unnecessary overhead is avoided, because no running program uses every function it could call. These ideas were proposed and implemented in Sun's SunOS in the 1980s. For this history, see reference [1].

The ELF binary format took shape at roughly the same time as modern dynamic linking. It derives from a.out, the binary file format of AT&T's early UNIX. Bell Labs staff invented ELF to meet the requirements of new software and operating systems (UNIX variants such as AIX, SunOS, and HP-UX), of wider application extension, and of support for object orientation. I will not discuss the details of the ELF format here; that would fill a very long article. See reference [2] for the ABI (Application Binary Interface) specification.

However, the hierarchical management used in ELF files does more than play an important role in dynamic linking; it is one of the oldest and most classic ideas in computing. Every ELF file has an ELF header, and each header has two data members:

Elf32_Off e_phoff;
Elf32_Off e_shoff;

They give the offsets of the program header table and the section header table within the ELF file. The program header is the first level, and the section header is the level below it. Each section header in turn has two members:

Elf32_Addr sh_addr;
Elf32_Off sh_offset;

sh_addr is the address at which the section is mapped in memory (for a dynamic link library this is a relative value, which forms an absolute address when added to l_addr, the address at which the whole ELF file is loaded). sh_offset is the offset of the section within the file.
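As a rough illustration of the two header fields just described, the sketch below reads e_phoff and e_shoff from a file on disk. The helper name read_elf_offsets is hypothetical, and the code assumes a Linux system that provides <elf.h> and a file whose byte order matches the host; it is a minimal sketch, not how glibc itself reads the header.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of glibc): fetch e_phoff and e_shoff from
   a 32-bit or 64-bit ELF file.  Returns 0 on success, -1 if the file
   cannot be read or is not an ELF file.  Assumes host byte order.  */
int read_elf_offsets(const char *path, unsigned long long *phoff,
                     unsigned long long *shoff)
{
    unsigned char buf[sizeof(Elf64_Ehdr)];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);

    /* Every ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'.  */
    if (n < sizeof(Elf32_Ehdr) || memcmp(buf, ELFMAG, SELFMAG) != 0)
        return -1;

    if (buf[EI_CLASS] == ELFCLASS32) {
        Elf32_Ehdr h;
        memcpy(&h, buf, sizeof h);
        *phoff = h.e_phoff;   /* offset of the program header table */
        *shoff = h.e_shoff;   /* offset of the section header table */
    } else if (buf[EI_CLASS] == ELFCLASS64 && n >= sizeof(Elf64_Ehdr)) {
        Elf64_Ehdr h;
        memcpy(&h, buf, sizeof h);
        *phoff = h.e_phoff;
        *shoff = h.e_shoff;
    } else {
        return -1;
    }
    return 0;
}
```

Calling read_elf_offsets("/proc/self/exe", &ph, &sh) on Linux fills in the two offsets for the running program itself.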

This is how the ELF header manages the whole ELF file. As an example, suppose you want to find the start address of a function in an ELF dynamic link library, given its name. First, take the offset e_phoff from the ELF header; in the program header table there, find the Phdr whose type is PT_DYNAMIC; use it to locate the dynamic section; and finally find the Elf32_Sym structure whose st_name refers to a string matching the given name, and use its st_value. This style of management is quite complicated and sometimes looks cumbersome: finding one function's start address takes four steps, ELF header >> program header >> symbol section >> function address. The underlying reason is that our computers are linearly addressed, which goes back to the von Neumann architecture, so it is an old idea.

But this structure also makes the ELF format easy to extend. Imagine that one day ELF files have to be encrypted for some reason. If you want to store the key in the ELF file, you can open a dedicated section, encrypt, whose type is ST_ENCRYPT, and that is all it takes. This shows the foresight of the format's designers (such a section does in fact exist now).

II. Code example

After all this, I still have not really discussed how a dynamic link library is loaded and called on Linux for the Intel 32 platform. For the programs we normally write, the compiler arranges for the dynamic linker ld.so to do this work. If you want to call into a dynamic link library explicitly, the following is an example.

    #include <dlfcn.h>
    #include <stdio.h>

    int main(void)
    {
        void *libc;
        int (*printf_call)(const char *, ...);
        char *error_text;

        /* Open the C library explicitly; symbols are resolved lazily. */
        if ((libc = dlopen("/lib/libc.so.6", RTLD_LAZY)) != NULL) {
            /* Look up printf by name and call it through the pointer. */
            printf_call = (int (*)(const char *, ...)) dlsym(libc, "printf");
            (*printf_call)("hello, world\n");
            dlclose(libc);
            return 0;
        }
        error_text = dlerror();
        printf("%s\n", error_text);
        return -2;
    }

The program first uses dlopen to open a dynamic link library file. Far more happens in this step than we see here, and I will spend much of the space below on it. The value it returns is a pointer, in fact a struct link_map *. dlsym then uses this struct link_map * together with the function name to find the function's address in the process; this is function resolution. Finally, dlclose releases the resources obtained in dlopen. The whole process is much like loading a loadable module into the kernel, except that this happens in user mode and that in kernel mode, and the functions involved here are more complex.

One last point: if you save the file above as test.c, you cannot compile it with the usual gcc -o test test.c. Use gcc -o test test.c -ldl instead, because dlopen, dlsym, and dlclose live in libdl.so.2, which the compiler does not find by default; -ldl tells the linker to load it.

III. Analysis of the _dl_open loading process

This article and the following ones are organized around the program above, that is, in the order dlopen >> dlsym >> dlclose. A few points first. The source code here comes from glibc 2.3.2. For portability and robustness the original code contains a great deal of defensive code and code for other platforms, most of it error handling. I have deleted all of this and kept only the code for the Intel 32 platform.

The original also handles loading dynamic link libraries from multiple threads; that is not included here (it is not supported in the current Linux kernel). So the code you see has had most of its bulk removed while keeping loading and function resolution working. It is only about a quarter of the original, keeps the original style, and highlights the core functions. Even so, nearly 2000 lines remain, so please read patiently. I will explain the likely difficulties in detail so that the intent of the design and of dynamic resolution becomes clear. The first function is in dl-open.c:

2672 void * internal_function
2673 _dl_open (const char *file, int mode, const void *caller)
2674 {
2675   struct dl_open_args args;
2676
2677   __rtld_lock_lock_recursive (GL(dl_load_lock));
2678
2679   args.file = file;
2680   args.mode = mode;
2681   args.caller = caller;
2682   args.map = NULL;
2683
2684   dl_open_worker (&args);
2685   __rtld_lock_unlock_recursive (GL(dl_load_lock));
2686
2687 }

internal_function indicates that this function takes its parameters in registers. Its definition comes from configure.in:

# define internal_function __attribute__ ((regparm (3), stdcall))

regparm is a GCC attribute; here it passes three parameters in registers. stdcall means that the called function cleans up the stack, whereas for ordinary functions, which use cdecl, the caller is responsible.

__rtld_lock_lock_recursive (GL(dl_load_lock)) and __rtld_lock_unlock_recursive (GL(dl_load_lock)) are not yet fully defined, at least not on Linux. For comparison, see linux/kmod.c, where request_module takes a lock to prevent excessive nesting. The rest of the function is a wrapper. dl_open_worker does the real work of mapping the dynamic link library and constructing a struct link_map. This is a critically important data structure. Its definition is too long to give here; I will present it in the appendix at the end of the second article, so that you can look back at it once you understand how a library is loaded and resolved. The functions below also explain its members as they are used. We now go through it in segments.

_dl_open() >> dl_open_worker()

2532 static void
2533 dl_open_worker (void *a)
2534 {
     ........
2547   args->map = new = _dl_map_object (NULL, file, 0, lt_loaded, 0, mode);

Here _dl_map_object is called to map the file into memory. The original function also searches for the dynamic link library file along several paths and compares against the SONAME (the name by which the library is known at run time); I have deleted that here.

_dl_open() >> dl_open_worker() >> _dl_map_object()

1693 struct link_map *
1694 internal_function
1695 _dl_map_object (struct link_map *loader, const char *name, int preloaded,
1696                 int type, int trace_mode, int mode)
1697 {
1698   int fd;
1699   char *realname;
1700   char *name_copy;
1701   struct link_map *l;
1702   struct filebuf fb;
1703
1704
1705   /* Look for this name among those already loaded.  */
1706   for (l = GL(dl_loaded); l; l = l->l_next)
1707     {
1708       if (!_dl_name_match_p (name, l))
           ........
1721       return l;
1722     }
1723
1724   fd = open_path (name, namelen, preloaded, &env_path_list,
1725                   &realname, &fb);
1726
1727   l = _dl_new_object (name_copy, name, type, loader);
1728
1729   return _dl_map_object_from_fd (name, fd, &fb, realname, loader, type, mode);
1730
1731
1732 } /* end of _dl_map_object */

The function first searches the chain of dynamic link libraries that are already loaded; this is lines 1706 to 1721. The reason is easy to see: an executable may depend on several dynamic link libraries, and several of those may depend on the same library, so the library being requested may already have been loaded.

The open_path call that follows is the key step. env_path_list is obtained in several ways: first, from the system environment variable; second, from the string in the section referred to by DT_RUNPATH (see the appendix); and third, and more complicated, from the environment of the other dynamic link libraries that cause this one to be loaded. These matters are not explained further here.

_dl_open() >> dl_open_worker() >> _dl_map_object() >> open_path()

1289 static int open_path (const char *name, size_t namelen, int preloaded,
1290                       struct r_search_path_struct *sps, char **realname,
1291                       struct filebuf *fbp)
1292
1293 {
1294   struct r_search_path_elem **dirs = sps->dirs;
1295   char *buf;
1296   int fd = -1;
1297   const char *current_what = NULL;
1298   int any = 0;
1299
1300   buf = alloca (max_dirnamelen + max_capstrlen + namelen);
1301
1302   do
1303     {
1304       struct r_search_path_elem *this_dir = *dirs;
1305       size_t buflen = 0;
           ........
1310       struct stat64 st;
1311
1312
1313       edp = (char *) __mempcpy (buf, this_dir->dirname, this_dir->dirnamelen);
1314       for (cnt = 0; fd == -1 && cnt < ncapstr; ++cnt)
1315         {
1317           if (this_dir->status[cnt] == nonexisting)
1318             continue;
1319
1320           buflen = ((char *) __mempcpy (__mempcpy (edp, capstr[cnt].str,
1321                                          capstr[cnt].len), name, namelen) - buf);
1322
1323
1324           fd = open_verify (buf, fbp);
1325
1326
1327           __xstat64 (_STAT_VER, buf, &st);
1328
1329
1341         }
1342       ........

alloca allocates space on the stack, so there is no need to worry about a memory leak when the function returns (a good programmer really must know memory allocation by heart). Line 1313 copies the dirname of the r_search_path_elem into buf. Lines 1320 to 1321 append to this path the entry from capstr, the platform-specific path components determined by the operating system and architecture, followed by the file name. This is a good example of technique: __mempcpy returns the address just past the last byte written to the destination, so each copy yields the position for the next one. Written with strncpy, the same thing would look like this:

strncpy (edp, capstr[cnt].str, capstr[cnt].len);
edp += capstr[cnt].len;
strncpy (edp, name, namelen);
edp += namelen;
buflen = edp - buf;

That takes five statements where the original needs one.

open_verify then opens the file named by buf, reads the first 1024 bytes of the file into fbp, and checks that the file is valid. The most important check is the ELF magic number. On success it returns a file descriptor greater than -1. With that, open_path has done its job of opening the file.

_dl_new_object allocates a struct link_map and fills in its most basic members.
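To make the copy chaining in open_path concrete, here is a small self-contained sketch. copy_end is a hypothetical stand-in for glibc's __mempcpy (it returns the address just past the bytes it wrote), and join_path and its parameter names are illustrative only, not glibc functions. As in open_path, the length of the file name includes its terminating NUL.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for __mempcpy: copy n bytes and return the address just past
   the last byte written, so that calls can be chained.  */
static void *copy_end(void *dst, const void *src, size_t n)
{
    return (char *) memcpy(dst, src, n) + n;
}

/* Hypothetical helper: build dir + sub + name in buf.  Returns the total
   length including the terminating NUL, or 0 if buf is too small.  */
size_t join_path(char *buf, size_t bufsize,
                 const char *dir, const char *sub, const char *name)
{
    size_t dirlen = strlen(dir);
    size_t sublen = strlen(sub);
    size_t namelen = strlen(name) + 1;   /* include the NUL, as open_path does */

    if (dirlen + sublen + namelen > bufsize)
        return 0;

    /* The directory is copied once; edp marks where the rest goes.  */
    char *edp = copy_end(buf, dir, dirlen);

    /* One expression appends both remaining pieces and yields the length.  */
    return (size_t) ((char *) copy_end(copy_end(edp, sub, sublen),
                                       name, namelen) - buf);
}
```

With dir "/usr/lib/", sub "i686/", and name "libm.so.6" (example strings chosen for illustration), buf ends up holding "/usr/lib/i686/libm.so.6".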

_dl_open() >> dl_open_worker() >> _dl_map_object() >> _dl_new_object()

2027 struct link_map *
2028 internal_function
2029 _dl_new_object (char *realname, const char *libname, int type,
2030                 struct link_map *loader)
2031
2032 {
2033   struct link_map *l;
2034   int idx;
2035   size_t libname_len = strlen (libname) + 1;
2036   struct link_map *new;
2037   struct libname_list *newname;
2038
2039   new = (struct link_map *) calloc (sizeof (*new) + sizeof (*newname)
2040                                     + libname_len, 1);
2041   ........
2046
2047   new->l_name = realname;
2048   new->l_type = type;
2049   new->l_loader = loader;
2050
2051   new->l_scope = new->l_scope_mem;
2052   new->l_scope_max = sizeof (new->l_scope_mem) / sizeof (new->l_scope_mem[0]);
2053
2054   if (GL(dl_loaded) != NULL)
2055     {
2056       l = GL(dl_loaded);
2057       while (l->l_next != NULL)
2058         l = l->l_next;
2059       new->l_prev = l;
2060       /* new->l_next = NULL;   Would be necessary but we use calloc.  */
2061       l->l_next = new;
2062
2063       /* Add the global scope.  */
2064       new->l_scope[idx++] = &GL(dl_loaded)->l_searchlist;
2065     }
2066   else
2067     GL(dl_loaded) = new;
2068   ++GL(dl_nloaded);
       ........
2080
2081   return new;
2082
2083 }

The allocation at line 2039 obtains the link_map, the libname_list entry, and the name in one block, which is then carved up for the individual pieces. Lines 2043 to 2053 assign the members of the struct link_map. Lines 2054 to 2067 append the new struct link_map to the linked list. This is useful later: to manage all the dynamic link libraries of one executable as a whole, it is enough to traverse this chain.

If the dynamic link library to be loaded has not yet been mapped into the process's virtual memory space, everything so far is only preparation. The real work begins in _dl_map_object_from_fd().
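The allocation and list handling in _dl_new_object can be sketched in isolation as follows. struct map_node is a much simplified stand-in for struct link_map, and append_node is a hypothetical helper; only the two ideas from the listing are kept, namely one zeroed allocation that also holds the name, and appending at the tail of the chain.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct link_map: list links plus a name.  */
struct map_node {
    struct map_node *next;
    struct map_node *prev;
    char *name;               /* points into the same allocation */
};

/* Hypothetical helper: allocate the node and its name as one zeroed block
   and append the node to the tail of the list headed by *head.
   Returns the new node, or NULL if the allocation fails.  */
struct map_node *append_node(struct map_node **head, const char *name)
{
    size_t len = strlen(name) + 1;
    struct map_node *new = calloc(sizeof *new + len, 1);
    if (new == NULL)
        return NULL;

    /* The name is stored directly after the structure.  */
    new->name = memcpy(new + 1, name, len);

    if (*head != NULL) {
        struct map_node *l = *head;
        while (l->next != NULL)
            l = l->next;
        new->prev = l;        /* new->next is already NULL because of calloc */
        l->next = new;
    } else {
        *head = new;
    }
    return new;
}
```

Because the node and its name share one block, a single free(node) releases both.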

Every step from here on is needed for the dynamic link library to do its job in the process. The function is fairly long, so we look at it in segments.

_dl_open() >> dl_open_worker() >> _dl_map_object() >> _dl_map_object_from_fd()

1391 struct link_map *
1392 _dl_map_object_from_fd (const char *name, int fd, struct filebuf *fbp,
1393                         char *realname, struct link_map *loader, int l_type,
1394                         int mode)
1395
1396 {
1397
1398   struct link_map *l = NULL;
1399   const ElfW(Ehdr) *header;
1400   const ElfW(Phdr) *phdr;
1401   const ElfW(Phdr) *ph;
1402   size_t maplength;
1403   int type;
1404   struct stat64 st;
1405
1406   __fxstat64 (_STAT_VER, fd, &st);
       ........
1413   for (l = GL(dl_loaded); l; l = l->l_next)
1414     if (l->l_ino == st.st_ino && l->l_dev == st.st_dev)
1415       {
           ........
1418         __close (fd);
           ........
1422         free (realname);
1423         add_name_to_object (l, name);
1424
1425         return l;
1426       }

Here the list of loaded objects is searched once more. This time the test is whether an existing struct link_map refers to the same file as the library being loaded. The comparison uses st_ino, the inode number of the physical file, and st_dev, the number of the device the file is on, so it works at a lower level than the name comparison; for the reasons, see "Implementation of File Sharing from Linux Memory Management". The search is repeated because a long time may pass between the process starting to open the library file and reaching this point (in my experiments, opening a file for the first time takes about 200 milliseconds, most of it disk seek and read time, which is very long for a computer). Another thread may therefore have loaded the same library in the meantime, in which case there is nothing left to do. The kernel uses the same idea when opening files.
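The identity test used at line 1414 can be tried outside the loader with the ordinary stat call. same_file below is a hypothetical helper written for this illustration; it only shows that two different path names can be recognized as one file by comparing the device and inode numbers.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if both paths name the same file,
   0 if they name different files, and -1 if either cannot be examined.  */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;

    /* Same inode on the same device means the same physical file,
       whatever names were used to reach it.  */
    return sa.st_ino == sb.st_ino && sa.st_dev == sb.st_dev;
}
```

For example, same_file(".", "./.") reports a match even though the two strings differ, which is exactly why the loader does not rely on names alone.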

_dl_open() >> dl_open_worker() >> _dl_map_object() >> _dl_map_object_from_fd()

1427
1428   /* This is the ELF header.  We read it in `open_verify'.  */
1429   header = (void *) fbp->buf;
1430
1431   l->l_entry = header->e_entry;
1432   type = header->e_type;
1433   l->l_phnum = header->e_phnum;
1434
1435   maplength = header->e_phnum * sizeof (ElfW(Phdr));
1436

This segment uses the ELF header to make a few preparations (for reading the array of Phdr entries).

_dl_open() >> dl_open_worker() >> _dl_map_object() >> _dl_map_object_from_fd()

1438   /* Scan the program header table, collecting its load commands.  */
1439   struct loadcmd
1440     {
1441       ElfW(Addr) mapstart, mapend, dataend, allocend;
1442       off_t mapoff;
1443       int prot;
1444     } loadcmds[l->l_phnum], *c;
1445   size_t nloadcmds = 0;

The data structure is defined inside the function, which keeps the definition local; the effect is the same as private in object-oriented languages.

_dl_open() >> dl_open_worker() >> _dl_map_object() >> _dl_map_object_from_fd()

1448   for (ph = phdr; ph < &phdr[l->l_phnum]; ++ph)
1449     switch (ph->p_type)
1450       {
           ........
1454       case PT_DYNAMIC:
1455         l->l_ld = (void *) ph->p_vaddr;
1456         l->l_ldnum = ph->p_memsz / sizeof (ElfW(Dyn));
1457         break;
1458
1459       case PT_PHDR:
1460         l->l_phdr = (void *) ph->p_vaddr;
1461         break;
1462
1463       case PT_LOAD:
           ........
1467         c = &loadcmds[nloadcmds++];
1468         c->mapstart = ph->p_vaddr & ~(ph->p_align - 1);
1469         c->mapend = ((ph->p_vaddr + ph->p_filesz + GL(dl_pagesize) - 1)
1470                      & ~(GL(dl_pagesize) - 1));
1471         c->dataend = ph->p_vaddr + ph->p_filesz;
1472         c->allocend = ph->p_vaddr + ph->p_memsz;
1473         c->mapoff = ph->p_offset & ~(ph->p_align - 1);
           ........
1480         c->prot = 0;
1481         if (ph->p_flags & PF_R)
1482           c->prot |= PROT_READ;
1483         if (ph->p_flags & PF_W)
1484           c->prot |= PROT_WRITE;
1485         if (ph->p_flags & PF_X)
1486           c->prot |= PROT_EXEC;
1488         break;
           ........
1493       }

In the ELF specification, different program header types serve different purposes and need different handling; see Appendix B. There is no default branch here, but the effect is the same as writing

default: continue;

which shows how concise the code is. PT_LOAD is the special case: every loadable segment is recorded in the loadcmds array, which is a good idea. The use of the pointer is also worth learning from (line 1467: c = &loadcmds[nloadcmds++];).
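The arithmetic in the PT_LOAD case can be checked with a few small functions. The names align_down, align_up, and phdr_flags_to_prot are hypothetical; the expressions follow lines 1468 to 1486, and the alignment is assumed to be a power of two, as it is for page sizes.

```c
#include <elf.h>
#include <stdint.h>
#include <sys/mman.h>

/* Round addr down to a multiple of align (align must be a power of two).  */
uintptr_t align_down(uintptr_t addr, uintptr_t align)
{
    return addr & ~(align - 1);
}

/* Round addr up to a multiple of align (align must be a power of two).  */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    return (addr + align - 1) & ~(align - 1);
}

/* Translate the PF_* flags of a program header into PROT_* bits for mmap.  */
int phdr_flags_to_prot(uint32_t p_flags)
{
    int prot = 0;

    if (p_flags & PF_R)
        prot |= PROT_READ;
    if (p_flags & PF_W)
        prot |= PROT_WRITE;
    if (p_flags & PF_X)
        prot |= PROT_EXEC;
    return prot;
}
```

As a worked case with illustrative numbers: for p_vaddr 0x8049f14, p_filesz 0x100, and a 0x1000-byte page, mapstart is align_down(0x8049f14, 0x1000) = 0x8049000 and mapend is align_up(0x804a014, 0x1000) = 0x804b000.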

_dl_open() >> dl_open_worker() >> _dl_map_object() >> _dl_map_object_from_fd()

1498   c = loadcmds;
       ........
1501   maplength = loadcmds[nloadcmds - 1].allocend - c->mapstart;
1502
1503   if (__builtin_expect (type, ET_DYN) == ET_DYN)
1504     {
         ........
1521       l->l_map_start = (ElfW(Addr)) __mmap ((void *) 0, maplength,
1522                                             c->prot, MAP_COPY | MAP_FILE,
1523                                             fd, c->mapoff);
1524
1525       l->l_map_end = l->l_map_start + maplength;
1526       l->l_addr = l->l_map_start - c->mapstart;
         ........
1535       __mprotect ((caddr_t) (l->l_addr + c->mapend),
1536                   loadcmds[nloadcmds - 1].allocend - c->mapend,
1537                   PROT_NONE);
1538
1539       goto postmap;
1540     }

Lines 1521 to 1526 perform the mapping. Lines 1498 and 1501 compute its length from the first and the last PT_LOAD program headers. The test at line 1503 matches our scenario, since what is being loaded is a dynamic link library. Line 1535 changes the protection of part of the mapped virtual memory: the region from the end of the first segment up to the highest address is made inaccessible. This is a safeguard, so that nobody can exploit that area.

_dl_open() >> dl_open_worker() >> _dl_map_object() >> _dl_map_object_from_fd()

1546   while (c < &loadcmds[nloadcmds])
1547     {
1548
1549     postmap:
1550       if (l->l_phdr == 0
1551           && (ElfW(Off)) c->mapoff <= header->e_phoff
1552           && ((size_t) (c->mapend - c->mapstart + c->mapoff)
1553               >= header->e_phoff + header->e_phnum * sizeof (ElfW(Phdr))))
           ........
1555         l->l_phdr = (void *) (c->mapstart + header->e_phoff - c->mapoff);
1556
1557       if (c->allocend > c->dataend)
1558         {
           ........
1561           ElfW(Addr) zero, zeroend, zeropage;
1562
1563           zero = l->l_addr + c->dataend;
1564           zeroend = l->l_addr + c->allocend;
1565           zeropage = ((zero + GL(dl_pagesize) - 1)
1566                       & ~(GL(dl_pagesize) - 1));
1567
1568           if (zeroend < zeropage)
               ........
1573           if (zeropage > zero)
1574             {
               ........
1576               if ((c->prot & PROT_WRITE) == 0)
1577                 {
1578                   /* Dag nab it.  */
1579                   __mprotect ((caddr_t) (zero & ~(GL(dl_pagesize)
1580                               - 1)), GL(dl_pagesize),
1581                               c->prot | PROT_WRITE);
1582
1583                 }
1584               memset ((void *) zero, '\0', zeropage - zero);
1585               if ((c->prot & PROT_WRITE) == 0)
1586                 __mprotect ((caddr_t) (zero & ~(GL(dl_pagesize) - 1)),
1587                             GL(dl_pagesize), c->prot);
1588             }
1589
1590           if (zeroend > zeropage)
1591             {
               ........
1593               caddr_t mapat;
1594               mapat = __mmap ((caddr_t) zeropage, zeroend - zeropage,
1595                               c->prot, MAP_ANON | MAP_PRIVATE | MAP_FIXED,
1596                               ANONFD, 0);
1597
1598             }
1599         }
1600
1601       ++c;
1602     }

This stage is similar to the previous one: the file is mapped according to the attributes taken from the PT_LOAD program headers. The difference arises when zeroend > zeropage; then the remainder is mapped as data space private to the process. This is where the uninitialized data area, bss, usually lives. zeroend is the end of the memory the segment needs, and zeropage is the page-aligned end of the content mapped from the file, so the space between them is reserved for uninitialized data; this is what lines 1593 to 1597 do. For the tail of the last file-backed page, the page is made writable if necessary and then filled with zeros.

_dl_open() >> dl_open_worker() >> _dl_map_object() >> _dl_map_object_from_fd()

1606   if (l->l_phdr == NULL)
1607     {
       ........
1611       ElfW(Phdr) *newp = (ElfW(Phdr) *) malloc (header->e_phnum
1612                                                 * sizeof (ElfW(Phdr)));
1613
1614       l->l_phdr = memcpy (newp, phdr,
1615                           (header->e_phnum * sizeof (ElfW(Phdr))));
1616       l->l_phdr_allocated = 1;
1617     }
1618   else
1619     /* Adjust the PT_PHDR value by the runtime load address.  */
1620     (ElfW(Addr)) l->l_phdr += l->l_addr;

This places phdr, the program header table, under the management of the struct link_map. In the usual case l_phdr is not NULL, so it only has to be adjusted by the load address.
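The split between the part that is cleared with memset and the part that is mapped anonymously can be worked through with plain arithmetic. The sketch below performs no mapping at all; struct zero_plan and plan_zero_fill are hypothetical names, and the computation follows lines 1563 to 1596, including the case in which all the extra data fits in the last file-backed page.

```c
#include <stdint.h>

/* Result of the computation: what must be zeroed and what must be mapped.  */
struct zero_plan {
    uintptr_t memset_len;   /* bytes to clear in the last file-backed page */
    uintptr_t anon_start;   /* page-aligned start of the anonymous mapping */
    uintptr_t anon_len;     /* bytes to map anonymously, 0 if none */
};

/* dataend is where the file contents end in memory, allocend is where the
   segment ends in memory; pagesize must be a power of two.  */
struct zero_plan plan_zero_fill(uintptr_t dataend, uintptr_t allocend,
                                uintptr_t pagesize)
{
    struct zero_plan p = { 0, 0, 0 };
    uintptr_t zero = dataend;
    uintptr_t zeroend = allocend;
    uintptr_t zeropage = (zero + pagesize - 1) & ~(pagesize - 1);

    /* All the extra data lies within the last page of the segment.  */
    if (zeroend < zeropage)
        zeropage = zeroend;

    /* Tail of the file-backed page: cleared by hand.  */
    if (zeropage > zero)
        p.memset_len = zeropage - zero;

    /* Whole pages beyond that: supplied by an anonymous mapping.  */
    if (zeroend > zeropage) {
        p.anon_start = zeropage;
        p.anon_len = zeroend - zeropage;
    }
    return p;
}
```

With the illustrative values dataend 0x1f14, allocend 0x3100, and a 0x1000-byte page, 0xec bytes are cleared by hand and 0x1100 bytes starting at 0x2000 come from the anonymous mapping.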

_dl_open() >> dl_open_worker() >> _dl_map_object() >> _dl_map_object_from_fd()

1625   elf_get_dynamic_info (l);

elf_get_dynamic_info, called here, is one of the most important functions in the loading process, because almost all later management of dynamic linking goes through the l_info array that it fills in.

_dl_open() >> dl_open_worker() >> _dl_map_object() >> _dl_map_object_from_fd() >> elf_get_dynamic_info()

2826 static inline void __attribute__ ((unused, always_inline))
2827 elf_get_dynamic_info (struct link_map *l)
2828 {
2829   ElfW(Dyn) *dyn = l->l_ld;
2830   ElfW(Dyn) **info;
2831
2832
2833   info = l->l_info;
2834
2835   while (dyn->d_tag != DT_NULL)
2836     {
2837       if (dyn->d_tag < DT_NUM)
2838         info[dyn->d_tag] = dyn;
           ........
2853       ++dyn;
2854     }
       ........
2858   if (l->l_addr != 0)
2859     {
2860       ElfW(Addr) l_addr = l->l_addr;
2861
2862       if (info[DT_HASH] != NULL)
2863         info[DT_HASH]->d_un.d_ptr += l_addr;
2864       if (info[DT_PLTGOT] != NULL)
2865         info[DT_PLTGOT]->d_un.d_ptr += l_addr;
2866       if (info[DT_STRTAB] != NULL)
2867         info[DT_STRTAB]->d_un.d_ptr += l_addr;
2868       if (info[DT_SYMTAB] != NULL)
2869         info[DT_SYMTAB]->d_un.d_ptr += l_addr;
       ........
2876       if (info[DT_REL] != NULL)
2877         info[DT_REL]->d_un.d_ptr += l_addr;
       ........
2880       if (info[DT_JMPREL] != NULL)
2881         info[DT_JMPREL]->d_un.d_ptr += l_addr;
2882       if (info[VERSYMIDX (DT_VERSYM)] != NULL)
2883         info[VERSYMIDX (DT_VERSYM)]->d_un.d_ptr += l_addr;
2884     }
       ........
2889 }

The unused in __attribute__ suppresses the warning that the compiler would otherwise issue under -Wall if this function turns out not to be used, and always_inline, as the name says, forces the function to be inlined. The l->l_ld on line 2829 was assigned at line 1455 of _dl_map_object_from_fd above; it is the address of the dynamic section, which describes everything related to dynamic linking (see the explanation in Appendix B). The loop in lines 2835 to 2854 clearly fills in l_info.
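The two steps of elf_get_dynamic_info, indexing the dynamic entries by tag and then rebasing the address-valued ones, can be sketched on a hand-built array. index_dynamic is a hypothetical helper, the ElfW macro is assumed to come from <link.h> on a glibc system, and only a few of the tags from the listing are rebased.

```c
#include <elf.h>
#include <link.h>
#include <stddef.h>

/* Hypothetical helper: point info[tag] at each entry of a DT_NULL-terminated
   dynamic array, then add the load address to the address-valued entries.  */
void index_dynamic(ElfW(Dyn) *dyn, ElfW(Dyn) *info[DT_NUM], ElfW(Addr) l_addr)
{
    static const int ptr_tags[] = {
        DT_HASH, DT_PLTGOT, DT_STRTAB, DT_SYMTAB, DT_REL, DT_JMPREL
    };
    size_t i;

    for (i = 0; i < DT_NUM; ++i)
        info[i] = NULL;

    /* The tag itself is the array index, so later lookups are O(1).  */
    for (; dyn->d_tag != DT_NULL; ++dyn)
        if (dyn->d_tag >= 0 && dyn->d_tag < DT_NUM)
            info[dyn->d_tag] = dyn;

    /* A shared object is linked relative to address 0, so every stored
       address must be moved by the address it was loaded at.  */
    if (l_addr != 0)
        for (i = 0; i < sizeof ptr_tags / sizeof ptr_tags[0]; ++i)
            if (info[ptr_tags[i]] != NULL)
                info[ptr_tags[i]]->d_un.d_ptr += l_addr;
}
```

After the call, a lookup such as info[DT_STRTAB]->d_un.d_ptr yields the run-time address of the string table directly.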

The l_info array plays a large role later, because the sections it points to hold the function names and the information needed to locate them. The array index is the d_tag value itself, and the code is simple. Lines 2856 to 2885 adjust the addresses for a dynamic link library (every section adjusted there matters for function resolution; for details see Appendix A). Thinking a little further, the earlier function, starting at line 1521, maps the whole file into one contiguous region of memory, and that pays off here: if the mapping were not contiguous, no uniform adjustment would be possible.

_dl_open() >> dl_open_worker() >> _dl_map_object() >> _dl_map_object_from_fd()

1662   /* Finally the file information.  */
1663   l->l_dev = st.st_dev;
1664   l->l_ino = st.st_ino;
1667   return l;
1670 }

This completes _dl_map_object_from_fd. Look back at the search for already loaded files at line 1414 and the purpose of these assignments becomes clear. We now return to dl_open_worker.

_dl_open() >> dl_open_worker()

2550   /* It was already open.  */
2551   if (new->l_searchlist.r_list != NULL)
2552     {
       ........
2556       if ((mode & RTLD_GLOBAL) && new->l_global == 0)
2557         (void) add_to_global (new);
2558
2559       /* Increment just the reference counter of the object.  */
2560       ++new->l_opencount;
2561
2562       return;
2563     }

If the object is already open, the function does little more than increment l_opencount and return. Why the test at line 2551 works depends on the code below: _dl_map_object_deps is what fills in l_searchlist.

_dl_open() >> dl_open_worker()

2565   /* Load that object's dependencies.  */
2566   _dl_map_object_deps (new, NULL, 0, 0, mode & __RTLD_DLOPEN);
       ........
2573   l = new;
2574   while (l->l_next)
2575     l = l->l_next;
2576   while (1)
2577     {
2578       if (!l->l_relocated)
2579         {
2580           _dl_relocate_object (l, l->l_scope, lazy, 0);
2581         }
2582
2583       if (l == new)
2584         break;
2585       l = l->l_prev;
2586     }

_dl_map_object_deps fills in l_searchlist.r_list. Because this function and _dl_relocate_object below are closely tied to function resolution, I cover them in the next article, "Loading, Resolution, and Case Analysis of ELF Dynamic Linking on Linux for the Intel Platform (2): Function Resolution and Unloading". In brief, the struct link_map * of every dynamic link library that the newly loaded library depends on is placed in this list (l_searchlist), and _dl_relocate_object relocates the functions in a dynamic link library. The while (1) loop at line 2576 is needed because _dl_map_object_deps has also loaded the libraries that this library depends on, and they have to be relocated as well.
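The traversal order of the relocation loop is easy to get wrong when reading it, so here is a sketch on a toy list. struct obj and relocate_from_tail are hypothetical, and "relocating" is reduced to setting a flag and recording an id; the walk itself follows lines 2573 to 2586: go to the tail of the chain, then work backwards until the newly opened object has been handled.

```c
#include <stddef.h>

/* Toy stand-in for struct link_map.  */
struct obj {
    struct obj *next;
    struct obj *prev;
    int relocated;
    int id;
};

/* Hypothetical helper: visit objects from the list tail back to new_obj and
   mark each one not yet relocated.  The ids are stored in order[] in the
   sequence they were handled; the return value is how many were handled.  */
size_t relocate_from_tail(struct obj *new_obj, int *order, size_t max)
{
    size_t n = 0;
    struct obj *l = new_obj;

    /* Dependencies were appended after new_obj, so find the tail first.  */
    while (l->next != NULL)
        l = l->next;

    for (;;) {
        if (!l->relocated) {
            l->relocated = 1;          /* the real code relocates here */
            if (n < max)
                order[n++] = l->id;
        }
        if (l == new_obj)
            break;
        l = l->prev;
    }
    return n;
}
```

For a chain 1, 2, 3, 4 in which object 2 is the newly opened one and object 4 is already relocated, the objects handled are 3 and then 2; object 1, which precedes the new object, is never visited.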

_dl_open () >> dl_open_worker ()

2592   for (i = 0; i < new->l_searchlist.r_nlist; ++i)
2593     if (new->l_searchlist.r_list[i]->l_opencount > 1
2594         && new->l_searchlist.r_list[i]->l_type == lt_loaded)
2595       {
2596         struct link_map *imap = new->l_searchlist.r_list[i];
2597         struct r_scope_elem **runp = imap->l_scope;
2598         size_t cnt = 0;
2599
2600         while (*runp != NULL)
2601           {
               ............
2605             if (*runp == &new->l_searchlist)
2606               break;
2607
2608             ++cnt;
2609             ++runp;
2610           }
2611
2612         if (*runp != NULL)
2613           /* Avoid duplicates.  */
2614           continue;
             ............
2642         imap->l_scope[cnt++] = &new->l_searchlist;
2643         imap->l_scope[cnt] = NULL;
2644       }

In terms of implementation this code is simple. For each library in the l_searchlist of the newly added library new (these are the dependencies loaded earlier by _dl_map_object_deps), it searches imap->l_scope. If runp finds &new->l_searchlist already there, imap->l_scope does not need to grow; if not, lines 2616 to 2644 extend it. The background is that &new->l_searchlist effectively stands for new itself. The situation generally arises when a dependency was loaded before new was (the reason is given in the next article, in the discussion of dynamic link library function resolution). Nor can we rule out two dynamic link libraries depending on each other; the code here is the remedy for that case.

_dl_open () >> dl_open_worker ()

2647   _dl_init (new, __libc_argc, __libc_argv, __environ);

This calls the initialization functions of the dynamic link library, much as insmod ends up calling init_module. As for __libc_argc, __libc_argv and __environ, they are the values passed in when the program was started from bash; an ordinary dynamic link library has little use for them.
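Returning to the scope update at lines 2592 to 2644: it boils down to appending a pointer to a NULL-terminated array unless it is already there. A minimal sketch of that idea follows. scope_add_unique is a hypothetical helper of mine; it makes no attempt to reproduce the elided lines 2616 to 2641 and simply calls realloc on every append.

```c
#include <stddef.h>
#include <stdlib.h>

/* Append `item` to the NULL-terminated pointer array `*scope` unless it is
   already present. Returns 1 if appended, 0 if it was a duplicate, and -1
   if memory could not be allocated. */
static int scope_add_unique(const void ***scope, const void *item)
{
    size_t cnt = 0;
    const void **list = *scope;

    if (list != NULL)
        for (; list[cnt] != NULL; ++cnt)
            if (list[cnt] == item)
                return 0;                     /* avoid duplicates */

    /* Room for the existing entries, the new one, and the terminator. */
    const void **grown = realloc(list, (cnt + 2) * sizeof *grown);
    if (grown == NULL)
        return -1;

    grown[cnt++] = item;                      /* same pattern as line 2642 */
    grown[cnt] = NULL;                        /* same pattern as line 2643 */
    *scope = grown;
    return 1;
}
```

Comparing pointers rather than contents is enough here, because each search list has exactly one address for the lifetime of its library.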

_dl_open () >> dl_open_worker () >> _dl_init ()

1118 void
1119 internal_function
1120 _dl_init (struct link_map *main_map, int argc, char **argv, char **env)
1121 {
1122
1123   ElfW(Dyn) *preinit_array = main_map->l_info[DT_PREINIT_ARRAY];
1124   ElfW(Dyn) *preinit_array_size = main_map->l_info[DT_PREINIT_ARRAYSZ];
1125   unsigned int i;
1126
1127
1128       ElfW(Addr) *addrs;
1129       unsigned int cnt;
1130
1131
1132       addrs = (ElfW(Addr) *) (preinit_array->d_un.d_ptr + main_map->l_addr);
1133       for (cnt = 0; cnt ......
       ...... l_searchlist.r_nlist;
1147   while (i-- > 0)
1148     call_init (main_map->l_initfini[i], argc, argv, env);
1149
1150
1151
1152
1153 }

The DT_PREINIT_ARRAY entries are called first, followed by the init functions. I think the reason for implementing it this way is not only to give developers of dynamic link libraries a better development interface, but also to let some initialization work run before that of the dynamic link libraries an object depends on, with object-oriented constructors in mind.
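The loop at lines 1147 and 1148 walks l_initfini from the last entry to the first. As a rough sketch, assuming that list places an object ahead of the objects it depends on, the backwards walk runs the dependencies' initializers first. struct obj and init_in_reverse are hypothetical names used only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a loaded object that has one initializer. */
struct obj {
    const char *name;
    bool init_called;
};

/* Run each object's initializer once, walking `list` from the last entry
   to the first. `order` records the names in the order they ran.
   Returns the number of initializers that ran. */
static size_t init_in_reverse(struct obj **list, size_t n, const char **order)
{
    size_t ran = 0;
    size_t i = n;

    while (i-- > 0) {
        struct obj *o = list[i];

        if (o->init_called)
            continue;              /* same kind of guard as l_init_called */
        o->init_called = true;
        order[ran++] = o->name;    /* stands in for running the init code */
    }
    return ran;
}
```

With a list of { application library, its dependency }, the dependency is recorded first and the application library second, and anything already initialized by an earlier load is skipped.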

_dl_open () >> dl_open_worker () >> _dl_init () >> call_init ()

1072 static void
1073 call_init (struct link_map *l, int argc, char **argv, char **env)
1074 {
1075
1076   if (l->l_init_called)
1078     return;
1079
1082   l->l_init_called = 1;
       ..........
1089   if (l->l_info[DT_INIT] != NULL)
1090     {
1091       init_t init = (init_t) DL_DT_INIT_ADDRESS (l, l->l_addr + l->l_info[DT_INIT]->d_un.d_ptr);
1092
1093       /* Call the function.  */
1094       init (argc, argv, env);
1095     }
1098   ElfW(Dyn) *init_array = l->l_info[DT_INIT_ARRAY];
1099   if (init_array != NULL)
1100     {
1101       unsigned int j;
1102       unsigned int jm;
1103       ElfW(Addr) *addrs;
1104
1105       jm = l->l_info[DT_INIT_ARRAYSZ]->d_un.d_val / sizeof (ElfW(Addr));
1106
1107       addrs = (ElfW(Addr) *) (init_array->d_un.d_ptr + l->l_addr);
1108       for (j = 0; j
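Lines 1105 to 1107 turn a byte size into an entry count and a relative address into a table of function addresses. The following is only my own illustration of iterating such a table, not the remainder of the glibc listing; run_init_array, init_fn and the two sample initializers are hypothetical names, and the integer-to-function-pointer cast relies on the usual Linux behaviour where both have the same width.

```c
#include <stddef.h>
#include <stdint.h>

/* Signature shared by the initializers in this sketch. */
typedef void (*init_fn)(int, char **, char **);

static int init_calls;   /* bumped by the sample initializers below */

static void sample_init_a(int argc, char **argv, char **env)
{
    (void) argc; (void) argv; (void) env;
    init_calls += 1;
}

static void sample_init_b(int argc, char **argv, char **env)
{
    (void) argc; (void) argv; (void) env;
    init_calls += 10;
}

/* Call every address in `addrs` as an initializer. `size_bytes` is the size
   of the whole table in bytes, as a DT_INIT_ARRAYSZ-style value would be.
   Returns the number of entries called. */
static size_t run_init_array(const uintptr_t *addrs, size_t size_bytes,
                             int argc, char **argv, char **env)
{
    size_t n = size_bytes / sizeof(uintptr_t);   /* bytes to entry count */

    for (size_t j = 0; j < n; ++j)
        ((init_fn) addrs[j])(argc, argv, env);
    return n;
}
```

A table holding the addresses of sample_init_a and sample_init_b, passed together with its size in bytes, results in both functions being called once, in table order.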

However, several questions have not been addressed yet:

1. How do function calls in the executable get directed to the function bodies in the dynamic link library?
2. What is the relationship between a dynamic link library and the libraries it depends on, and how are they connected?
3. How is a function resolved dynamically, and how are its caller and its implementation tied together?

I will answer these questions in the next article of "Loading, parsing, and instance analysis of ELF file dynamic links in Linux under Intel platform". Please look forward to it.

Appendix A: Dynamic Link Section Type and Description
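Each entry described in this appendix is one element of an array that ends with a DT_NULL entry. As a small sketch of how such an array is searched, assuming the ElfW(Dyn) type from <link.h> on a glibc system, the hypothetical helper find_dyn below returns the first entry with a given tag.

```c
#include <elf.h>
#include <link.h>
#include <stddef.h>

/* Return the first entry whose d_tag equals `tag`, or NULL if the
   terminating DT_NULL entry is reached first. */
static const ElfW(Dyn) *find_dyn(const ElfW(Dyn) *dyn, long tag)
{
    for (; dyn->d_tag != DT_NULL; ++dyn)
        if (dyn->d_tag == tag)
            return dyn;
    return NULL;
}
```

Whether the caller then reads d_un.d_val or d_un.d_ptr depends on the tag, as the d_un column of the table indicates.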

Type                Value  d_un            Executable   Shared object  Description
DT_NULL             0      none            mandatory    mandatory      Marks the end of the dynamic link section.
DT_NEEDED           1      d_val           optional     optional       d_val is the string-table offset of a null-terminated string giving the name, possibly with a path, of a file that this dynamic link file or executable depends on.
DT_PLTRELSZ         2      d_val           optional     optional       d_val is the size of the relocation entries for the procedure linkage table; used together with DT_JMPREL.
DT_PLTGOT           3      d_ptr           optional     optional       d_ptr is the start address of the procedure linkage table or the global offset table.
DT_HASH             4      d_ptr           mandatory    mandatory      d_ptr is the start address of the symbol hash table.
DT_STRTAB           5      d_ptr           mandatory    mandatory      d_ptr is the start address of the symbol name string table.
DT_SYMTAB           6      d_ptr           mandatory    mandatory      d_ptr is the start address of the symbol table, an array of Elf32_Sym structures.
DT_STRSZ            10     d_val           mandatory    mandatory      d_val is the size of the DT_STRTAB section above.
DT_SYMENT           11     d_val           mandatory    mandatory      d_val is the size of each Elf32_Sym structure in DT_SYMTAB.
DT_INIT             12     d_ptr           optional     optional       d_ptr is the start address of the initialization function called when the dynamic link library is loaded.
DT_FINI             13     d_ptr           optional     optional       d_ptr is the start address of the termination function called when the dynamic link library is unloaded.
DT_REL              17     d_ptr           mandatory    optional       Similar to DT_RELA; d_ptr is the start address of the Elf32_Rel relocation entries, the form used on the Intel platform.
DT_RELSZ            18     d_val           mandatory    optional       d_val is the total size of the DT_REL section above.
DT_RELENT           19     d_val           mandatory    optional       d_val is the size of one Elf32_Rel entry in DT_REL.
DT_PLTREL           20     d_val           optional     optional       Relates to the procedure linkage table: d_val is DT_REL or DT_RELA, that is, 17 if this ELF file uses DT_REL and 7 if it uses DT_RELA.
DT_JMPREL           23     d_ptr           optional     optional       The most important Elf_Dyn entry for our purposes. d_ptr is the start address of the relocation entries for the procedure linkage table; these entries refer to slots in the GOT (global offset table), which is in effect a table of imported function and global variable addresses.
DT_INIT_ARRAY       25     d_ptr           optional     optional       d_ptr is the relative start address of the jump table of initialization functions.
DT_FINI_ARRAY       26     d_ptr           optional     optional       d_ptr is the relative start address of the jump table of functions called at termination.
DT_INIT_ARRAYSZ     27     d_val           optional     optional       d_val is the size of DT_INIT_ARRAY above.
DT_FINI_ARRAYSZ     28     d_val           optional     optional       d_val is the size of DT_FINI_ARRAY above.
DT_ENCODING         32     d_val or d_ptr  unspecified  unspecified    Currently unspecified; this entry looks to be reserved for future use.
DT_PREINIT_ARRAY    32     d_ptr           optional     no             d_ptr is the start address of the jump table of initialization functions called before main.
DT_PREINIT_ARRAYSZ  33     d_val           optional     no             d_val is the size of DT_PREINIT_ARRAY above.

The table above lists only the entries used in this article. The designers of the ELF specification also left room for system-specific and platform-specific entries, which are not listed here.

Appendix B: Dynamic Link Library Program Header Type Description

Please credit the original address when reposting: https://www.9cbs.com/read-85194.html
