Linux Kernel Core Chinese Manual (4) - Process

xiaoxiao2021-03-06  41

This chapter describes what a process is and how the Linux kernel creates, manages and deletes the processes in the system.

Processes carry out tasks within the operating system. A program is a passive entity: a set of machine code instructions and data held in an executable image on disk. A process, by contrast, can be thought of as a computer program in action; it is a dynamic entity, constantly changing as the processor executes its machine code instructions. As well as the program's instructions and data, a process also includes the program counter and the other CPU registers, together with stacks holding temporary data such as routine parameters, return addresses and saved variables. The currently executing program, or process, encompasses all of the current activity in the microprocessor.

Linux is a multiprocessing operating system. Processes are separate tasks, each with its own rights and responsibilities. If one process crashes, it should not cause another process in the system to crash. Each individual process runs in its own virtual address space and cannot interact with another process except through secure, kernel-managed mechanisms.

During its lifetime a process uses many system resources. It uses the system's CPUs to run its instructions and the system's physical memory to hold it and its data. It opens and uses files in the file systems and may directly or indirectly use the system's physical devices. Linux must keep track of each process and of the resources it uses so that it can manage it and the other processes in the system fairly. It would not be fair to the other processes if one process monopolized most of the system's physical memory or its CPUs.

The most precious resource in the system is the CPU; usually there is only one. Linux is a multiprocessing operating system whose goal is to have a process running on every CPU in the system at all times, making full use of the CPU.
If there are more processes than CPUs (and there usually are), the rest of the processes must wait until a CPU becomes free before they can run. Multiprocessing is a simple idea: a process runs until it must wait, usually for some system resource; once it has that resource, it may run again. In a uniprocessing system such as DOS, the CPU would simply sit idle and the waiting time would be wasted. In a multiprocessing system many processes are kept in memory at the same time; whenever a process has to wait, the operating system takes the CPU away from it and gives it to another, more deserving process. It is the scheduler that chooses the next most appropriate process to run, and Linux uses a number of scheduling strategies to ensure fairness.

Linux supports a number of different executable file formats; ELF is one, Java is another. These must be managed transparently, as must the process's use of the system's shared libraries.

4.1 Linux Processes

So that Linux can manage the processes in the system, each process is represented by a task_struct data structure (task and process are terms that Linux uses interchangeably). The task vector is an array of pointers to every task_struct data structure in the system. This means that the maximum number of processes in the system is limited by the size of the task vector; by default it has 512 entries. As processes are created, a new task_struct is allocated from system memory and added into the task vector. To make it easy to find, the currently running process is pointed to by the current pointer.

See include/linux/sched.h

As well as the normal type of process, Linux supports real-time processes. These processes have to react very quickly to external events (hence the term "real-time"), and the scheduler treats them differently from ordinary user processes.
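The task vector just described can be sketched in a few lines of C. This is an illustrative sketch only, not the kernel's actual code: the structure here is cut down to a single field, and slot allocation is simplified to a linear scan.

```c
#include <stddef.h>

#define NR_TASKS 512   /* default size of the task vector described above */

/* Cut-down stand-in for the kernel's task_struct. */
struct task_struct {
    int pid;
};

/* The task vector: an array of pointers to per-process structures. */
static struct task_struct *task[NR_TASKS];

/* Creating a process means finding a free slot in the vector.  When
 * every slot is taken, no more processes can be created. */
static int add_task(struct task_struct *t)
{
    for (int i = 0; i < NR_TASKS; i++) {
        if (task[i] == NULL) {
            task[i] = t;
            return i;   /* slot index in the task vector */
        }
    }
    return -1;          /* task vector full */
}
```

This is why the default maximum of 512 entries bounds the number of processes: a fork that finds no free slot must fail.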

Although the task_struct data structure is quite large and complex, its fields can be divided into a number of functional areas:

State  As a process executes it changes state according to its circumstances. Linux processes use the following states (SWAPPING is listed elsewhere but appears not to be used):

Running  The process is either running (it is the current process of the system) or it is ready to run (waiting to be assigned to one of the system's CPUs).

Waiting  The process is waiting for an event or for a resource. Linux differentiates between two types of waiting process: interruptible and uninterruptible. Interruptible waiting processes can be interrupted by signals, whereas uninterruptible waiting processes are waiting directly on hardware conditions and cannot be interrupted under any circumstances.

Stopped  The process has been stopped, usually by receiving a signal. A process that is being debugged can be in a stopped state.

Zombie  This is a halted process which, for some reason, still has a task_struct data structure in the task vector. It is what it sounds like: a dead process.

Scheduling Information  The scheduler needs this information in order to fairly decide which process in the system most deserves to run.

Identifiers  Every process in the system has a process identifier. The process identifier is not an index into the task vector; it is simply a number. Each process also has user and group identifiers; these are used to control the process's access to the files and devices in the system.

Inter-Process Communication  Linux supports the classic Unix IPC mechanisms of signals, pipes and semaphores, as well as the System V IPC mechanisms of shared memory, semaphores and message queues. The IPC mechanisms supported by Linux are described in Chapter 5.

Links  In a Linux system no process is completely independent of the other processes. Every process in the system, except the initial process, has a parent process.
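The states above can be captured as a C enumeration. The names below are modeled loosely on the kernel's TASK_* constants in include/linux/sched.h; this is an illustrative sketch, not the actual header.

```c
/* Illustrative process states, loosely modeled on the kernel's
 * TASK_* constants in include/linux/sched.h. */
enum task_state {
    TASK_RUNNING,          /* running, or ready and waiting for a CPU */
    TASK_INTERRUPTIBLE,    /* waiting, but signals may wake it */
    TASK_UNINTERRUPTIBLE,  /* waiting directly on hardware; ignores signals */
    TASK_STOPPED,          /* stopped, e.g. by a signal or a debugger */
    TASK_ZOMBIE            /* dead, but task_struct not yet released */
};

/* Can the scheduler consider this process for the CPU? */
static int is_runnable(enum task_state s)
{
    return s == TASK_RUNNING;
}

/* Can a signal wake this process out of its wait? */
static int wakes_on_signal(enum task_state s)
{
    return s == TASK_INTERRUPTIBLE;
}
```

The two helper predicates mirror the distinctions made in the text: only Running processes compete for the CPU, and only interruptible waits can be broken by a signal.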
New processes are not created; they are copied, or rather cloned, from previous processes. Every task_struct representing a process keeps pointers to its parent process, to its siblings (those processes with the same parent) and to its own child processes. On a Linux system you can see the family relationships between the running processes with the pstree command:

init(1)-+-crond(98)
        |-emacs(387)
        |-gpm(146)
        |-inetd(110)
        |-kerneld(18)
        |-kflushd(2)
        |-klogd(87)
        |-kswapd(3)
        |-login(160)---bash(192)---emacs(225)
        |-lpd(121)
        |-mingetty(162)
        |-mingetty(163)
        |-mingetty(164)---bash(404)---pstree(594)
        |-sendmail(134)
        |-syslogd(78)
        `-update(166)

Additionally, all of the processes in the system are held in a doubly linked list of task_struct data structures whose root is the init process. This list allows the Linux kernel to look at every process in the system; it needs to do this to provide support for commands such as ps or kill.

Times and Timers  Throughout a process's lifetime the kernel keeps track of the CPU time that it consumes, as well as other timing information.
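The parent/child/sibling links can be sketched as below. The field names here are illustrative only; the 2.x kernel uses pointers with names such as p_pptr (parent) and p_cptr (youngest child) inside task_struct.

```c
#include <stddef.h>

/* A cut-down sketch of the family links kept in each task_struct. */
struct task {
    int pid;
    struct task *parent;    /* parent process */
    struct task *child;     /* most recently created child */
    struct task *sibling;   /* next older child of the same parent */
};

/* Link a newly cloned child under its parent. */
static void add_child(struct task *parent, struct task *child)
{
    child->parent = parent;
    child->sibling = parent->child;
    parent->child = child;
}

/* How many ancestors separate a process from init?  Walking the
 * parent pointers like this is essentially what pstree displays. */
static int depth_from_init(const struct task *t)
{
    int depth = 0;
    while (t->parent != NULL) {
        t = t->parent;
        depth++;
    }
    return depth;
}
```

With these links, init(1) sits at depth 0, bash(404) at depth 1, and the pstree(594) process it spawned at depth 2, matching the listing above.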

Each clock tick, the kernel updates the amount of time in jiffies that the current process has spent in system and in user mode. Linux also supports process-specific interval timers; a process can use system calls to set up timers that send signals to itself when they expire. These timers can be single-shot or periodic.

File system  Processes can open and close files as they wish, and the process's task_struct contains pointers to descriptors for each open file as well as pointers to two VFS inodes. Each VFS inode uniquely describes a file or directory within a file system and also provides a uniform interface to the underlying file systems. How file systems are supported under Linux is described in Chapter 9. The first inode is the root of the process (its home directory) and the second is its current, or pwd, directory. pwd is derived from the Unix command pwd: print working directory. These two VFS inodes each have a count field that is incremented as processes reference them; this is why you cannot delete a directory that some process has as its current working directory.

Virtual memory  Most processes have some virtual memory (kernel threads and kernel daemons do not), and the Linux kernel must track how that virtual memory is mapped onto the system's physical memory.

Processor Specific Context  A process can be thought of as the sum total of the system's current state. Whenever a process is running it is using the processor's registers, stacks and so on. This is the process's context and, when a process is suspended, all of that CPU-specific context must be saved in the task_struct for the process. When the scheduler restarts the process, its context is restored from here.

4.2 Identifiers

Linux, like all Unix systems, uses user and group identifiers to check access rights to the files and images in the system. All of the files in a Linux system have ownerships and permissions; those permissions describe what access the system's users have to a file or directory.
The basic permissions are read, write and execute, and they are assigned to three classes of user: the owner of the file, processes belonging to a particular group, and all of the other processes in the system. Each class of user can have different permissions; for example, a file could allow its owner to read and write it, the file's group to read it, and all other processes in the system no access at all.

Rather than granting privileges on files or directories to individual users or processes, Linux uses groups to assign them to sets of users. You might, for example, create a group for all of the users in a software project and arrange that only they can read and write the source code of the project. A process can belong to several groups (32 is the default maximum), and these are held in the groups vector in the task_struct for each process. So long as any one of the groups that a process belongs to has access rights to a file, the process has those group access rights to that file.

There are four pairs of process and group identifiers held in a process's task_struct:

uid, gid  The user identifier and group identifier of the user that the process is running on behalf of.

effective uid and gid  Some programs change the uid and gid of the executing process to their own, held as attributes in the VFS inode of the executable image. These programs are known as setuid programs, and the approach is useful because it is a way of restricting access to services, particularly those that run on behalf of someone else, for example a network daemon. The effective uid and gid come from the setuid program, while the uid and gid remain as they were. Whenever the kernel checks for privileges, it checks the effective uid and gid.

file system uid and gid  These are normally the same as the effective uid and gid and are used when checking file system access rights. They are needed for NFS-mounted file systems.
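The owner/group/other check described above can be sketched as follows. This is a simplified sketch: the real kernel also consults the supplementary groups vector and gives root special treatment, both of which are omitted here.

```c
/* Classic Unix mode bits, used by the permission check below. */
#define R_BIT 4
#define W_BIT 2
#define X_BIT 1

/* A minimal sketch of the owner/group/other permission check.
 * mode packs three octal digits (owner, group, other), e.g. 0640.
 * Note the classes are exclusive: a process matching the owner is
 * judged by the owner digit alone, even if the group digit is wider. */
static int can_access(int mode, int file_uid, int file_gid,
                      int proc_euid, int proc_egid, int want)
{
    int bits;

    if (proc_euid == file_uid)
        bits = (mode >> 6) & 7;   /* owner digit */
    else if (proc_egid == file_gid)
        bits = (mode >> 3) & 7;   /* group digit */
    else
        bits = mode & 7;          /* everyone else */

    return (bits & want) == want;
}
```

For the example in the text, mode 0640 lets the owner read and write, the group read only, and everyone else nothing; note also that the kernel checks the *effective* uid and gid, as described above.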

In this case the user-mode NFS server needs to access files as if it were a particular process, so only the file system uid and gid are changed (not the effective uid and gid). This prevents malicious users from being able to send kill signals to the NFS server: kill signals are delivered to processes with a particular effective uid and gid.

saved uid and gid  These are mandated by the POSIX standard and are used by programs that change the process's uid and gid via system calls. They hold the real uid and gid from the time that the original uid and gid were changed.

4.3 Scheduling

All processes run partially in user mode and partially in system mode. How the underlying hardware supports these modes differs, but generally there is a secure mechanism for getting from user mode into system mode and back again. User mode has far fewer privileges than system mode. Each time a process makes a system call it switches from user mode to system mode and continues executing; at this point the kernel is executing on behalf of the process. In Linux, processes cannot stop the currently running process in order to run themselves. Instead, each process gives up the CPU when it has to wait for some system event. For example, a process might have to wait for a character to be read from a file. This waiting happens within a system call, in system mode: the process used a library function to open and read the file, and the library function in turn made a system call to read bytes from the open file. In this case the waiting process is suspended and another, more deserving process is chosen to run.

Processes are always making system calls and so often need to wait. Even so, a process might run until it had to wait and still use a disproportionate amount of CPU time, so Linux uses pre-emptive scheduling. In this scheme each process is allowed to run for a small amount of time, 200 ms; when this time, called its time slice, has elapsed, another process is selected to run, and the original process is made to wait for a little while until it can run again.
The scheduler must select the most deserving of all of the runnable processes in the system. A runnable process is one that is waiting only for a CPU. Linux uses a reasonably simple priority-based scheduling algorithm to choose between the current processes in the system. When it has chosen a new process to run, it saves the state of the current process, the processor-specific registers and other context, into the process's task_struct data structure. It then restores the state of the new process (again, this is processor-specific) and hands control of the system to that process. For the scheduler to fairly allocate CPU time between the runnable processes, it keeps information in the task_struct for each process:

See schedule() in kernel/sched.c

policy  The scheduling policy applied to this process. Linux has two types of process: normal and real-time. Real-time processes have a higher priority than all of the other processes; if there is a real-time process ready to run, it will always run first. Real-time processes may use one of two policies: round robin and first in, first out (FIFO). Under round-robin scheduling, each runnable real-time process runs in turn; under FIFO scheduling each runnable process runs in the order that it appears in the run queue, and that order is never changed.

priority  The priority that the scheduler gives to this process. It is also the amount of time (in jiffies) that the process is allowed to run for once it has been given the CPU. You can alter the priority of a process via system calls or the renice command.

rt_priority  Linux supports real-time processes, which have a higher priority than all of the other, non-real-time processes in the system. This field lets the scheduler give each real-time process a relative priority. A real-time process's priority can be altered using system calls.

counter  The amount of time (in jiffies) that this process is allowed to run for.

counter is set to priority when the process is first scheduled and is decremented each clock tick.

The scheduler is run from several places within the kernel. It is run after the current process has been put onto a wait queue, and it may also be run at the end of a system call, just before the process is returned to user mode from system mode. Another reason for the scheduler to run is that the system timer has just set the current process's counter to zero. Each time the scheduler runs, it does the following:

See schedule() in kernel/sched.c

kernel work  The scheduler runs the bottom half handlers and processes the scheduler task queue. These lightweight kernel threads are described in Chapter 11.

current process  The current process must be dealt with before another process is selected. If the current process's scheduling policy is round robin, it is put onto the back of the run queue. If the process is interruptible and it has received a signal since it was last scheduled, its state becomes Running. If the current process has timed out, its state becomes Running. If the current process's state is Running, it remains in that state. Processes that are neither Running nor Interruptible are removed from the run queue, which means that they will not be considered when the scheduler looks for the most deserving process to run.

process selection  The scheduler looks through the processes on the run queue for the most deserving one. If there are any real-time processes (those with a real-time scheduling policy), they are weighted more heavily than ordinary processes. The weight of an ordinary process is its counter, but for a real-time process it is counter plus 1000. This means that if there are any runnable real-time processes in the system, they will always run before any ordinary runnable process. The current process, which has consumed some of its time slice (its counter has been decremented), is at a slight disadvantage if there are other processes with equal priority in the system; that is as it should be. If several processes have the same priority, the one nearest the front of the run queue is chosen; the current process is put onto the back of the run queue.
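The selection step can be sketched as a scan over the run queue. This is an illustrative sketch: the kernel's real goodness() calculation in kernel/sched.c includes further terms (for example an SMP bonus for the last processor used), but the counter-plus-1000 weighting and the tie-breaking rule are as described above.

```c
#include <stddef.h>

/* A run-queue entry for the selection loop below. */
struct rq_proc {
    int counter;            /* remaining time slice, in ticks */
    int is_realtime;
    struct rq_proc *next;
};

/* Weight of a process: its counter, or counter + 1000 for a
 * real-time process, so real-time always beats ordinary. */
static int rq_weight(const struct rq_proc *p)
{
    return p->is_realtime ? p->counter + 1000 : p->counter;
}

/* Walk the run queue and return the most deserving process.  On a
 * tie the process nearest the front of the queue wins, because only
 * a strictly greater weight displaces the current best. */
static struct rq_proc *pick_next(struct rq_proc *head)
{
    struct rq_proc *best = NULL;
    int best_w = -1;

    for (struct rq_proc *p = head; p != NULL; p = p->next) {
        int w = rq_weight(p);
        if (w > best_w) {
            best = p;
            best_w = w;
        }
    }
    return best;
}
```

Because the current process's counter has already been decremented, it naturally loses such ties to an equal-priority competitor, which is the slight disadvantage the text describes.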
In a balanced system with a large number of processes of the same priority, the processes run in order; this is known as round-robin scheduling. However, because processes must wait for resources, their running order tends to get shuffled around.

swap processes  If the most deserving process is not the current process, the current process must be suspended so that the new one can run. When a process is running, it is using the registers and physical memory of the CPU and of the system. Each time it calls a routine it passes its arguments in registers or on the stack, and it may save values such as the address to return to after the call. So when the scheduler runs, it runs in the context of the current process: it may be in a privileged mode, kernel mode, but it is still the currently running process. When that process comes to be suspended, all of its machine state, including the program counter (PC) and all of the processor's registers, must be saved into the process's task_struct data structure. Then all of the machine state of the new process must be loaded. This operation is system-dependent; no two CPUs do it in quite the same way, but there is usually some hardware assistance.

This swapping of process context takes place at the end of the scheduler. The context saved for the previous process is therefore a snapshot of the system's hardware context as it was for that process at the end of the scheduler. Equally, when the context of the new process is loaded, it too is a snapshot taken at the end of the scheduler, including the contents of that process's program counter and registers.

If the previous process or the new current process uses virtual memory, the system's page tables may need to be updated. Again, this action is architecture-specific. Processors such as the Alpha AXP, which use Translation Look-aside Buffers or cached page table entries, must flush those cached entries that belonged to the previous process.

4.3.1 Scheduling in Multiprocessor Systems

Systems with multiple CPUs are relatively rare in the Linux world, but a lot of work has gone into making Linux an SMP (Symmetric Multi-Processing) operating system, that is, one capable of evenly balancing work between the CPUs in the system. Nowhere is that load balancing more apparent than in the scheduler.

In a multiprocessor system, the hoped-for situation is that all of the processors are busily running processes. Each processor runs the scheduler independently, as its current process either exhausts its time slice or has to wait for a system resource. The first thing to notice about an SMP system is that there is potentially more than one idle process in the system. In a single-processor system the idle process is the first task in the task vector; in an SMP system there is one idle process per CPU, and you could have more than one idle CPU. Additionally, each CPU has a current process, so SMP systems must keep track of the current and idle processes for each processor.

In an SMP system each process's task_struct contains the number of the processor that it is currently running on (processor) and the number of the processor that it last ran on (last_processor). There is no reason why a process should not run on a different CPU each time it is selected to run, but Linux can restrict a process to one or more CPUs in the system using the processor_mask: if bit N is set, the process can run on processor N. When the scheduler is choosing a process to run, it will not consider a process that does not have the appropriate bit set in its processor_mask. The scheduler also gives a slight advantage to a process that last ran on the current processor, because there is often a performance cost in moving a process to a different processor.

4.4 Files

Figure 4.1 shows the two data structures that describe file-system-specific information for each process in the system. The first, fs_struct, contains pointers to this process's VFS inodes and its umask. The umask is the default mode in which new files are created, and it can be changed via system calls.
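The umask behaviour just described is a simple bit mask over the requested mode: any permission bit set in the umask is stripped from a newly created file's mode. A minimal sketch:

```c
/* How the umask masks permission bits when a new file is created:
 * bits set in the umask are removed from the requested mode.
 * E.g. a requested mode of 0666 with a umask of 022 yields 0644. */
static int apply_umask(int mode, int umask_bits)
{
    return mode & ~umask_bits;
}
```

So with the common umask of 022, files created with mode 0666 come out group- and world-read-only (0644), which is why freshly created files are usually not world-writable.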
See include/linux/sched.h

The second data structure, files_struct, contains information about all of the files that this process is currently using. Programs read from standard input, write to standard output, and write error messages to standard error. These can be files, terminal input/output or a real device, but so far as the program is concerned they are all treated as files. Every file has its own descriptor, and the files_struct contains pointers to up to 256 file data structures, each one describing a file being used by this process. The f_mode field describes the mode in which the file was opened: read-only, read-write or write-only. f_pos holds the position in the file where the next read or write operation will occur. f_inode points to the VFS inode describing the file, and f_ops is a pointer to a vector of routine addresses, one for each of the functions that can operate on the file, for example the function that writes data. This abstract interface is very powerful and allows Linux to support a wide variety of file types; as we shall see, pipes in Linux are implemented using this same mechanism.

Every time a file is opened, one of the free file pointers in the files_struct is used to point to the new file structure. Linux processes expect three file descriptors to be open when they start: standard input, standard output and standard error, which they inherit from the parent process that created them. All accesses to files are by way of standard system calls which pass or return file descriptors.
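The descriptor table can be sketched like this. The field names echo the text (f_mode, f_pos), but this is a cut-down sketch, not the kernel's real structures:

```c
#include <stddef.h>

#define NR_OPEN 256   /* per-process descriptor limit described above */

/* Cut-down open-file state, as described in the text. */
struct file {
    int  f_mode;          /* read-only, write-only or read-write */
    long f_pos;           /* offset of the next read or write */
};

/* Cut-down files_struct: a vector of pointers to open files. */
struct files_struct {
    struct file *fd[NR_OPEN];
};

/* Opening a file takes the lowest free slot, which is why the first
 * three opens of a fresh process yield descriptors 0, 1 and 2. */
static int alloc_fd(struct files_struct *fs, struct file *f)
{
    for (int i = 0; i < NR_OPEN; i++) {
        if (fs->fd[i] == NULL) {
            fs->fd[i] = f;
            return i;
        }
    }
    return -1;   /* descriptor table full */
}
```

Closing a file frees its slot, and the next open reuses the lowest free index; that lowest-free-slot rule is why shell redirections that close and reopen descriptor 1 land back on standard output.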

These descriptors are indexes into the process's fd vector, so standard input, standard output and standard error have file descriptors 0, 1 and 2 respectively. Each access to a file uses the file operation routines in the file data structure together with its VFS inode.

4.5 Virtual Memory

A process's virtual memory contains executable code and data from several sources. First there is the program image that was loaded, for example the ls command; this command, like all executable images, is composed of both executable code and data. The image file contains all of the information needed to load the executable code and associated program data into the process's virtual memory. Second, processes can allocate (virtual) memory to use during their processing, for example to hold the contents of files that they are reading. This newly allocated virtual memory needs to be linked into the process's existing virtual memory before it can be used. Third, Linux processes use libraries of commonly useful code, for example file-handling routines. It makes no sense for each process to have its own copy of a library, so Linux uses shared libraries that can be used by several running processes at the same time. The code and data from these shared libraries must be linked into this process's virtual address space and also into the virtual address spaces of the other processes sharing the library.

At any given time a process will not be using all of the code and data contained within its virtual memory. It could contain code that is only used in certain situations, such as during initialization or to handle a particular event. It may only be using some of the routines from its shared libraries. It would be wasteful to load all of this code and data into physical memory where it lies unused; multiply this wastage by the number of processes in the system and the system would run very inefficiently. Instead, Linux uses a technique called demand paging: the virtual memory of a process is brought into physical memory only when a process attempts to use it.
So, instead of loading the code and data into physical memory straight away, the Linux kernel modifies the process's page tables, marking the virtual areas as existing but not in memory. When the process attempts to access the code or data, the system hardware generates a page fault and hands control to the Linux kernel to handle it. Therefore, for every area of virtual memory in the process's address space, Linux needs to know where that virtual memory comes from and how to get it into memory so that it can fix up these page faults.

The Linux kernel needs to manage all of these areas of virtual memory, and the contents of each process's virtual memory are described by an mm_struct data structure pointed at from its task_struct. The process's mm_struct data structure also contains information about the loaded executable image and a pointer to the process's page tables. It contains a pointer to a list of vm_area_struct data structures, each representing an area of virtual memory within this process. This linked list is in ascending virtual memory order. Figure 4.2 shows the layout in virtual memory of a simple process together with the kernel data structures managing it. As those areas of virtual memory come from several sources, Linux abstracts the interface by having each vm_area_struct point to a set of virtual memory handling routines (via vm_ops). In this way all of the process's virtual memory can be handled in a consistent way, no matter how the underlying services managing that memory differ. For example, there is a generic routine that will be called when the process attempts to access memory that does not exist; this is how page faults are handled.

The process's set of vm_area_struct data structures is accessed repeatedly by the Linux kernel as it creates new areas of virtual memory for the process and as it fixes up references to virtual memory not in the system's physical memory. This makes the time that it takes to find the correct vm_area_struct critical to the performance of the system.
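The basic lookup over the sorted list can be sketched like this (an illustrative sketch of the linear search; the real kernel's find_vma() has different, area-after-address semantics and, as described next, can use a tree to speed the search up):

```c
#include <stddef.h>

/* A sketch of a per-process virtual memory area, kept on a list
 * sorted by ascending start address. */
struct vm_area {
    unsigned long vm_start;   /* first address in the area */
    unsigned long vm_end;     /* first address past the area */
    struct vm_area *vm_next;
};

/* Linear search for the area containing addr, as done on each page
 * fault when resolving where the faulting memory should come from. */
static struct vm_area *find_area(struct vm_area *head, unsigned long addr)
{
    for (struct vm_area *v = head; v != NULL; v = v->vm_next) {
        if (addr >= v->vm_start && addr < v->vm_end)
            return v;
        if (addr < v->vm_start)
            break;            /* list is sorted: no later area can match */
    }
    return NULL;              /* fault outside the process's address space */
}
```

A NULL result here corresponds to a fault on an address that is not part of the process's virtual address space at all, which is an error rather than a demand-paging event.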
To speed up this access, Linux also arranges the vm_area_struct data structures into an AVL (Adelson-Velskii and Landis) tree.

The tree is arranged so that each vm_area_struct (or node) has a left and a right pointer to its neighbouring vm_area_struct structures. The left pointer points to a node with a lower starting virtual address, and the right pointer points to a node with a higher starting virtual address. To find the correct node, Linux goes to the root of the tree and follows each node's left and right pointers until it finds the right vm_area_struct. Of course, nothing is for free: searching this tree is fast, but inserting a new vm_area_struct takes additional processing time.

When a process allocates virtual memory, Linux does not actually reserve physical memory for the process. Instead, it describes the virtual memory with a new vm_area_struct data structure which is linked into the process's list of virtual memory. When the process attempts to write to a virtual address within that new virtual memory region, the system will generate a page fault. The processor attempts to decode the virtual address but, as there are no page table entries for this memory, it gives up and raises a page fault exception, leaving the Linux kernel to fix things up. Linux looks to see whether the referenced virtual address is in the process's virtual address space. If it is, Linux creates the appropriate page table entries and allocates a physical page of memory for the process. The corresponding code or data may need to be brought into that physical page from the file system or from the swap disk. The process is then restarted at the instruction that caused the page fault; this time the memory physically exists, so it may continue.

4.6 Creating a Process

When the system starts up, it is running in kernel mode and there is, in a sense, only one process: the initial process. Like all processes, the initial process has a machine state represented by stacks, registers and so on. This information is saved in the initial process's task_struct data structure as other processes in the system are created and run.
At the end of system initialization, the initial process starts up a kernel thread (called init) and then sits in an idle loop doing nothing. Whenever there is nothing else to do, the scheduler will run this, the idle process. The idle process's task_struct is the only one that is not dynamically allocated; it is statically defined at kernel build time and is, somewhat confusingly, called init_task.

The init kernel thread or process has a process identifier of 1, as it is the system's first real process. It does some initial setting up of the system (such as opening the system console and mounting the root file system) and then executes the system initialization program. This is one of /etc/init, /bin/init or /sbin/init, depending on your system. The init program uses /etc/inittab as a script file to create new processes within the system. These new processes may themselves go on to create new processes; for example, the getty process may create a login process when a user attempts to log in. All of the processes in the system are descended from the init kernel thread.

New processes are created by cloning old processes, or rather by cloning the current process. A new task is created by a system call (fork or clone), and the cloning happens within the kernel in kernel state. At the end of the system call there is a new process, waiting to run once the scheduler chooses it. One or more physical pages are allocated from the system's physical memory for the cloned process's stacks (user and kernel) and for the new task_struct data structure. A process identifier is created for the new process, one that is unique within the set of process identifiers in the system. However, it is also possible for the cloned process to keep its parent's process identifier.
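From user space, the fork() call just described behaves as below. This is a minimal POSIX sketch, not kernel code: it shows the two return values of fork and, indirectly, copy-on-write, since the child's write to its copy of `value` leaves the parent's copy untouched.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child, let it modify its copy of a variable and exit with
 * that value; verify the parent's copy was not disturbed. */
static int fork_demo(void)
{
    int value = 1;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                 /* fork failed */

    if (pid == 0) {
        /* Child: this write triggers a copy-on-write page copy, so
         * the parent's `value` is untouched. */
        value = 99;
        _exit(value);
    }

    /* Parent: wait for the child and collect its exit status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (value != 1)
        return -1;                 /* parent's copy must be unchanged */
    return WEXITSTATUS(status);    /* the value set in the child */
}
```

In the parent, fork() returns the child's new process identifier; in the child it returns 0, which is how the two copies of the same program tell themselves apart.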

The new task_struct is entered into the task vector, and the contents of the old (current) process's task_struct are copied into the cloned task_struct.

See do_fork() in kernel/fork.c

When cloning processes, Linux allows the two processes to share resources rather than have two separate copies. This applies to the process's files, signal handlers and virtual memory. When the resources are to be shared, their respective count fields are incremented so that Linux will not deallocate these resources until both processes have finished using them. So, for example, if the cloned process is to share virtual memory, its task_struct will contain a pointer to the mm_struct of the original process, and that mm_struct has its count field incremented to show the number of current processes sharing it.

Cloning a process's virtual memory is rather tricky. A new set of vm_area_struct data structures must be generated, together with their owning mm_struct data structure and the cloned process's page tables. None of the process's virtual memory is copied at this point. That would be a difficult and lengthy task, for some of that virtual memory would be in physical memory and some could be in the swap file. Instead, Linux uses a technique called "copy on write", which means that virtual memory will only be copied when one of the two processes tries to write to it. Any virtual memory that is not written to, even if it can be written to, may be shared between the two processes without harm. Read-only memory, for example executable code, can always be shared. For copy-on-write to work, the writable areas have their page table entries marked as read-only and the vm_area_struct data structures describing them are marked as "copy on write". When one of the processes attempts to write to this virtual memory, a page fault will occur. It is at this point that Linux makes a copy of the memory and fixes up the two processes' page tables and virtual memory data structures.

Times and Timers

The kernel keeps track of a process's CPU time as well as other timing information. Each clock tick, the kernel updates the amount of time in jiffies that the current process has spent in system and in user mode.
In addition to these accounting timers, Linux supports process-specific interval timers. A process can use these timers to send itself various signals each time they expire. Three sorts of interval timer are supported:

See kernel/itimer.c

Real  The timer ticks in real time and, when the timer has expired, the process is sent a SIGALRM signal.

Virtual  This timer only ticks while the process is running; when it expires, the process is sent a SIGVTALRM signal.

Profile  This timer ticks both when the process is running and when the system is executing on behalf of the process. SIGPROF is signalled when it expires.

One or all of the interval timers may be running, and Linux keeps all of the necessary information in the process's task_struct data structure. System calls can be made to set up these interval timers, to start them, to stop them and to read their current values. The virtual and profile timers are handled the same way: every clock tick the current process's interval timers are decremented and, if they have expired, the appropriate signal is sent.

See do_it_virt() and do_it_prof() in kernel/sched.c

Real-time interval timers are a little different. Linux uses the timer mechanism described in Chapter 11: every process has its own timer_list data structure and, when the real interval timer is running, this is queued on the system timer list. When the timer expires, the timer bottom half handler removes it from the queue and calls the interval timer handler.
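The system calls mentioned above are setitimer() and getitimer(). A minimal sketch of arming a one-shot real-time timer, reading back its remaining time and disarming it before the SIGALRM arrives:

```c
#include <sys/time.h>

/* Arm ITIMER_REAL for one second, confirm it reads back as pending,
 * then disarm it.  A process that left it armed would receive a
 * SIGALRM signal when it expired. */
static int real_timer_armed(void)
{
    /* struct itimerval holds it_interval (the repeat period; zero
     * here, so the timer is one-shot) then it_value (time to expiry). */
    struct itimerval arm = { {0, 0}, {1, 0} };
    struct itimerval off = { {0, 0}, {0, 0} };
    struct itimerval back;

    if (setitimer(ITIMER_REAL, &arm, 0) < 0)
        return 0;
    if (getitimer(ITIMER_REAL, &back) < 0)
        return 0;

    setitimer(ITIMER_REAL, &off, 0);   /* disarm before SIGALRM arrives */

    /* A pending one-shot timer reads back with non-zero time left. */
    return back.it_value.tv_sec > 0 || back.it_value.tv_usec > 0;
}
```

Setting a non-zero it_interval instead would make the timer periodic: each expiry delivers SIGALRM and the timer is restarted automatically.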

When the real-time interval timer expires, its handler generates a SIGALRM signal and restarts the interval timer, adding it back into the system timer queue. See it_real_fn() in kernel/itimer.c.

Executing Programs. In Linux, as in Unix, programs and commands are normally executed by a command interpreter. The command interpreter is a user process like any other, and is called a shell (imagine a nut: the kernel is the edible part in the middle, while the shell surrounds it and provides an interface). There are many shells in Linux; among the most commonly used are sh, bash, and tcsh. Apart from a few built-in commands, such as cd and pwd, a command is an executable binary file. For each command entered, the shell searches the directories in the current process's search path (held in the PATH environment variable) for a file with a matching name. If such a file is found, it is loaded and executed. The shell clones itself using the fork mechanism described above and then, in the child process, the binary image that it was executing (the shell itself) is replaced with the executable that was found. Normally the shell waits for the command to complete, that is, for the child process to exit. You can push the child process into the background by typing control-Z, which sends a SIGSTOP signal to the child process, stopping it and returning control to the shell; the shell command bg then sends the child a SIGCONT signal, placing it in the background and restarting it, where it will run until it either ends or needs input or output from the terminal. An executable file can have one of several formats, or it may even be a script file. Script files have to be recognized and run by a suitable interpreter; /bin/sh, for example, interprets shell scripts. Executable object files contain executable code and data, together with enough information for the operating system to load them into memory and execute them. The most commonly used object file format in Linux is ELF but, in theory, Linux is flexible enough to handle almost any object file format.
As with file systems, the binary formats that Linux supports can either be built into the kernel when it is compiled or be loaded as modules. The kernel keeps a list of the supported binary formats (see Figure 4.3) and, when an attempt is made to execute a file, each binary format is tried in turn until one works. Commonly supported Linux binary formats are a.out and ELF. The executable file does not have to be read completely into memory; instead a technique known as demand loading is used: as parts of the executable image are used by the process they are brought into memory, and unused parts of the image may be discarded from memory. See do_execve() in fs/exec.c.

ELF. The ELF (Executable and Linkable Format) object file format was designed by the Unix System Laboratories and is now firmly established as the most commonly used format in Linux. While there is a slight performance overhead compared with other object file formats such as ECOFF and a.out, ELF is felt to be more flexible. ELF executable files contain executable code (sometimes referred to as text) and data. Tables within the executable image describe how the program should be placed into the process's virtual memory. Statically linked images are built by the linker (ld), or link editor, into a single image containing all of the code and data needed to run the image. The image also specifies the layout in memory of the image and the address of the first code to execute.

Figure 4.4 shows the layout of a statically linked ELF executable image. It is a simple C program that prints "hello world" and then exits. The header describes it as an ELF image with two physical headers (e_phnum is 2), located from the start of the image file (e_phoff). The first physical header describes the executable code in the image: it goes at virtual address 0x8048000 and there are 65532 bytes of it. This is because it is a statically linked image which contains all of the library code needed to print "hello world". The entry point of the image, that is, the first instruction of the program, is not at the front of the image but at virtual address 0x8048090 (e_entry). The code starts immediately after the second physical header. That physical header describes the data for the program, which is to be loaded into virtual memory at address 0x8059BB8. This data can be both read and written. You will notice that the size of the data in the file is 2200 bytes (p_filesz), whereas its size in memory is 4248 bytes. The first 2200 bytes contain pre-initialized data, and the next 2048 bytes contain data that will be initialized by the executing code. See include/linux/elf.h.

When Linux loads an ELF executable image into the process's virtual address space, it does not actually load the image. It sets up the virtual memory data structures, that is, the process's vm_area_struct structures and its page tables. As the program is executed, page faults cause the program's code and data to be fetched into physical memory; unused portions of the program are never loaded into memory. Once the ELF binary format loader is satisfied that the image is a valid ELF executable image, it flushes the process's current executable image from its virtual memory. As this process is a cloned image (all processes are), the old image is that of the program the parent process was executing, for example a command interpreter shell such as bash.
Flushing the old executable image discards the old virtual memory data structures and resets the process's page tables. It also clears any signal handlers that were set up and closes any open files. At the end of the flush, the process is ready to run the new executable image. Whatever the format of the executable image, the same information gets set up in the process's mm_struct, including pointers to the start of the image's code and data. These values are read from the physical headers of the ELF executable image as they are mapped into the process's virtual address space. This also happens as the process's vm_area_struct data structures are set up and its page tables are modified. The mm_struct data structure also contains pointers to the parameters to be passed to the program and to the process's environment variables.

ELF Shared Libraries. A dynamically linked image, on the other hand, does not contain all of the code and data required to run. Some of it is held in shared libraries that are linked into the image at run time. The ELF shared library's tables are also used by the dynamic linker when the shared library is linked into the image at run time. Linux uses several dynamic linkers — ld.so.1, libc.so.1, and ld-linux.so.1 — all to be found in the /lib directory. The libraries contain commonly used code, such as language subroutines. Without dynamic linking, all programs would need their own private copy of these libraries, requiring much more disk space and virtual memory. With dynamic linking, tables in the ELF image contain information about every library routine referenced; this information tells the dynamic linker how to locate each library routine and how to link it into the program's address space.

Script Files. A script file is an executable that needs an interpreter in order to run.

Please cite the original source when reposting: https://www.9cbs.com/read-51952.html
