Linux Kernel (2)
Www.ibmtc.pku.edu.cn/crs/kernel/kernel.htm
(This article may be used for academic purposes.)
Chapter 11
Processes and Inter-Process Communication Mechanisms
A program is a file stored on disk that contains machine instructions and data; a process can be viewed as a program in execution. A program is static, whereas a process is dynamic. A process comprises not only the program's instructions and data but also the current instruction pointer, all CPU registers, and the stack used to hold temporary data, all of which change as the program's instructions execute.
While it runs, a process uses many system resources, such as the CPU, memory, and files. Linux is a multitasking operating system, and several processes may want to use the same resource at once, so the operating system must keep track of every process and the system resources it holds in order to manage processes and resources.
A multitasking operating system tries to keep the CPU busy at all times. If a running process has to wait for an external device to finish some work (for example, waiting for a printer to complete a print job), the operating system can choose another process to run, keeping CPU utilization as high as possible. This is the basic idea of multitasking; switching between processes is carried out by the scheduler.
Each process in Linux has its own virtual address space; one of the most basic management goals of the operating system is to keep processes from interfering with one another. Sometimes, however, users want two or more processes to cooperate on the same task. To this end, Linux provides a number of mechanisms that let processes communicate and complete a task together; these mechanisms are called "inter-process communication" (IPC). Signals and pipes are the two most common IPC mechanisms, but Linux provides others as well.
This chapter describes how Linux manages and schedules processes, and the inter-process communication mechanisms that Linux supports.
11.1 Linux Processes and Threads
The Linux kernel represents each process with a task_struct data structure, and the pointers to these structures form the task array (in Linux, "task" and "process" are interchangeable terms), which is sometimes called the task vector. The default size of this array is 512, which means that at most 512 processes can exist in a Linux system at the same time. When a new process is created, Linux allocates a task_struct for it and saves the pointer in the task array. The task_struct structure contains many fields, which can be grouped by function into the following categories:
Identifiers. The system identifies a process by its process identifier, but this identifier is not the index of the process's task_struct pointer in the task array. A process also has user and group identifiers, which the system uses to decide whether the process may access a file or device.
State information. A Linux process can be in one of several states: running, waiting, stopped, or zombie.
Scheduling information. The scheduler uses this information to decide which process to switch to.
Inter-process communication information. The system uses this information to implement communication between processes.
Process links. In a Linux system, every process except the initial process has a parent. Each process is "cloned" from its parent. The links in a task_struct include pointers to the process's parent, its siblings, and its children. In addition, Linux records all processes in the system in a doubly linked list whose root is the init process. With this list, the kernel can easily find any process.
Times and timers. These fields record the creation time of the process and the CPU time it has consumed over its lifetime, both measured in jiffies. The CPU time has two parts: time spent in user mode and time spent in system mode. Linux also supports per-process timers: an application can set up a timer through a system call, and the operating system sends a SIGALRM signal to the process when the timer expires.
File system information. A process can open files in the file system, and the system needs to keep track of them. These fields record the process's open file descriptors. They also include pointers to two VFS inodes (index nodes): the root directory of the process and its current directory. Each inode has a reference count that is incremented whenever a new process points to it; an inode that nothing references has a count of 0. This is why a directory cannot be deleted while a process is still using it: its reference count is greater than 0.
Process context. As mentioned earlier, a process can be viewed as a collection of system state that changes as the process runs. The process context is the set of task_struct fields that save this state. When the scheduler switches a process from the running state to a suspended state, it saves the process's current execution environment, including the CPU register values and stack information, in the context; when the scheduler later selects the process to run again, the execution environment is restored from the saved context.
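The relationship between the task array and the process identifier can be sketched in a few lines of C. This is only an illustration: struct toy_task, toy_task_table, and the two functions are hypothetical names, and the structure keeps just a handful of the field categories listed above rather than the real task_struct layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_NR_TASKS 512          /* array size quoted in the text */

/* Hypothetical, heavily simplified stand-in for task_struct. */
struct toy_task {
    int pid;                      /* identifier, NOT the array index */
    int state;                    /* running, waiting, stopped, zombie */
    long counter;                 /* scheduling information */
    struct toy_task *parent;      /* process links */
    long utime, stime;            /* user and system time, in jiffies */
};

struct toy_task *toy_task_table[TOY_NR_TASKS];
int toy_next_pid = 1;

/* Allocate a task, store its pointer in the first free slot, and
 * return the slot index, or -1 if the table is full or calloc fails. */
int toy_task_create(struct toy_task *parent)
{
    for (int slot = 0; slot < TOY_NR_TASKS; slot++) {
        if (toy_task_table[slot] == NULL) {
            struct toy_task *t = calloc(1, sizeof *t);
            if (t == NULL)
                return -1;
            t->pid = toy_next_pid++;
            t->parent = parent;
            toy_task_table[slot] = t;
            return slot;
        }
    }
    return -1;
}

/* Free a slot so that a later task can reuse it. */
void toy_task_release(int slot)
{
    assert(slot >= 0 && slot < TOY_NR_TASKS);
    free(toy_task_table[slot]);
    toy_task_table[slot] = NULL;
}
```

Because slots are reused while identifiers keep increasing, a task created after an earlier one has been released can sit in slot 0 with an identifier of 3, which is why the identifier cannot serve as the array index.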
11.1.1 Identifier Information
Like all UNIX systems, Linux uses user and group identifiers to determine a user's access rights to files and directories. Every file and directory in a Linux system has an owner and permission attributes, and Linux uses them to decide whether a given user may access the file. For each process, the system records the four pairs of identifiers shown in Table 11-1 in the task_struct structure.
Table 11-1 Identifier information of a process
UID and GID
The user and group identifiers of the user on whose behalf the process runs, usually the user who started it.
Effective UID and GID
Some programs change the UID and GID of the executing process to the program's own UID and GID. When the system runs such a program, it judges the program's privileges, for example whether it may perform I/O directly, according to the changed UID and GID. The effective UID and GID of a program can be set to those of another user through the setuid system call. The program's own UID and GID are stored as attributes of the VFS inode that describes the program's image file.
File system UID and GID
These two identifiers are similar to the effective identifiers but are used to check access rights to the file system. A user-mode NFS server uses them so that it can access files as if it were a particular process.
Saved UID and GID
If a process changes its UID and GID through a system call, these two identifiers preserve the real UID and GID.
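From user space, the real and effective identifiers in Table 11-1 can be read with the standard getuid, geteuid, getgid, and getegid calls. The sketch below wraps them in two helper functions whose names are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if the effective identifiers differ from the real ones,
 * as happens while a setuid or setgid program is running. */
int runs_with_changed_ids(void)
{
    return getuid() != geteuid() || getgid() != getegid();
}

/* Print the real and effective identifiers of the calling process. */
void print_ids(void)
{
    printf("uid=%ld euid=%ld gid=%ld egid=%ld\n",
           (long)getuid(), (long)geteuid(),
           (long)getgid(), (long)getegid());
}
```

For an ordinary program started from a login shell the two pairs are equal, so runs_with_changed_ids() returns 0; it returns 1 only when the program file carries the setuid or setgid attribute and belongs to a different user or group.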
11.1.2 Process Status Information
As mentioned earlier, a Linux process can be in one of four states, shown in Table 11-2.
Table 11-2 Process states
Running
The process is either the one currently running or one that is ready to run, that is, waiting for the scheduler to assign the CPU to it.
Waiting
The process is waiting for an event or a resource. Waiting processes are of two kinds: interruptible and uninterruptible. An interruptible waiting process can be interrupted by a signal; an uninterruptible waiting process is waiting directly on a hardware condition and cannot be interrupted under any circumstances.
Stopped
The process has been stopped, usually because it received a signal; a process that is being debugged, for example, can be in the stopped state.
Zombie
The process has terminated, but its task_struct structure still remains in the task array. As the name suggests, a process in this state is effectively dead.
11.1.3 File Information
As shown in Figure 11-1, each process has two data structures that describe its file-related information. The first, fs_struct, holds the two VFS inode pointers mentioned above, root and pwd. It also contains a umask field, the default mode used when the process creates files, which can be changed through a system call. The second, files_struct, describes all the files the process currently has in use. As the figure shows, each process can have up to 256 files open at the same time; fd[0] through fd[255] are pointers to the corresponding file structures. A file descriptor is simply an index into this fd pointer array.
In the file structure, f_mode is the mode in which the file was opened: read-only, write-only, or read-write; f_pos is the current position in the file; f_inode points to the file's inode in the VFS; and f_op points to the set of operation routines for the file. Through f_op, different operation functions can be defined for different kinds of files, for example the function that writes data to the file. Linux uses this abstraction to implement pipes, an inter-process communication mechanism described in detail later. This kind of abstraction is very common in the Linux kernel; it gives specific kernel objects a polymorphism similar to that of C++ objects.
When a Linux process starts, three file descriptors are already open: standard input, standard output, and standard error, which correspond to indexes 0, 1, and 2 of the fd array. If input/output redirection is in effect, these descriptors refer to the specified files rather than to the terminal. Whenever the process opens a file, a free file pointer in files_struct is made to point to the new file structure. All access to the file then goes through the file operation routines defined in the file structure and the VFS inode information.
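That a file descriptor is only an index into a per-process table can be observed from user space: POSIX requires open to return the lowest-numbered free descriptor, so a slot that has just been closed is handed out again. The following sketch relies on /dev/null being present; the function name is chosen for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Open /dev/null twice, close the first descriptor, and open again.
 * Returns 1 if the slot that was just freed is handed out again,
 * 0 if it is not, and -1 if a call to open failed. */
int descriptor_slot_is_reused(void)
{
    int first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;
    int second = open("/dev/null", O_RDONLY);
    if (second < 0) {
        close(first);
        return -1;
    }

    close(first);                 /* slot `first` is free again */
    int third = open("/dev/null", O_RDONLY);
    if (third < 0) {
        close(second);
        return -1;
    }
    int reused = (third == first);

    close(second);
    close(third);
    return reused;
}
```

Shell redirection depends on the same rule: closing descriptor 1 and then opening a file makes that file the new standard output.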
11.1.4 Virtual Memory
As we saw in the previous chapter, a process's virtual memory contains all of its executable code and data. When a program is run, the system allocates virtual memory for the process's code and data according to the information in the executable image. The process may also allocate memory dynamically, or free allocated memory, through system calls, and newly allocated virtual memory must be linked into the process's existing virtual address space. A Linux process can use code or data from shared libraries, which likewise have to be linked into the process's existing virtual address space. The previous chapter also showed that the system uses demand paging to avoid wasting physical memory. Because a process may access virtual memory that is not currently in physical memory, the operating system loads the memory page in response to the page fault raised by the processor. To do so, the system has to modify the process's page table to mark the virtual page as present in physical memory, and Linux therefore needs to know where every virtual address area in the process's address space comes from so that it can be loaded into physical memory.
Figure 11-1 File information of the process
For these reasons, Linux uses fairly complex data structures to track a process's virtual addresses. The task_struct of a process contains a pointer to an mm_struct structure. The mm_struct holds information about the loaded executable image and a pointer to the process's page table. It also contains pointers to a list of vm_area_struct structures, each of which represents one virtual address area of the process.
Figure 11-2 shows a simplified layout of a process's virtual memory together with the corresponding data structures. As the figure shows, the system keeps the vm_area_struct structures in ascending order of virtual memory address. Each virtual memory area may come from a different source: some from the executable image, some from shared libraries, and some from dynamically allocated memory. Linux therefore uses a set of virtual memory operation routines (vm_ops) to abstract the handling of virtual memory from different sources.
Figure 11-2 Virtual memory of a process
While a process runs, Linux frequently has to allocate virtual address areas for it or bring virtual memory contents back in from the swap file, so the time taken to look up vm_area_struct structures is critical to performance. For this reason, in addition to the linked list, Linux also organizes the vm_area_struct structures in an AVL (Adelson-Velskii and Landis) tree. With this tree, Linux can locate the area containing a virtual address quickly, although inserting or deleting a node in the tree takes more time.
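The lookup that this structure has to support can be sketched with the linked list alone. The code below is a toy model, not kernel code: struct toy_vma and toy_find_vma are hypothetical names, and the list is assumed to be sorted by ascending start address. The AVL tree answers the same question in logarithmic rather than linear time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for vm_area_struct. */
struct toy_vma {
    unsigned long start;          /* first address of the area */
    unsigned long end;            /* first address past the area */
    struct toy_vma *next;         /* next area, at a higher address */
};

/* Return the area containing addr, or NULL if addr is unmapped.
 * The list must be sorted by ascending start address. */
struct toy_vma *toy_find_vma(struct toy_vma *head, unsigned long addr)
{
    for (struct toy_vma *v = head; v != NULL; v = v->next) {
        assert(v->start < v->end);
        if (addr < v->start)
            return NULL;          /* sorted: later areas start even higher */
        if (addr < v->end)
            return v;
    }
    return NULL;
}
```

With a text area covering 0x1000 to 0x2000 followed by a data area covering 0x3000 to 0x5000, an address such as 0x1800 resolves to the text area, while 0x2000 falls into the gap between the two and yields NULL, which is the case in which a real kernel would treat the access as an error.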
When a process allocates memory dynamically through a system call, Linux first allocates a vm_area_struct structure and links it into the process's virtual memory list. Linux has not yet allocated physical memory for the area, so when a later instruction accesses it, the processor raises a page fault while translating the virtual address to a physical address (see Chapter 10). In handling this page fault, Linux allocates actual physical memory for the new virtual memory area.
11.1.5 Time and Timer
Linux keeps a pointer, current, to the task_struct of the process that is currently running. On every clock interrupt (also called a clock tick), Linux updates the time information of the process that current points to: if the kernel is executing on behalf of the process at that moment (for example, during a system call), the tick is charged to the time spent in system mode; otherwise it is charged to the time spent in user mode.
In addition to accounting for the CPU time a process consumes, Linux supports interval timers associated with a process. When a timer expires, a signal is sent to the process that owns it. A process can use three different kinds of timers to send itself signals, as shown in Table 11-3.
Table 11-3 The three kinds of process timers
Real
This timer runs in real time and sends a SIGALRM signal when it expires.
Virtual
This timer is updated only while the process is running, and sends a SIGVTALRM signal when it expires.
Profile
This timer is updated both while the process is running and while the kernel is running on behalf of the process, and sends a SIGPROF signal when it expires.
Linux handles the Virtual and Profile timers in the same way: on each clock interrupt, the timer's count is decremented, and when it reaches 0 the signal is sent.
The handling of the Real timer is described in Chapter 15.
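From a program's point of view, the three timers in Table 11-3 are selected with the ITIMER_REAL, ITIMER_VIRTUAL, and ITIMER_PROF arguments of the setitimer call. The following sketch arms a one-shot Real timer and sleeps until SIGALRM arrives; the helper names are invented for this example, and the signal is kept blocked until sigsuspend so that it cannot slip in before the wait begins.

```c
#define _XOPEN_SOURCE 700         /* expose the POSIX/XSI declarations */
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t alarm_seen;

static void on_alarm(int signo)
{
    (void)signo;
    alarm_seen = 1;
}

/* Arm a one-shot Real timer for msec milliseconds (1..999) and sleep
 * until SIGALRM arrives. Returns 1 on success, -1 on error. */
int wait_for_real_timer(long msec)
{
    struct sigaction sa, old_sa;
    struct itimerval tv;
    sigset_t block, old_mask, wait_mask;
    int result = 1;

    assert(msec > 0 && msec < 1000);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old_sa) < 0)
        return -1;

    /* Block SIGALRM so that it cannot arrive before we start waiting. */
    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    sigprocmask(SIG_BLOCK, &block, &old_mask);

    memset(&tv, 0, sizeof tv);    /* it_interval of 0: fire only once */
    tv.it_value.tv_usec = msec * 1000;
    alarm_seen = 0;
    if (setitimer(ITIMER_REAL, &tv, NULL) < 0) {
        result = -1;
    } else {
        wait_mask = old_mask;
        sigdelset(&wait_mask, SIGALRM);
        while (!alarm_seen)
            sigsuspend(&wait_mask);   /* atomically unblock and wait */
    }

    /* Restore the previous signal mask and handler. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGALRM, &old_sa, NULL);
    return result;
}
```

Replacing ITIMER_REAL with ITIMER_VIRTUAL and SIGALRM with SIGVTALRM would give a timer that only counts down while the process itself is executing, so a process that merely sleeps would never see it expire.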
11.1.6 About Threads
The concept of a thread is closely related to that of a process. Threads can be viewed as different paths of execution through the instructions of a process. In a typical word processor, for example, the main thread handles user input while other threads running in parallel save the user's document in the background when necessary. The basic elements associated with a process include code, data, stack, file I/O, and virtual memory information, so handling processes, and scheduling them in particular, carries considerable overhead. Threads reduce this overhead by sharing these basic elements, which is why they are also called "lightweight processes". Many popular multitasking operating systems support threads.
There are two kinds of threads: "user threads" and "kernel threads". User threads are implemented within the user program without kernel support. Such threads can be implemented even on an operating system like DOS, but the user program itself has to schedule them, somewhat like the cooperative multitasking of Windows 3.x. Kernel threads require the participation of the kernel, which also schedules them. Both models have advantages and disadvantages. User threads add no kernel overhead, but when one thread blocks waiting for I/O, the entire process is switched to the waiting state and its other threads cannot run; kernel threads have no such restriction but consume more system resources. Linux supports multithreading in kernel space, and user-level thread libraries can also be downloaded from the Internet. Linux kernel threads differ from those of other operating systems and are arguably the better design. Most operating systems define threads as a separate entity, which adds complexity to the kernel and the scheduler; Linux instead defines a thread as an "execution context", which is really just another execution context of a process. As a result, the Linux kernel only needs to deal with processes, keeps a single process/thread array, and its scheduler remains a process scheduler. The clone system call can be used to create new threads.
11.1.7 Session and Process Group
Since Linux is a multi-user system, processes belonging to different users exist in the system at the same time. So when a user types the interrupt key (usually Ctrl+C) at a terminal, generating a SIGINT signal, how does the system know which processes to send the signal to, without affecting the processes that another user is running on a different terminal?
The Linux kernel manages the processes of multiple users by maintaining sessions and process groups. As shown in Figure 11-3, every process is a member of a process group, and every process group is a member of a session. In general, a new session begins when a user logs in at a terminal. A process group is identified by its leader: the process identifier of the leader is the identifier of the group. Similarly, each session has a leader process.
The processes in a session are connected to a terminal through the session leader; this terminal is the controlling terminal of the session. A session can have only one controlling terminal, and a controlling terminal can control only one session. Through the controlling terminal, the user can send keyboard-generated signals to the processes of the session that the terminal controls.
A session has only one foreground process group at a time; only processes in the foreground process group can read input from the controlling terminal. All other processes are background processes, which may be divided among several background process groups.
Figure 11-3 Session and Process, Process Group
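The rule that a process group is named after its leader can be checked with the standard setpgid, getpgrp, and getsid calls. In the sketch below, whose function name is made up for this example, a child process puts itself into a new group and verifies that the group identifier equals its own process identifier while the session stays the same. Shells with job control use this kind of call to place each pipeline in its own group.

```c
#define _XOPEN_SOURCE 700         /* expose the POSIX/XSI declarations */
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that moves itself into a new process group.
 * Returns 1 if the child became the leader of its own group (group
 * identifier equal to its PID) while staying in the parent's session,
 * 0 if not, and -1 if fork or waitpid failed. */
int child_can_lead_new_group(void)
{
    pid_t parent_sid = getsid(0);
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        int ok = setpgid(0, 0) == 0          /* new group, led by us */
              && getpgrp() == getpid()       /* group ID is leader's PID */
              && getsid(0) == parent_sid;    /* session is unchanged */
        _exit(ok ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```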
11.2 Process Scheduling
Linux is a monolithic operating system, so a Linux process runs partly in user mode and partly in kernel mode, also called system mode. The switch between modes happens through system calls. System mode has a higher CPU privilege level: it can, for example, read or write any I/O port directly and set key CPU registers. A process in Linux cannot stop the currently running process; it can only wait for the scheduler to select it. Switching processes requires privileged CPU instructions and can therefore be done only in system mode, so each system call gives the scheduler an opportunity to switch processes. For example, when a process has to pause inside a system call (say, while waiting for the user to type a character), the scheduler can choose another process to run. Processes make system calls frequently while they run, so the scheduler has many opportunities to select other processes.
A process may, however, spend a long time on the CPU in user mode without making any system call. If the scheduler could switch only when a process made a system call, CPU time would be allocated unfairly; in the extreme case, a process stuck in an infinite loop would leave the system unable to respond to the user. Linux therefore uses preemptive scheduling: each process may run only for a given period at a time, 200 ms in Linux. When a process has run for 200 ms, the system selects another process to run and the original process waits for its next turn. This period is called a "time slice" in preemptive scheduling.
When the next process has to be chosen, the scheduler selects the most deserving of the runnable processes. A runnable process is one that is waiting only for the CPU; a process that is waiting for some other resource is not runnable.
Linux uses a fairly simple priority-based scheduling algorithm to select the new process. When the scheduler has selected a new process, it must save the context of the current process, including the CPU registers and other related state, in the current process's task_struct, and then restore the CPU registers and context from the task_struct of the selected process, so that the new process can continue executing on the CPU.
For a newly created process, the task_struct is set up with an initial execution context. When the scheduler selects this new process, it first restores the CPU registers from the task_struct; the program counter (PC) then holds exactly the address of the process's first instruction, so the new process runs from the beginning.
To allocate CPU time fairly, the kernel records the information shown in Table 11-4 in the task_struct structure.
Table 11-4 Scheduling-related information in task_struct
Field
Description
policy (scheduling policy)
This is the scheduling policy the system applies to the process. There are two types of processes: normal and real-time. Real-time processes take precedence over all normal processes: whenever a real-time process is runnable, the scheduler selects it. Real-time processes have two scheduling policies, "round robin" and "first in, first out".
priority
This is the priority the system assigns to the process; it can be changed through a system call or the renice command. The priority is actually the amount of time (in jiffies) that the process is allowed to run once it is given the CPU.
rt_priority (real-time priority)
This is the relative priority the system assigns to a real-time process.
counter
This is the amount of time (in jiffies) the process is still allowed to run. It is set to priority when the process first runs and is decremented by 1 on each clock interrupt.
The scheduler runs in the following situations: when the current process is put on a wait queue, and when a system call is about to return to user mode, because by the end of the system call the counter of the current process may have just reached 0. Each time it runs, the scheduler performs the following tasks:
1. The scheduler runs the bottom half handlers and processes the scheduler task queue. These handlers are in effect kernel threads and are described in Chapter 15.
2. The current process must be dealt with before another process is selected. If the scheduling policy of the current process is round robin, it is moved to the tail of the run queue. If the process is in an interruptible wait and has received a signal since it was last scheduled, its state is set to running. If the counter of the current process is 0, its state is also set to running; if its state is already running, it stays that way. If the process is neither running nor in an interruptible wait, it is removed from the run queue, which means the scheduler will not consider it when it selects the most deserving process.
3. The scheduler searches the run queue for the most deserving process, comparing processes by weight. For a real-time process the weight is its counter plus 1000; for a normal process it is its counter. A runnable real-time process is therefore always considered the most deserving. If the current process has the same priority as other runnable processes, it is at a disadvantage because it has already used part of its time slice. If several processes have the same priority, the scheduler chooses the one nearest the front of the run queue, which amounts to round-robin scheduling.
4. If the most deserving process is not the current process, a process switch (also called a process swap) is needed. The switch saves the execution context of the current process and restores the execution context of the new one. The details depend on the CPU type, but note that the scheduler runs in the context of the current process; in addition, it has to set some key CPU registers and flush the hardware caches.
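The selection rule in step 3 fits in a few lines of C. The sketch below is a toy model following the description in the text (weight equals the counter, plus 1000 for a real-time process, with ties going to the front of the run queue); struct toy_sched, toy_weight, and toy_pick_next are hypothetical names, and counters are assumed to be non-negative.

```c
#include <assert.h>
#include <stddef.h>

enum toy_policy { TOY_NORMAL, TOY_REALTIME };

/* Hypothetical subset of the scheduling fields in Table 11-4. */
struct toy_sched {
    enum toy_policy policy;
    long counter;                 /* remaining time slice, in jiffies */
};

/* Weight as described in the text: the counter, plus 1000 for a
 * real-time process. */
long toy_weight(const struct toy_sched *p)
{
    return p->policy == TOY_REALTIME ? p->counter + 1000 : p->counter;
}

/* Return the index of the most deserving entry in the run queue, or -1
 * if the queue is empty. Ties go to the entry nearest the front. */
int toy_pick_next(const struct toy_sched *queue, size_t n)
{
    int best = -1;
    long best_weight = -1;

    assert(queue != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        long w = toy_weight(&queue[i]);
        if (w > best_weight) {    /* strict '>' keeps the earlier entry */
            best_weight = w;
            best = (int)i;
        }
    }
    return best;
}
```

With a queue holding normal processes with counters 20, 15, and 20 and one real-time process with counter 1, the real-time entry wins with weight 1001; without it, the first of the two entries with counter 20 is chosen, which is the round-robin behaviour the text describes.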
The Linux kernel is able to run on symmetric multiprocessing (SMP) systems. In a multiprocessor system, every processor is kept busy running processes. When the process running on a processor exhausts its time slice or enters the waiting state, that processor runs the scheduler on its own to select a new process. Note that every processor has its own idle process as well as its own current process. To keep track of them, the task_struct of a process contains the number of the processor it is currently running on (the processor field) and the number of the processor it last ran on (the last_processor field). A process may be run by a different processor each time it is selected, but switching a process between processors costs slightly more. For this reason, each process also has a processor_mask field: if bit n of the field is 1, the process may run on processor n. With this field, a process can be restricted to a single processor.
11.3 Process Creation
When the system boots, the startup code runs in kernel mode and there is only one process in the system: the initial process. At the end of system initialization, the initial process starts a kernel thread (init) and then sits in an idle loop. Whenever there is no other runnable process, the scheduler runs this idle process. The task_struct of the idle process is the only one that is not dynamically allocated; it is statically defined when the kernel is built and is called init_task.
The init kernel thread/process has process identifier 1 and is the first real process in the system. It performs the initial system setup, such as opening the console and mounting the root file system, and then executes the system initialization program, which may be /etc/init, /bin/init, or /sbin/init. The init program uses /etc/inittab as a script for creating new processes, and these processes may in turn create further processes. For example, a getty process may create a login process to accept a user's login request. See Chapter 16 for details of system startup.
A new process is created by cloning an old process (the current process). The fork and clone system calls are used to create new processes. At the end of either system call, the kernel allocates a new task_struct in physical memory, along with the physical pages for the new process's stack, and assigns the new process a process identifier. The address of the new task_struct is then saved in the task array, and the contents of the old process's task_struct are copied into the new one.
When a process is cloned, Linux allows the two processes to share resources, including files, signal handlers, and virtual memory. When a resource is shared, its reference count is incremented by 1, so that the kernel releases the resource only after both processes have finished with it. Figure 11-4 illustrates a parent and child process sharing open files.
Figure 11-4 Parent and child processes sharing open files
The way the system clones a process's virtual memory is rather clever. A new set of vm_area_struct structures, the new process's own mm_struct, and the new process's page tables must be set up at the start, but none of the virtual memory itself is copied. Copying it would be difficult and slow, since some of the old process's virtual memory may be in physical memory and some in the swap file. Instead, Linux uses a technique called "copy on write": virtual memory is copied only when one of the two processes writes to it, and any page that is never written can be shared between the two processes. Code pages, in fact, are always shared. To implement copy on write, Linux marks the page table entries of writable virtual memory pages as read-only. When a process tries to write to such a page, the processor detects the access violation (a write to a read-only page) and raises a page fault. The operating system can thus trap this write, which the processor regards as "illegal", and make the copy of the memory page. Finally, Linux updates the page tables and virtual memory data structures of both processes.
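The visible effect of this scheme is that parent and child behave as if each had a complete private copy of memory from the moment of the fork. The sketch below, with names chosen for this example, lets a child overwrite a variable and shows that the parent still sees the old value; on a copy-on-write kernel the page holding the variable is only duplicated at the moment of the child's write.

```c
#define _XOPEN_SOURCE 700         /* expose the POSIX/XSI declarations */
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int value_in_data_segment = 42;

/* Fork a child that overwrites its copy of the variable. Returns the
 * value the parent sees afterwards, or -1 if fork or waitpid failed. */
int parent_value_after_child_write(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* The write goes to the child's own copy of the page. */
        value_in_data_segment = 99;
        _exit(value_in_data_segment == 99 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);
    return value_in_data_segment; /* the parent's copy is unchanged */
}
```

The function returns 42 in the parent even though the child has stored 99, because the two processes no longer share the written page.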
11.4 Executing Programs
As in UNIX, programs and commands in Linux are usually executed by a command interpreter called the shell. After the user enters a command, the shell searches the directories listed in the search path (the shell variable PATH) for an executable image whose name matches the command. If it finds one, the shell loads and executes it. The shell first uses the fork system call to create a child process, and the child then replaces the shell binary image it is executing with the executable image file that was found.
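The fork-then-replace sequence can be written down directly with the standard fork, execvp, and waitpid calls. This is a minimal sketch of what a shell does for one command, not the code of any particular shell; run_command and run_shell_line are names invented for this example, and execvp performs the PATH search described above.

```c
#define _XOPEN_SOURCE 700         /* expose the POSIX/XSI declarations */
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given arguments, searching PATH like a shell.
 * Returns the command's exit status, 127 if it could not be executed,
 * or -1 if fork or waitpid failed or the command died from a signal. */
int run_command(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execvp(argv[0], argv);    /* returns only on failure */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Convenience wrapper: run one command line through sh -c. */
int run_shell_line(const char *line)
{
    char *const argv[] = { "sh", "-c", (char *)line, NULL };
    return run_command(argv);
}
```

For example, run_shell_line("exit 3") returns 3. The parent waits with waitpid, which is what a shell does for a foreground command; for a background command it would simply not wait.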
An executable file can be a binary file in one of several formats, or a text script. An executable image file contains executable code and data, together with the information the operating system needs to load the image into memory and execute it. The executable formats most commonly used by Linux are ELF and a.out, but in principle Linux is flexible enough to load executable files in almost any format.
11.4.1 ELF
ELF stands for "Executable and Linkable Format" and was developed by UNIX System Laboratories. It is the most commonly used format in Linux. Compared with other formats (such as a.out or ECOFF), ELF incurs somewhat more overhead when an image is loaded into memory, but it is more flexible. An ELF executable contains executable code and data, usually called text and data. The file also contains tables that tell the kernel how to organize the process's virtual memory, a definition of the memory layout, and the address of the first instruction to execute.
Consider the following simple program, compiled and linked into an ELF executable:
#include <stdio.h>

int main(void)
{
    printf("Hello World!\n");
    return 0;
}
As shown in FIGS. 11-5, it is the format of the source code after compiling the ELF executable. As can be seen from Figures 11-5, the beginning of the ELF executable image file is three characters 'E', 'L', and 'f', as the identifier of such files. E_ENTRY Defines the virtual address of the initial execution instruction after the program is loaded. This simple ELF image uses two "physical head" structures to define code and data, and E_PHNUM is the number of physical head information contained in this file. This example is 2. E_PHYOFF is the offset of the first physical head structure in the file, and E_PHENTSIZE is the size of the physical head structure, which is started from the file header. According to the above two information, the kernel can correctly read the information in two physical head structure.
The p_flags field of a physical header defines the access attributes of the corresponding code or data. In the figure, the first p_flags field is PF_X and PF_R (executable and readable), which indicates that this header describes the program's code; likewise, the second physical header describes the program's data, which is readable. p_offset gives the offset of the corresponding code or data after the physical headers. p_vaddr gives the starting virtual address of the code or data. p_filesz and p_memsz give the size of the code or data in the file and in memory, respectively. In our simple example, the program code begins immediately after the two physical headers, and the program data begins at byte 0x68533 after the physical headers; clearly, the program data immediately follows the program code. The code size is 0x68532, which is relatively large because the linker has linked the code of the C library function printf into the ELF file. The file size and memory size of the program code are the same, whereas the file size and memory size of the program data differ. This is because only the first 2200 bytes of the in-memory data are pre-initialized, with values taken from the ELF image; the following 2048 bytes are initialized by the executing code.
Figure 11-5 Layout of a simple ELF executable
As described in the previous chapter, Linux uses demand paging to load program images. When the shell creates a child process with the fork system call, the child calls an exec system call (there are in fact several exec variants), and exec loads the ELF image using the ELF binary format loader. Once the loader has verified that the image is a valid ELF file, it flushes the current process's executable image (in fact the image of the parent, or old, program) from virtual memory, resets all signal handlers, closes all open files (the f_count reference count in the corresponding file structure is decremented by 1; if the count reaches 0, the kernel releases the file object), and resets the process page tables. After that, the loader only needs to set the start and end addresses of the image's code and data according to the information in the ELF file, set up the corresponding virtual memory areas, and modify the process page tables. The current process can then begin executing the instructions in the ELF image.
Unlike a statically linked library, a dynamically linked library is linked into the process's virtual address space at run time. Multiple processes that use the same dynamic library need only one copy of the shared library in memory, which saves memory. When a shared library has to be linked into a process's virtual address space at run time, the Linux dynamic linker uses the symbol tables in the ELF shared library to do the linking; all dynamic library routines referenced by the ELF image are defined in these symbol tables. The Linux dynamic linkers are found in the /lib directory, typically as ld.so.1, libc.so.1 and ld-linux.so.1.
11.4.2 Script file
A script file is a set of executable commands that are interpreted and executed by a specified interpreter. Common interpreters in Linux include wish, perl, and command shells such as bash.
In general, the first line of a script file names the interpreter for the script, for example:
#!/usr/bin/wish
This line specifies wish as the interpreter for the script. The script binary loader uses this information to locate the interpreter; if the interpreter is found, the loader loads and executes it in the same way as the ELF program described above. The script file name becomes the first argument passed to the interpreter, the original first argument becomes the second, and so on. Once the interpreter has been given the correct arguments, it executes the script.
11.5 Signals
Signals are one of the oldest inter-process communication mechanisms in UNIX systems and are mainly used to deliver asynchronous events to a process. A keyboard interrupt can generate a signal, as can a floating-point overflow or a memory access error. The shell usually uses signals to send job control commands to its child processes. In Linux, the number of signal types depends on the platform: the kernel represents all signals in a single word, so the number of bits in a word is the maximum number of signal types. On the 32-bit i386 platform a word is 32 bits, so there are 32 signals; on the 64-bit Alpha AXP platform a word is 64 bits, so there can be up to 64. The most common signals defined by the Linux kernel, with their C macro names and uses, are listed in Table 11-5:
Table 11-5 Common signals and their use
Value   C macro    Use
1       SIGHUP     Hangup signal from the terminal
2       SIGINT     Interrupt signal from the keyboard (Ctrl-C)
3       SIGQUIT    Quit signal from the keyboard (Ctrl-\)
8       SIGFPE     Floating-point exception (such as floating-point overflow)
9       SIGKILL    Terminates the process that receives it
14      SIGALRM    Sent when the process's timer expires
15      SIGTERM    Signal sent by the kill command
17      SIGCHLD    Tells the parent that a child process has stopped or ended
19      SIGSTOP    Stop-execution signal from a debugger (Ctrl-Z at the keyboard sends the related SIGTSTP)
A process can choose how to respond to a given signal. The options are:
Ignore the signal. A process can ignore a generated signal, except that SIGKILL and SIGSTOP cannot be ignored.
Block the signal. A process can choose to block certain signals.
Handle the signal itself. A process can register the address of a handler routine with the system; when the signal is generated, the registered handler processes it.
Let the kernel apply the default handling. The signal is handled by the kernel's default handler. In most cases, signals are handled by the kernel.
Note that the Linux kernel has no mechanism for prioritizing signals. That is, when several signals are pending, the process may receive and handle them in any order. In addition, if the same signal is generated again before the process has handled the first instance, the process receives it only once. Both effects follow from the way the kernel implements signals, as explained below.
In the task_struct structure, the system uses two words to record the currently pending signals (signal) and the currently blocked signals (blocked). A pending signal is one that has been generated but not yet handled. A blocked signal is one that the process is temporarily not handling. If a blocked signal is generated, it stays pending until it is unblocked. All signals except SIGKILL and SIGSTOP can be blocked, and blocking is done through system calls. Each process's task_struct also contains a pointer to an array of sigaction structures, which specifies how the process handles every signal. If a sigaction structure contains the address of a handling routine, the signal is handled by that routine; otherwise, a flag in the structure tells the kernel either to apply the default handling or simply to ignore the signal. Through system calls, a process can modify the sigaction array to specify how it handles signals.
A process cannot send signals to every process in the system. In general, apart from the kernel and the superuser, an ordinary process can send signals only to processes with the same UID and GID, or to processes in the same process group. When a signal is generated, the kernel sets the corresponding bit in the signal word of the target's task_struct to 1. The kernel takes no special action if the bit was already 1, so the process cannot tell that the earlier instance of the signal occurred. If the process does not currently block the signal and is in the interruptible wait state, the kernel changes its state to running and puts it on the run queue, so that the scheduler may select it to run and let it handle the signal. A signal sent to a process is not handled immediately; rather, it is handled only when the process next runs. Each time a process returns from a system call, the kernel checks its signal and blocked fields. If any unblocked signal is pending, the kernel handles it according to the information in the sigaction array, as follows:
1. Check the corresponding sigaction structure. If the signal is not SIGKILL or SIGSTOP and is marked as ignored, it is not processed.
2. If the signal is to be handled by the default handler, the kernel handles it; otherwise go to step 3.
3. The signal is handled by the process itself. The kernel modifies the call stack frame of the current process and sets the process's program counter to the entry address of the signal handler. Execution then jumps to the signal handler, and when the handler returns, the process resumes its user-mode execution.
Linux is POSIX compatible, so a process can modify its blocked mask while handling a signal. When the signal handler returns, however, blocked must be restored to its original value, and the kernel takes care of this: Linux adds a call to a cleanup routine to the process's call stack frame, and that routine restores the original blocked mask. While the kernel is processing signals, several signals may need to be handled by user handlers at the same time. In that case, the Linux kernel can push the addresses of all the signal handlers onto the stack frame and call the cleanup routine to restore the original blocked value once all of them have finished.
11.6 Pipes
The pipe is the most commonly used IPC mechanism in Linux. With a pipe, the output of one process becomes the input of another. This mechanism is especially useful when the amount of data passed between the processes is large: without pipes, the data would have to be passed through a file, which would waste a great deal of space and time.
In Linux, a pipe is implemented by pointing two file structures at the same temporary VFS inode, which in turn points to a physical page in memory, as shown in Figure 11-6.
Figure 11-6 Pipeline diagram
In Figure 11-6, each file data structure contains the addresses of a different set of file operation routines: one set writes data to the pipe, the other reads data from it. The user program therefore still uses the ordinary file system calls, while the kernel uses this abstraction to implement the special behaviour of pipes. The pipe write function writes data by copying bytes into the physical memory that the VFS inode points to, and the pipe read function reads data by copying bytes out of that memory. The kernel must of course synchronize access to the pipe, and for this it uses locks, wait queues and signals.
When a writer writes to the pipe, it uses the standard write library function. The system finds the file structure for the pipe from the file descriptor passed to the library function. The file structure holds the address of the function that performs writes (the write function), so the kernel calls that function to carry out the operation. Before writing to memory, the write function first checks the information in the VFS inode, and performs the actual memory copy only if both of the following conditions are met:
There is enough space in memory to hold all the data to be written;
The memory is not locked by a reader.
If these conditions are met, the write function first locks the memory and then copies the data into it from the writer's address space. Otherwise, the writer sleeps on the wait queue of the VFS inode; the kernel then calls the scheduler, which selects another process to run. The writer is in fact in the interruptible wait state. When there is enough space in memory to hold the data, or when the memory is unlocked, the reader wakes the writer, which is signaled and resumes. After the data has been written, the memory is unlocked and all readers sleeping on the inode are woken.
Reading from a pipe is similar to writing. However, depending on the mode in which the file or pipe was opened, a process may be allowed to read without blocking: if there is no data, or the memory is locked, an error is returned immediately instead of blocking the process. Otherwise, the process sleeps on the inode's wait queue until a writer writes data. When all processes have finished with the pipe, the pipe's inode is discarded and the shared data page is released.
Linux also supports another form of pipe, called a named pipe or FIFO, because it works on the "first in, first out" principle. The kind of pipe described above is also known as an "anonymous pipe". In a named pipe, the data written to the pipe first is the data read from it first. An anonymous pipe is a temporary object, whereas a FIFO is a real entity in the file system and can be created with the mkfifo command. Any process with sufficient permissions can use a FIFO. The data structures of FIFOs and anonymous pipes are very similar; the main difference is that a FIFO already exists before it is used, and users open and close it, whereas an anonymous pipe exists only while it is in use and is therefore a temporary object.
11.7 System V IPC mechanisms
For compatibility with other systems, Linux also provides the three IPC mechanisms that first appeared in UNIX System V: message queues, semaphores, and shared memory. The System V IPC mechanisms have the following main characteristics:
To access a System V IPC object, a process must pass the object's unique reference identifier in the system call.
Access to System V IPC objects is checked in a way similar to file access permissions. The access rights of an object are set by its creator through a system call.
Each IPC mechanism uses the object's reference identifier as an index into a table of objects, although some manipulation is needed to generate the index.
In Linux, every data structure that represents a System V IPC object contains an ipc_perm structure, which holds the user and group identifiers of the object's owner and creator, the access mode of the object, and the object's access key. The access key is used to locate the reference identifier of a System V IPC object. The system supports two kinds of access key: public and private. If the key is public, any process in the system can find the reference identifier of the object, subject to a permission check. The object itself, however, can only be referenced through its reference identifier, never through its key.
Linux implements these IPC mechanisms in similar ways. We begin with message queues and semaphores.
11.7.1 Message queues
One or more processes can write messages to a message queue, and one or more processes can read messages from it. This mechanism is typically used in the client/server model: a client sends a request message to the server, and the server reads the message and carries out the request. In many microkernel architectures, message queues are the basic means of communication between the kernel and its components. In the Minix operating system, for example, the kernel, the I/O tasks, the server processes and the user processes all communicate through message queues.
Linux maintains a list, msgque, of all message queues in the system. Each pointer in the list points to a msqid_ds structure, which fully describes one message queue. When a message queue is created, the system allocates a msqid_ds structure from memory and adds a pointer to it to the msgque list.
Figure 11-7 is a schematic diagram of the msqid_ds structure. As the figure shows, each msqid_ds structure contains an ipc_perm structure and pointers to the messages in the queue; the messages themselves form a linked list. Linux also records queue modification times in the msqid_ds structure, together with two wait queues: one for processes writing to the queue and one for processes reading from it.
Figure 11-7 System V IPC mechanism - message queues
Writing to and reading from a message queue are similar, so we take writing as the example. The steps are as follows:
1. When a process wants to write a message, its effective UID and GID are first compared with the access mode in ipc_perm. If the process is not allowed to write, the system call returns an error and the write operation ends.
2. If the process may write to the queue, the message can be copied to the end of the queue. Before copying, the kernel must determine whether the queue is currently full. The content of the message is a matter for the communicating processes.
3. If the queue currently has no room for the message, the writer is added to the queue's write wait queue. Otherwise, the kernel allocates a msg structure, copies the message from the process's address space into it, and adds the msg structure to the end of the queue; the system call then returns successfully and the write operation ends.
4. If the writer has been placed on the wait queue, the scheduler is called and selects another process to run.
When a process reads a message from the queue, the system wakes the processes on the write wait queue.
Reading is similar to writing, except that the reader enters the wait state when there is no message to read.
11.7.2 Semaphores
The concept of the semaphore was first proposed by E. W. Dijkstra in 1965. A semaphore is essentially an integer on which processes perform two kinds of operation, called down and up. A down operation decreases the value of the semaphore by 1, and an up operation increases it by 1. Before performing the actual operation, a process first checks the current value of the semaphore: if the value is greater than 0, the down operation can proceed; otherwise the process sleeps and waits for another process to perform an up operation, which increases the value of the semaphore so that the down operation can complete. After one process has successfully operated on a semaphore, other processes sleeping on it may be able to complete their own operations, and the system is responsible for checking whether they can.
To understand semaphores, imagine a ticket booking system. At first there are generally enough tickets to satisfy demand. When only one ticket remains and a passenger wants to book two, the request cannot be satisfied. The ticket clerk then asks the passenger to leave a phone number so that, if someone returns a ticket, this passenger is served first. When someone does return a ticket, the clerk phones the passenger who wanted two tickets, and the passenger can now book them. We can regard the passengers as processes, booking as a down operation on the semaphore, and a refund as an up operation; the initial value of the semaphore is the total number of tickets, and the clerk plays the role of the operating system's semaphore manager. She (the operating system) decides whether a passenger (process) can complete the operation, and when conditions change she notifies (wakes up) the registered (sleeping) passengers (processes).
In an operating system, the simplest form of semaphore is an integer whose value several processes can test and set. The test-and-set operation cannot be interrupted, which is why it is called an "atomic" operation. Its result is the sum of the semaphore's current value and the set value, which may be positive or negative. Depending on the result, the process performing the operation may have to sleep; when other processes complete their own test-and-set operations, the system checks whether the sleeping process can now complete its operation with the new semaphore value. In this way, semaphores coordinate the work of several processes.
Semaphores can be used to implement "critical sections", that is, pieces of code that only one process at a time may execute. They can also solve the classic "producer-consumer" problem, which resembles the ticket booking example above. The problem can be described as follows:
Two processes share a common, fixed-size buffer. One of them, the producer, puts information into the buffer; the other, the consumer, takes information out of it (the problem generalizes to M producers and N consumers). When the producer wants to put information into a buffer that is full, it goes to sleep, and it is woken when the consumer takes information out of the buffer. Likewise, when the consumer wants to take information from a buffer that is empty, it goes to sleep, and the producer wakes it after writing to the buffer.
Linux uses the semid_ds structure to represent System V IPC semaphores; see Figure 11-8. As with message queues, all semaphore objects in the system are kept in a list, semary, each element of which points to a semid_ds structure. As Figure 11-8 shows, the sem_base field of semid_ds points to an array of semaphores, and processes that are allowed to do so can operate on these semaphores through system calls. A system call can specify several operations, each described by three values: the semaphore index, the operation value, and the operation flags. The semaphore index locates the semaphore in the array; the operation value is a number to be added to the semaphore's current value. Linux first tests whether all the operations would succeed, using the following rule: an operation succeeds if the sum of the operation value and the semaphore's current value is greater than 0, or if both the operation value and the current value are 0. If any one of the operations specified in the system call would fail, Linux suspends the process. However, if the operation flags say that the process must not be suspended, the system call returns, indicating that the semaphore operation did not succeed, and the process continues to run. If the process is suspended, Linux must save the state of the semaphore operations to be performed and put the current process on a wait queue. To do this, Linux builds a sem_queue structure on the stack and fills it in. The new sem_queue structure is added to the wait queue of the semaphore object (using the sem_pending and sem_pending_last pointers). The current process is placed on the wait queue in the sem_queue structure (sleeper), and the scheduler is then called to select another process to run.
Figure 11-8 System V IPC mechanism - semaphores
If all the semaphore operations would succeed, the current process can continue to run. Before that, Linux actually applies the operations to the corresponding elements of the semaphore array. Linux then checks every waiting or suspended process to see whether its semaphore operations can now succeed. If they can, Linux removes the process from the pending queue and applies its operations to the semaphore array. Linux also wakes the sleeping process so that it can run the next time the scheduler runs. After applying the new operations to the semaphore array, Linux checks the pending queue again, and continues until no further operation can succeed or no pending process remains.
Semaphore operations are also subject to "deadlock". This happens when a process has changed the value of a semaphore on entering a critical section but, because it crashed, never leaves the critical section. The other processes suspended on that semaphore will then never be able to run; this is the so-called deadlock. Linux avoids the problem by maintaining a list of adjustments to the semaphore arrays.
11.7.3 Shared memory
As seen in Chapter 10, a process's virtual addresses can be mapped to any physical address, so if the virtual addresses of two processes are mapped to the same physical address, the two processes can communicate through those virtual addresses. Once memory is shared, however, access to it must be synchronized by another IPC mechanism, such as semaphores. Shared memory in Linux is reached through an access key and is subject to access checks. The creator of a shared memory object controls its access rights and decides whether its key is public or private. A process with sufficient permissions can also lock the shared memory into physical memory.
Figure 11-9 shows the structure of a shared memory object in Linux. As with message queues and semaphores, Linux maintains a list of all shared memory objects.
Figure 11-9 System V IPC mechanism - shared memory
Referring to Figure 11-9, the elements of a shared memory object are as follows:
shm_segsz: the size of the shared memory;
times: the number of processes using the shared memory;
attaches: describes the virtual memory areas through which the shared physical memory is mapped into each process;
shm_npages: the number of shared virtual memory pages;
shm_pages: the list of page table entries pointing to the shared virtual memory pages.
To use shared memory, each process taking part in the communication issues a system call that adds the virtual address area it wants to share to the attaches list.
A page fault is generated the first time a process accesses the shared virtual memory. Linux then finds the vm_area_struct structure describing that memory, which contains the addresses of the routines that handle this kind of shared virtual memory. The shared memory page fault handler looks in the page table entry list of the shmid_ds structure to see whether an entry already exists for the shared virtual memory page. If not, the system allocates a physical page and creates a page table entry for it. The entry is added to the process's page table and also recorded in the shmid_ds structure. From then on, when another process accesses the shared memory, the shared memory page fault handler uses the same physical page and only adds the page table entry to that process's page table. In this way, the two processes communicate through the same physical page.
When a process no longer wants to share its virtual memory, it removes its virtual address area from the list and its page table is updated. After the last process has released its virtual address area, the system frees the allocated physical pages.
Shared memory may also be swapped out to swap space if the shared virtual memory is not locked into physical memory.
11.8 Sockets
Sockets differ from the IPC mechanisms above in that they can also be used for communication between computers. Sockets are discussed in Chapter 14.
11.9 Related system tools and system calls
11.9.1 System tools
Linux has three main commands for viewing the processes running in the system. ps reports process status; pstree prints the parent-child relationships between processes; top monitors the processes with the highest CPU utilization and lets you operate on processes interactively.
The kill command sends a signal to a specified process. If no signal is specified, it sends SIGTERM, whose default action is to terminate the process. To see all the signal numbers supported by the system, use the kill -l command to print the signal list.
mkfifo creates a named pipe (FIFO).
11.9.2 System calls
Table 11-7 briefly lists the system calls related to processes and inter-process communication. The meaning of each letter in the Flags column is given in Table 10-1.
Table 11-7 Related system calls
System call                 Description                                                     Flags
alarm                       Send a SIGALRM signal after a specified time                    M C
clone                       Create a child process                                          M-
execl, execlp, execle, ...  Execute an image                                                M ! C
execve                      Execute an image                                                M C
exit                        Terminate the process                                           M C
fork                        Create a child process                                          M C
fsync                       Write the file cache to disk                                    MC
ftime                       Get the time zone and the seconds since 1970-01-01              M! C
getegid                     Get the effective group ID                                      M C
geteuid                     Get the effective user ID                                       M C
getgid                      Get the real group ID                                           M C
getitimer                   Get the value of an interval timer                              MC
getpgid                     Get the process group ID of a specified process                 C
getpgrp                     Get the process group ID of the current process                 M C
getpid                      Get the process ID of the current process                       M C
getppid                     Get the process ID of the parent process                        M C
getpriority                 Get the priority of a process, process group, or user           MC
gettimeofday                Get the time zone and the seconds since 1970-01-01              MC
getuid                      Get the real user ID                                            M C
ipc                         Inter-process communication                                     -C
kill                        Send a signal to a process                                      M C
killpg                      Send a signal to a process group                                M! C
modify_ldt                  Read or write the local descriptor table                        -
msgctl                      Message queue control                                           M! C
msgget                      Get a message queue identifier                                  M! C
msgrcv                      Receive a message                                               M! C
msgsnd                      Send a message                                                  M! C
nice                        Modify the process priority                                     MC
pause                       Put the process to sleep until a signal arrives                 M C
pipe                        Create a pipe                                                   M C
semctl                      Semaphore control                                               M! C
semget                      Get the identifier of a semaphore array                         M! C
semop                       Operate on the members of a semaphore array                     M! C
setgid                      Set the real group ID                                           M C
setitimer                   Set an interval timer                                           MC
setpgid                     Set the process group ID                                        M C
setpgrp                     Create a new process group with the calling process as leader   M C
setpriority                 Set the priority of a process, process group, or user           MC
setsid                      Establish a new session                                         M C
setregid                    Set the real and effective group IDs                            MC
setreuid                    Set the real and effective user IDs                             MC
settimeofday                Set the time zone and the seconds since 1970-01-01              MC
setuid                      Set the real user ID                                            M C
shmat                       Attach shared memory                                            M! C
shmctl                      Shared memory control                                           M! C
shmdt                       Detach shared memory                                            M! C
shmget                      Get or create shared memory                                     M! C
sigaction                   Set or get a signal handler                                     M C
sigblock                    Block signals                                                   M! C
siggetmask                  Get the signal blocking mask of the current process             ! c
signal                      Set a signal handler                                            MC
sigpause                    Use a new signal blocking mask until a signal is handled        MC
sigpending                  Get the signals that are pending and blocked                    M C
sigprocmask                 Set or get the signal blocking mask of the current process      C
sigsetmask                  Set the signal blocking mask of the current process             C!
sigsuspend                  Replacement for sigpause                                        M C
sigvec                      See sigaction                                                   M
ssetmask                    See sigsetmask                                                  M
system                      Execute a shell command                                         M! C
time                        Get the number of seconds since 1970-01-01                      M C
times                       Get the CPU times of the process                                M C
vfork                       See fork                                                        M! C
wait                        Wait for a process to terminate                                 M C
wait3, wait4                Wait for the specified process to terminate (BSD)               MC
waitpid                     Wait for the specified process to terminate                     M C
vm86                        Enter virtual 8086 mode                                         M-C