A classic article on Linux threads and processes

xiaoxiao2021-03-06  52

A process is one execution of a program on a computer. When you run a program, you start a process. The program is static; the process is dynamic. Processes can be divided into system processes and user processes. System processes carry out the functions of the operating system (they are the operating system itself in operation); user processes are all the processes that you start yourself. The process is the unit by which the operating system allocates resources. Under Windows, a process is further divided into threads, which are smaller units that can run independently.

A classic article on Linux threads and processes  http://www.douzhe.com  Author: hjzgq  Posted: 2003-08-06 11:54:40

I have read many articles about Linux processes and threads, and I think this one is the most classic.

I. Basic knowledge: threads and processes

By the textbook definition, a process is the smallest unit of resource management and a thread is the smallest unit of execution. In operating system design, threads evolved from processes mainly to support SMP better and to reduce (process/thread) context-switching overhead.

However the division is made, a process needs at least one thread to execute its instructions. The process manages resources (CPU, memory, files and so on) and assigns its threads to a CPU. A process can of course have several threads; if it runs on an SMP machine, it can then use several CPUs at once to run them and reach the greatest possible parallelism. Even on a single-CPU machine, a program can still benefit from being designed with the multi-threaded model.
Just as replacing the single-process model with the multi-process model once did, the multi-threaded model makes the design more concise, the functionality more complete and the program more efficient, for example when several threads respond to several inputs. The same functionality could be implemented with the multi-process model, but a thread context switch costs much less than a process context switch. Semantically, responding to several inputs at the same time means sharing every resource except the CPU.

Corresponding to these two benefits of the thread model, two thread models were developed: kernel-level threads and user-level threads. The classification criterion is whether the thread scheduler sits inside or outside the kernel. The former makes better use of multiprocessor resources; the latter is more concerned with context-switching overhead. Current commercial systems usually combine the two: they provide kernel threads to meet the needs of SMP systems and also support a second thread mechanism implemented in user space, in which case one kernel thread also acts as the scheduler of several user-space threads. As with many technologies, "mixing" usually brings higher efficiency but is also harder to implement. Following a "keep it simple" design philosophy, Linux never set out to implement the hybrid model, although its implementation uses a "hybrid" of a different kind.

As for the concrete implementation, threads can be implemented in the operating system kernel or outside it. The latter obviously requires that the kernel implements at least processes, and the former generally requires that the kernel supports processes as well. The kernel-level thread model clearly needs the former kind of support, while the user-level thread model is not necessarily built on the latter.
This difference, as mentioned earlier, arises because the two classifications use different criteria. When the kernel supports both processes and threads, the "many-to-many" thread-process model can be implemented: a thread of a process is scheduled by the kernel, and at the same time it can act as the scheduler of a pool of user-level threads, selecting a suitable user-level thread to run in its own context. This is the "hybrid" thread model mentioned earlier, which both meets the needs of multiprocessor systems and keeps scheduling overhead to a minimum.

The great majority of commercial operating systems (such as Digital Unix, Solaris and Irix) use this thread model, which can fully implement the POSIX1003.1c standard. Threads implemented outside the kernel can be divided into "one-to-one" and "many-to-one" models. The former maps each thread to one kernel process (possibly a lightweight process), so thread scheduling is equivalent to process scheduling and is left to the kernel. The latter implements multithreading entirely outside the kernel, and scheduling is also done in user space. The latter is the implementation of the pure user-level thread model mentioned above. Because signals (whether synchronous or asynchronous) are delivered per process, they cannot be directed to a particular thread, so this implementation cannot be used on multiprocessor systems. As that need keeps growing, pure user-level thread implementations have in practice almost disappeared, except for algorithm research.

The Linux kernel only provides support for lightweight processes, which limits the implementation of more efficient thread models, but Linux has concentrated on optimizing the scheduling overhead of processes, which compensates for this shortcoming to some extent. LinuxThreads, currently the most popular thread mechanism, uses the thread-process "one-to-one" model: scheduling is left to the kernel, and a thread management mechanism that includes signal handling is implemented in user space. How Linux and LinuxThreads work together is the focus of this article.

II. Lightweight process implementation in the Linux 2.4 kernel

The original definition of a process comprised three parts: the program, the resources, and the execution. The program usually means the code. At the operating system level, the resources usually include memory, I/O, signal handling and so on. The execution of the program is usually understood as the execution context, including the use of the CPU, and this part later developed into the thread.
Before the thread concept appeared, operating system designers gradually revised the concept of the process in order to reduce process-switching overhead. They allowed the resources owned by a process to be separated from the process itself, so that some processes could share part of their resources, such as files, signals, data memory and even code. This led to the concept of the lightweight process. The Linux kernel has implemented lightweight processes since version 2.0.x. An application uses the unified clone() system call interface and chooses, through different parameters, whether to create a lightweight process or a normal process.
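The choice between a lightweight process and a normal process can be tried out directly with the glibc clone() wrapper. The following is a minimal sketch, assuming a current Linux system with glibc; the names run_clone_demo and child_fn and the 64 KB stack size are illustrative and do not come from the article or from LinuxThreads.

```c
#define _GNU_SOURCE          /* needed for clone() and the CLONE_* flags */
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)  /* arbitrary size chosen for this sketch */

/* Entry point of the child: store 42 through the pointer it receives. */
static int child_fn(void *arg)
{
    *(volatile int *)arg = 42;
    return 0;
}

/*
 * Create one child with clone(), wait for it, and return what the parent
 * then sees in its own variable (-1 on error).
 * share != 0: lightweight process sharing memory, fs info, files, handlers.
 * share == 0: ordinary process with its own copy of everything.
 */
int run_clone_demo(int share)
{
    volatile int value = 0;
    int flags = SIGCHLD;  /* exit signal, so waitpid() works as after fork() */
    char *stack;
    pid_t pid;

    if (share)
        flags |= CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND;

    stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downwards on common architectures, so pass its top. */
    pid = clone(child_fn, stack + CHILD_STACK_SIZE, flags, (void *)&value);
    if (pid == -1 || waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);
    return value;
}
```

With share set, the child writes into the parent's address space, so the parent should observe 42; without it, the child only changes its private copy and the parent should still see 0.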

In the kernel, clone() calls do_fork() after its parameters have been passed and interpreted. This kernel function is also the final implementation of the fork() and vfork() system calls:

    int do_fork(unsigned long clone_flags, unsigned long stack_start,
                struct pt_regs *regs, unsigned long stack_size)

Here clone_flags is the bitwise OR of the following macros:

    #define CSIGNAL        0x000000ff /* signal mask to be sent at exit */
    #define CLONE_VM       0x00000100 /* set if VM shared between processes */
    #define CLONE_FS       0x00000200 /* set if fs info shared between processes */
    #define CLONE_FILES    0x00000400 /* set if open files shared between processes */
    #define CLONE_SIGHAND  0x00000800 /* set if signal handlers and blocked signals shared */
    #define CLONE_PID      0x00001000 /* set if pid shared */
    #define CLONE_PTRACE   0x00002000 /* set if we want to let tracing continue on the child too */
    #define CLONE_VFORK    0x00004000 /* set if the parent wants the child to wake it up on mm_release */
    #define CLONE_PARENT   0x00008000 /* set if we want to have the same parent as the cloner */
    #define CLONE_THREAD   0x00010000 /* Same thread group? */
    #define CLONE_NEWNS    0x00020000 /* New namespace group? */
    #define CLONE_SIGNAL   (CLONE_SIGHAND | CLONE_THREAD)

In do_fork(), different clone_flags lead to different behavior. LinuxThreads calls clone() with (CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND) to create a "thread", which means shared memory, shared file system information, a shared file descriptor table and shared signal handling. This section looks at how the Linux kernel implements the sharing of each of these resources.

1. CLONE_VM

do_fork() calls copy_mm() to set the mm and active_mm fields of task_struct. These two mm_struct pointers correspond to the memory space associated with the process.
If CLONE_VM is specified for do_fork(), copy_mm() sets mm and active_mm in the new task_struct to the same values as those of current, and increments the number of users of that mm_struct (mm_struct::mm_users).

In other words, the lightweight process shares the memory address space of its parent process.

[Figure: position of mm_struct within the process]

2. CLONE_FS

task_struct uses fs (struct fs_struct *) to record the root directory and current directory of the file system in which the process lives. do_fork() calls copy_fs() to copy this structure. For a lightweight process, only the fs->count counter is incremented, and the same fs_struct is shared with the parent process. In other words, a lightweight process has no file system information of its own, and when any thread in the process changes the current directory, the root directory or similar information, all other threads are directly affected.

3. CLONE_FILES

A process may have some files open. The files field (struct files_struct *) of task_struct holds the information about the file structures (struct file) the process has opened, and do_fork() calls copy_files() to handle this process attribute. A lightweight process shares this structure with its parent; copy_files() only increments the files->count counter. This sharing lets any thread access the open files maintained by the process, and operations on them are directly visible to the other threads in the process.

4. CLONE_SIGHAND

Every Linux process can define its own handling of signals. This configuration is stored as an array of struct k_sigaction in the sig field (struct signal_struct) of task_struct, and copy_sighand() in do_fork() is responsible for copying it. For a lightweight process it is not copied; only the signal_struct::count counter is incremented, and the structure is shared with the parent process. In other words, the child's signal handling is identical to the parent's, and each can change it for the other.

do_fork() does a great deal more work, which is not described in detail here.
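The four cases above follow the same pattern: with the flag set, the child points at the parent's structure and a use counter goes up; without it, the child gets a private copy. The following is a simplified user-space model of that pattern, not kernel code. The types struct resource and struct task and the functions copy_or_share and model_fork are hypothetical stand-ins for mm_struct, fs_struct, files_struct, signal_struct and the copy_*() helpers; the flag values are the ones listed in the article.

```c
#include <stdlib.h>

/* Flag values as listed in the article's clone_flags table. */
#define CLONE_VM      0x00000100
#define CLONE_FS      0x00000200
#define CLONE_FILES   0x00000400
#define CLONE_SIGHAND 0x00000800

/* Hypothetical stand-in for mm_struct, fs_struct, files_struct, signal_struct. */
struct resource {
    int count;    /* plays the role of mm_users / fs->count / files->count */
    int payload;  /* whatever state the real structure would hold */
};

/* Hypothetical stand-in for the relevant part of task_struct. */
struct task {
    struct resource *mm, *fs, *files, *sig;
};

/* Share the parent's structure (bump the counter) or make a private copy. */
static struct resource *copy_or_share(struct resource *parent, int share)
{
    struct resource *copy;

    if (share) {
        parent->count++;
        return parent;
    }
    copy = malloc(sizeof *copy);
    if (copy != NULL) {
        copy->payload = parent->payload;
        copy->count = 1;
    }
    return copy;
}

/* Model of the resource handling in do_fork(); returns 0 on success, -1 on failure. */
int model_fork(const struct task *parent, struct task *child,
               unsigned long clone_flags)
{
    child->mm    = copy_or_share(parent->mm,    (clone_flags & CLONE_VM) != 0);
    child->fs    = copy_or_share(parent->fs,    (clone_flags & CLONE_FS) != 0);
    child->files = copy_or_share(parent->files, (clone_flags & CLONE_FILES) != 0);
    child->sig   = copy_or_share(parent->sig,   (clone_flags & CLONE_SIGHAND) != 0);
    return (child->mm && child->fs && child->files && child->sig) ? 0 : -1;
}
```

Calling model_fork() with the LinuxThreads flag combination leaves every pointer of the child equal to the parent's and every counter at 2, while calling it with 0 yields four fresh copies with a count of 1.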
On SMP systems, every forked process is initially assigned to the same CPU as its parent, and a CPU is only selected when the process is scheduled.

Although Linux supports lightweight processes, it cannot be said to support kernel-level threads, because Linux "threads" and "processes" are actually at the same scheduling level and share one process identifier space. This limitation makes it impossible to implement a POSIX thread mechanism in the full sense on Linux, so the many Linux thread library implementations can only implement as much of the POSIX semantics as possible and approximate it functionally as closely as they can.

III. The LinuxThreads thread mechanism

LinuxThreads is currently the most widely used thread library on the Linux platform and is distributed as part of glibc. It is based on the "one-to-one" thread model built on kernel lightweight processes: each thread entity corresponds to one kernel lightweight process, and the management of threads is implemented in a user-space library.

1. Thread description data structure and implementation limits

LinuxThreads defines a struct _pthread_descr_struct data structure to describe a thread, and uses the global array variable __pthread_handles to describe and reference the threads belonging to the process.

In the first two entries of __pthread_handles, LinuxThreads defines two global system threads, __pthread_initial_thread and __pthread_manager_thread, and uses __pthread_main_thread to denote the parent thread of __pthread_manager_thread (initially __pthread_initial_thread).

struct _pthread_descr_struct is a doubly linked circular list structure. The list containing __pthread_manager_thread has only that one element; in fact, __pthread_manager_thread is a special thread, and LinuxThreads uses only three of its fields: errno, p_pid and p_priority. The list containing __pthread_main_thread links together all user threads in the process. After a series of pthread_create() calls, the __pthread_handles array looks like this:

[Figure 2: __pthread_handles array structure]

A newly created thread first occupies one entry in the __pthread_handles array and is then linked, through the list pointers in its data structure, into the list headed by __pthread_main_thread. The use of this list is mentioned again in the discussion of thread creation and release.

LinuxThreads follows the POSIX1003.1c standard, which places limits on the implementation of a thread library, such as the maximum number of threads per process and the size of the thread-private data area. The LinuxThreads implementation basically follows these limits but changes some of them, always in the direction of relaxing or extending them to make programming more convenient.
These limit macros are mainly gathered in local_lim.h under sysdeps/ (the exact file location differs between platforms) and include the following:

- Number of private data keys per process: POSIX defines _POSIX_THREAD_KEYS_MAX as 128; LinuxThreads uses PTHREAD_KEYS_MAX, 1024.
- Number of destructor iterations allowed when private data is released: LinuxThreads agrees with POSIX and defines PTHREAD_DESTRUCTOR_ITERATIONS as 4.
- Number of threads per process: POSIX defines 64; LinuxThreads raises it to 1024 (PTHREAD_THREADS_MAX).
- Minimum stack size for a running thread: POSIX does not specify it; LinuxThreads uses PTHREAD_STACK_MIN, 16384 (bytes).

2. Management thread

One benefit of the "one-to-one" model is that thread scheduling is done by the kernel, while the remaining work is done in the user-space thread library. LinuxThreads sets up a dedicated management thread for each process, which handles thread-related management. When a process calls pthread_create() for the first time, the management thread is created (with __clone()) and started.

Within a process, the management thread communicates with the other threads through a pair of "management pipe" descriptors (manager_pipe[2]). The pipe is created before the management thread. After the management thread has started successfully, the read end and the write end of the pipe are assigned to the two global variables __pthread_manager_reader and __pthread_manager_request. From then on, every user thread sends its requests to the management thread through __pthread_manager_request. The management thread itself does not use __pthread_manager_reader directly: the read end of the pipe (manager_pipe[0]) is passed to it as one of the arguments of __clone(). The main job of the management thread is to listen on the read end of the pipe and react to the requests it takes from it.

The process of creating the management thread is as follows (the global variable __pthread_manager_request is initially -1):

[Figure 3: process of creating the management thread]

After initialization, __pthread_manager_thread records the lightweight process number and the thread ID, which is allocated and managed outside the kernel. The value 2 * PTHREAD_THREADS_MAX + 1 does not conflict with any regular user thread ID. The management thread runs as a child thread of the thread that called pthread_create(), and the user thread requested by pthread_create() is created by the management thread calling clone(), so it is actually a child thread of the management thread. (The notion of "child thread" here should be understood as "child process".)

__pthread_manager() contains the main loop of the management thread. After a series of initialization steps it enters a while(1) loop. In the loop, the thread polls (__poll()) the read end of the management pipe with a timeout. Before processing a request, it checks whether its parent thread (that is, the main thread that created the manager) has exited, and if so it terminates the whole process.
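The request channel described so far (one side writes a request into a pipe, the manager polls the read end with a timeout) can be sketched with plain POSIX calls. This is an illustration only: struct request and its fields, the numeric value of REQ_CREATE, and the functions send_request and wait_for_request are assumptions made for the example, not the LinuxThreads definitions.

```c
#include <poll.h>
#include <unistd.h>

/* REQ_CREATE is named in the article; its numeric value here is made up. */
enum { REQ_CREATE = 1 };

/* Hypothetical request layout; the real LinuxThreads request is richer. */
struct request {
    int kind;  /* request type, e.g. REQ_CREATE */
    int arg;   /* illustrative payload */
};

/* User-thread side: write one request to the write end of the pipe. */
int send_request(int write_fd, const struct request *req)
{
    ssize_t n = write(write_fd, req, sizeof *req);
    return n == (ssize_t)sizeof *req ? 0 : -1;
}

/*
 * Manager side: wait up to timeout_ms for one request on the read end.
 * Returns 1 if a request was read into *out, 0 on timeout, -1 on error.
 * A real loop would also retry when poll() is interrupted by a signal.
 */
int wait_for_request(int read_fd, int timeout_ms, struct request *out)
{
    struct pollfd pfd = { .fd = read_fd, .events = POLLIN };
    int ready = poll(&pfd, 1, timeout_ms);
    ssize_t n;

    if (ready < 0)
        return -1;
    if (ready == 0)
        return 0;  /* timeout: the manager can do its periodic checks */
    n = read(read_fd, out, sizeof *out);
    return n == (ssize_t)sizeof *out ? 1 : -1;
}
```

The timeout return is what gives the manager a chance to run its periodic checks even when no thread has sent a request.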
If any exited child threads need to be cleaned up, it calls pthread_reap_children() to do so. Only then does it read the request from the pipe and perform the operation that corresponds to the request type. The handling of each request is clear enough from the source code and is not described here.

3. Thread stack

In LinuxThreads, the stacks of the management thread and of the user threads are separate. The management thread allocates THREAD_MANAGER_STACK_SIZE bytes on the process heap with malloc() and uses them as its own run-time stack.

How the stack of a user thread is allocated depends on the architecture and is controlled mainly by two macros. One is NEED_SEPARATE_REGISTER_STACK, which is used only on the IA64 platform. The other is FLOATING_STACK, which is used on a few platforms such as i386; in this mode the system decides where the user thread's stack is located and provides protection for it. In addition, the user can specify a user-defined stack through the thread attribute structure. For reasons of space, only the two stack organizations used on the i386 platform are analyzed here: the FLOATING_STACK mode and the user-defined mode.

In FLOATING_STACK mode, LinuxThreads uses mmap() to obtain 8 MB from the kernel (the default maximum stack size on i386; if a run-time limit (rlimit) is set, that limit is used instead) and uses mprotect() to make the first page inaccessible.
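The combination of mmap() and mprotect() described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the LinuxThreads allocation code: the function name alloc_guarded_stack is hypothetical, the page size is queried with sysconf() instead of being hard-coded, and rlimit handling is left out.

```c
#define _DEFAULT_SOURCE      /* for MAP_ANONYMOUS on strict compilers */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `size` bytes of anonymous memory for use as a thread stack and make
 * the lowest page inaccessible, so that a stack growing down past its
 * limit faults instead of silently overwriting other memory.
 * Returns the base (lowest) address of the mapping, or NULL on failure.
 */
void *alloc_guarded_stack(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    void *base;

    if (page <= 0 || size < 2 * (size_t)page)
        return NULL;  /* need room for the guard page plus usable stack */

    base = mmap(NULL, size, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* The first (lowest) page becomes the no-access guard area. */
    if (mprotect(base, (size_t)page, PROT_NONE) != 0) {
        munmap(base, size);
        return NULL;
    }
    return base;
}
```

A caller would use the high end of the region as the initial stack top, for example alloc_guarded_stack(8 * 1024 * 1024), and release it with munmap() when the thread is gone.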

The 8 MB region is laid out as follows:

[Figure 4: stack layout]

The protected page at the low address is used to detect stack overflow.

For a stack specified by the user, the pointer is first aligned, then the top of the thread stack is set and the bottom of the stack is calculated. No protection is applied; correctness is the user's responsibility.

Whichever organization is used, the thread descriptor is always located at the top of the stack, immediately adjacent to it.

4. Thread ID and process ID

Every LinuxThreads thread has both a thread ID and a process ID. The process ID is the process number maintained by the kernel, while the thread ID is allocated and maintained by LinuxThreads.

The thread ID of __pthread_initial_thread is PTHREAD_THREADS_MAX, that of __pthread_manager_thread is 2 * PTHREAD_THREADS_MAX + 1, and that of the first user thread is PTHREAD_THREADS_MAX + 2. The thread ID of the n-th user thread follows the formula:

    tid = n * PTHREAD_THREADS_MAX + n + 1

This allocation guarantees that no two threads in the process (including threads that have already exited) have the same thread ID. The thread ID type pthread_t is defined as unsigned long int, which also guarantees that thread IDs do not repeat within any reasonable running time.

Looking up the thread data structure from a thread ID is done in the pthread_handle() function. It simply takes the thread ID modulo PTHREAD_THREADS_MAX; the result is the index of the thread in __pthread_handles.

5. Thread creation

After pthread_create() has sent a REQ_CREATE request to the management thread, the management thread calls pthread_handle_create() to create the new thread. It allocates the stack, sets the thread attributes, and calls __clone() with pthread_start_thread() as the entry function to create and start the new thread. pthread_start_thread() reads its own process ID, stores it in the thread descriptor, and configures scheduling according to the scheduling policy recorded there.
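The thread ID arithmetic given under "Thread ID and process ID" is easy to check numerically. The sketch below only restates the formulas from the text; the function names and the macro DEMO_THREADS_MAX (set to the article's value of 1024 for PTHREAD_THREADS_MAX) are illustrative and not taken from the LinuxThreads source.

```c
/* The article's value of PTHREAD_THREADS_MAX, under a name local to this sketch. */
#define DEMO_THREADS_MAX 1024UL

/* Thread ID of __pthread_initial_thread. */
unsigned long initial_thread_id(void)
{
    return DEMO_THREADS_MAX;
}

/* Thread ID of __pthread_manager_thread. */
unsigned long manager_thread_id(void)
{
    return 2 * DEMO_THREADS_MAX + 1;
}

/* Thread ID of the n-th user thread (n starts at 1). */
unsigned long user_thread_id(unsigned long n)
{
    return n * DEMO_THREADS_MAX + n + 1;
}

/* Index into the handle array: the thread ID modulo the maximum. */
unsigned long handle_index(unsigned long tid)
{
    return tid % DEMO_THREADS_MAX;
}
```

With these definitions the initial thread (ID 1024) maps to index 0, the manager (ID 2049) to index 1, and the first user thread (ID 1026) to index 2, which matches the statement that the first two entries of __pthread_handles belong to the two system threads.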
Once everything is ready, it calls the real thread function, and after that function returns it calls pthread_exit() to clean up.

6. Shortcomings of LinuxThreads

Because of limitations in the Linux kernel and the difficulty of implementation, LinuxThreads is not fully POSIX compatible, as the README in its distribution explains.

1) Process ID problem

This is the most serious shortcoming, and it is caused by the "one-to-one" model of LinuxThreads.

The Linux kernel does not support threads in the true sense. LinuxThreads implements thread support with lightweight processes, which the kernel schedules exactly as it does normal processes. These lightweight processes have their own process IDs and the same abilities as normal processes in scheduling, signal handling, I/O and so on. Reading the kernel source shows that clone() in the Linux kernel does not implement support for the CLONE_PID parameter.

CLONE_PID is handled in the kernel's do_fork() like this:

    if (clone_flags & CLONE_PID) {
        if (current->pid)
            goto fork_out;
    }

This code shows that the current Linux kernel honors the CLONE_PID parameter only when the pid is 0. In fact, CLONE_PID is used only during SMP initialization.

According to POSIX, all threads of the same process should share one process ID and one parent process ID, which cannot be implemented in the current "one-to-one" model.

2) Signal handling problem

Asynchronous signals are dispatched by the kernel on a per-process basis. Every LinuxThreads thread is a process as far as the kernel is concerned, and no "thread group" is implemented, so some semantics do not comply with the POSIX standard. For example, delivering a signal to all threads of a process is not implemented, as the README explains.

If the kernel does not provide real-time signals, LinuxThreads uses SIGUSR1 and SIGUSR2 as its internal restart and cancel signals, so applications cannot use these two signals that were originally reserved for users. Linux kernel versions after 2.1.60 support extended real-time signals (from _SIGRTMIN to _SIGRTMAX), so the problem does not arise there.

The default action of some signals is hard to implement in the current architecture. With SIGSTOP and SIGCONT, for example, LinuxThreads can only suspend a single thread, not the whole process.

3) Total number of threads

LinuxThreads defines the maximum number of threads per process as 1024, but this value is in fact also limited by the total number of processes in the whole system, because each thread is actually a kernel process.

Kernel 2.4.x uses a new way of calculating the total number of processes, so that it is essentially limited only by the amount of physical memory. The formula is in the fork_init() function of kernel/fork.c:

    max_threads = mempages / (THREAD_SIZE / PAGE_SIZE) / 8

On i386, THREAD_SIZE = 2 * PAGE_SIZE, PAGE_SIZE = 2^12 (4 KB), and mempages = physical memory size / PAGE_SIZE. With 256 MB of memory, mempages = 256 * 2^20 / 2^12 = 256 * 2^8, so the maximum number of threads is 4096.

However, to ensure that the processes of a single user (other than root) cannot occupy more than half of physical memory, fork_init() goes on to set:

    init_task.rlim[RLIMIT_NPROC].rlim_cur = max_threads / 2;
    init_task.rlim[RLIMIT_NPROC].rlim_max = max_threads / 2;

These process counts are checked in do_fork(), so for LinuxThreads the total number of threads is limited by all three factors at once.

4) Management thread problem

The management thread easily becomes a bottleneck, which is a common problem of this structure. The management thread is also responsible for cleaning up user threads, so although it blocks most signals, once the management thread dies the user threads have to be cleaned up by hand. Moreover, user threads do not know the state of the management thread, so later requests such as thread creation will go unhandled.

5) Synchronization problem

Thread synchronization in LinuxThreads is built largely on signals. Synchronizing through the kernel's complicated signal handling mechanism has always been an efficiency problem.

6) Other POSIX compatibility problems

Many Linux system calls are, by their semantics, process-related, for example nice, setuid and setrlimit. In the current LinuxThreads, these calls affect only the calling thread.
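The limit calculation from "Total number of threads" can be reproduced with a few lines of arithmetic. The sketch below hard-codes the i386 values quoted in the text (PAGE_SIZE = 2^12, THREAD_SIZE = 2 * PAGE_SIZE); the macro and function names are illustrative and are not the kernel's own.

```c
/* i386 values quoted in the text, under names local to this sketch. */
#define I386_PAGE_SIZE   4096UL                  /* 2^12 bytes */
#define I386_THREAD_SIZE (2 * I386_PAGE_SIZE)

/* max_threads = mempages / (THREAD_SIZE / PAGE_SIZE) / 8 */
unsigned long max_threads_for(unsigned long mem_bytes)
{
    unsigned long mempages = mem_bytes / I386_PAGE_SIZE;
    return mempages / (I386_THREAD_SIZE / I386_PAGE_SIZE) / 8;
}

/* Default per-user process limit: half of max_threads. */
unsigned long default_nproc_limit(unsigned long mem_bytes)
{
    return max_threads_for(mem_bytes) / 2;
}
```

For 256 MB of memory this gives 65536 pages, a max_threads of 4096 and a default per-user limit of 2048, in line with the figures in the text.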

When reposting, please cite the original address: https://www.9cbs.com/read-80158.html
