Multi-threaded programming under Linux
Author: Yao Jifeng  Date: 2002-01-14  Source: http://www.china-pub.com
1 Introduction

Thread technology was proposed as early as the 1960s, but it was not really applied in operating systems until the mid-1980s, with Solaris as the leader in this area. Traditional UNIX also supports the concept of a thread, but only one thread is allowed per process, so multithreading there means multiple processes. Today multithreading is supported by many operating systems, including Windows/NT and, of course, Linux.

Why introduce threads when the concept of the process already exists? What are the benefits of using multiple threads? What kind of system should choose multithreading? We must answer these questions first.

One reason for using multithreading is that, compared with processes, it is a very "thrifty" way of multitasking. Under Linux, starting a new process requires allocating it a separate address space and building numerous data tables to maintain its code, stack, and data segments, which makes it an "expensive" way of multitasking. Threads running in the same process share the same address space and most of their data, so starting a thread takes far less space than starting a process, and switching between threads takes far less time than switching between processes. According to some statistics, the overhead of a process is roughly 30 times that of a thread, although this figure may differ considerably on a particular system.

The second reason for using multithreading is the convenient communication between threads. Different processes have independent data spaces, so data can only be passed between them through inter-process communication, which is both time-consuming and inconvenient.
Threads are different: because threads in the same process share the data space, data produced by one thread can be used directly by the others, which is both fast and convenient. Of course, sharing data also brings problems of its own. Some variables must not be modified by two threads at the same time, and data declared static inside a subroutine is even more likely to deal a catastrophic blow to a multithreaded program. These are the points that need the most attention when writing multithreaded programs.

Besides the advantages above, and leaving the comparison with processes aside, multithreading as a multitasking, concurrent way of working also has the following advantages:

1) Better application responsiveness. This matters most for graphical interfaces. When an operation takes a long time, the whole program waits for it and does not respond to keyboard, mouse, or menu input. Placing the time-consuming operation in a new thread avoids this embarrassing situation.
2) More effective use of multi-CPU systems. The operating system ensures that different threads run on different CPUs when the number of threads is no greater than the number of CPUs.
3) Better program structure. A long, complicated process can be divided into several threads that form independent or semi-independent parts, which makes the program easier to understand and modify.

Let's first try writing a simple multithreaded program.

2 Simple multi-threaded programming

Multithreading on Linux follows the POSIX thread interface, known as pthread. To write a multithreaded program under Linux you need the header file pthread.h, and you need to link against the library libpthread.a. Incidentally, pthread on Linux is implemented through the system call clone().
clone() is a system call unique to Linux, and it is used in a way similar to fork(); interested readers can consult the relevant documentation. Below is the simplest multithreaded program, example1.c.

    /* example1.c */
    #include <stdio.h>
    #include <stdlib.h>
    #include <pthread.h>

    /* thread start routine: it must take and return void * */
    void *thread(void *arg)
    {
        int i;
        for (i = 0; i < 3; i++)
            printf("This is a pthread.\n");
        return NULL;
    }

    int main(void)
    {
        pthread_t id;
        int i, ret;

        ret = pthread_create(&id, NULL, thread, NULL);
        if (ret != 0) {
            printf("Create pthread error!\n");
            exit(1);
        }
        for (i = 0; i < 3; i++)
            printf("This is the main process.\n");
        pthread_join(id, NULL);
        return 0;
    }

We compile this program:

    gcc example1.c -lpthread -o example1

Running example1, we may get a result such as:

    This is the main process.
    This is a pthread.
    This is the main process.
    This is the main process.
    This is a pthread.
    This is a pthread.

Running it again, we may get a result such as:

    This is a pthread.
    This is the main process.
    This is a pthread.
    This is the main process.
    This is a pthread.
    This is the main process.

The two results differ because the two threads compete for CPU resources.

In the example above we used two functions, pthread_create and pthread_join, and declared a variable of type pthread_t. pthread_t is defined in the header file /usr/include/bits/pthreadtypes.h:

    typedef unsigned long int pthread_t;

It is the identifier of a thread. The function pthread_create creates a thread. Its prototype is:

    extern int pthread_create __P ((pthread_t *__thread, __const pthread_attr_t *__attr, void *(*__start_routine) (void *), void *__arg));

The first parameter is a pointer to the thread identifier, the second sets the thread attributes, the third is the start address of the function the thread runs, and the last is the argument passed to that function. Here our function thread needs no argument, so the last parameter is set to a null pointer.
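When the thread function does need an argument, the last parameter of pthread_create carries it. The following is a minimal sketch of that case, not part of the original example: the structure greeting and the functions print_greeting and run_greeting_thread are hypothetical names introduced only for illustration.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical argument block, introduced only for this sketch. */
struct greeting {
    const char *text;   /* line to print */
    int repeat;         /* how many times to print it */
    int printed;        /* filled in by the thread */
};

/* Start routine with the signature pthread_create expects. */
static void *print_greeting(void *arg)
{
    struct greeting *g = arg;
    int i;

    for (i = 0; i < g->repeat; i++) {
        printf("%s\n", g->text);
        g->printed++;
    }
    return NULL;
}

/* Creates one thread, hands it &g through the fourth parameter and
   waits for it. Returns the number of lines printed, or -1 on error. */
int run_greeting_thread(const char *text, int repeat)
{
    struct greeting g = { text, repeat, 0 };
    pthread_t id;

    if (pthread_create(&id, NULL, print_greeting, &g) != 0)
        return -1;
    if (pthread_join(id, NULL) != 0)
        return -1;
    return g.printed;
}
```

Because g lives on the caller's stack, the sketch joins the thread before returning; a detached thread would need an argument with a longer lifetime.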
In example1.c we also set the second parameter to a null pointer, which creates a thread with default attributes. Setting and modifying thread attributes is covered in the next section. When the thread is created successfully the function returns 0; a non-zero return value means creation failed. The common error codes are EAGAIN and EINVAL. The former means the system refused to create a new thread, for example because there are too many threads; the latter means the attribute value given by the second parameter is invalid. After creation, the new thread runs the function given by the third parameter with the argument given by the fourth, and the original thread continues with the next line of code.

The function pthread_join is used to wait for a thread to end.
Its prototype is:

    extern int pthread_join __P ((pthread_t __th, void **__thread_return));

The first parameter is the identifier of the thread to wait for, and the second is a user-defined pointer that can receive the return value of that thread. pthread_join blocks: the calling thread waits until the awaited thread ends, and when the function returns, the resources of the awaited thread are reclaimed. A thread can end in two ways. One is as in the example above: when the thread's function finishes, the thread ends. The other is by calling pthread_exit. Its prototype is:

    extern void pthread_exit __P ((void *__retval)) __attribute__ ((__noreturn__));

Its only parameter is the thread's return code; as long as the second parameter of pthread_join is not NULL, this value is passed to thread_return. Finally, note that a thread cannot be waited on by several threads: the first thread to receive the signal returns successfully, and the other threads calling pthread_join return the error code ESRCH.

In this section we wrote the simplest thread and learned the three most commonly used functions: pthread_create, pthread_join, and pthread_exit. Next, let's look at some common thread attributes and how to set them.

3 Modifying thread attributes

In the previous section's example we created a thread with pthread_create, using the default attributes by setting the function's second parameter to NULL. For most programs the default attributes are indeed enough, but it is still worth understanding the attributes a thread has.

The attribute structure is pthread_attr_t, which is likewise defined in the header file /usr/include/pthread.h; readers who like to get to the bottom of things can look it up themselves.
Attribute values cannot be set directly; they must be manipulated through the relevant functions. The initialization function is pthread_attr_init, which must be called before pthread_create. The attribute object mainly covers whether the thread is bound, whether it is detached, its stack address, its stack size, and its priority. The defaults are unbound, non-detached (joinable), a 1 MB stack, and the same priority as the parent process.

Thread binding involves another concept: the lightweight process (LWP). A lightweight process can be understood as a kernel thread that sits between the user layer and the system layer. The system allocates resources to threads and controls threads through lightweight processes, and one lightweight process can control one or more threads. By default, how many lightweight processes are started and which lightweight process controls which threads is decided by the system; this situation is called unbound. In the bound case, as the name suggests, a thread is "tied" to a fixed lightweight process. A bound thread responds faster, because CPU time slices are scheduled per lightweight process and a bound thread is guaranteed to have a lightweight process available whenever it needs one. By setting the priority and scheduling class of the bound lightweight process, a bound thread can meet requirements such as real-time response.
The function that sets the binding state of a thread is pthread_attr_setscope. It takes two parameters: the first is a pointer to the attribute structure, and the second is the binding type, which has two values: PTHREAD_SCOPE_SYSTEM (bound) and PTHREAD_SCOPE_PROCESS (unbound). The following code creates a bound thread.

    #include <pthread.h>

    pthread_attr_t attr;
    pthread_t tid;

    /* initialize the attribute object; every attribute gets its default value */
    pthread_attr_init(&attr);
    pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    /* my_function must have the form void *my_function(void *) */
    pthread_create(&tid, &attr, my_function, NULL);

The detach state of a thread determines how the thread terminates itself. In the example above we used the default attribute, the non-detached state, in which the original thread waits for the created thread to end. Only when pthread_join() returns is the created thread truly terminated and the system resources it holds released. A detached thread is different: no other thread waits for it, and when it finishes running it terminates and releases its system resources immediately. Programmers should choose the detach state that suits their needs. The function that sets the detach state is pthread_attr_setdetachstate(pthread_attr_t *attr, int detachstate). The second parameter can be PTHREAD_CREATE_DETACHED (detached thread) or PTHREAD_CREATE_JOINABLE (non-detached thread). One thing to note here: if a thread is set as detached and it runs very fast, it may terminate before pthread_create returns. After terminating it may hand its thread number and system resources over to another thread, so that the thread calling pthread_create gets the wrong thread number. To avoid this, some synchronization measure can be taken.
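Setting the detach state follows the same pattern as setting the scope. The sketch below is only an illustration under assumptions: detached_worker and start_detached_thread are hypothetical names, and the worker does no real work.

```c
#include <pthread.h>

/* Hypothetical start routine for this sketch; it does no real work. */
static void *detached_worker(void *arg)
{
    (void)arg;
    return NULL;
}

/* Creates one detached thread running detached_worker.
   Returns 0 on success or the error number of the failing call. */
int start_detached_thread(void)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, detached_worker, NULL);

    /* the attribute object is no longer needed once the thread exists */
    pthread_attr_destroy(&attr);
    return rc;
}
```

The caller must not pass tid to pthread_join afterwards, since nobody waits for a detached thread.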
For a detached thread that may finish too early, one of the simplest synchronization methods is to call pthread_cond_timedwait in the created thread so that it waits for a while, leaving enough time for pthread_create to return. Setting a wait time like this is a common technique in multithreaded programming. Be careful not to use functions such as wait(): they put the whole process to sleep and do not solve the thread synchronization problem.

Another attribute that may be commonly used is the thread's priority, which is stored in the structure sched_param. It is read with pthread_attr_getschedparam and stored with pthread_attr_setschedparam. Here is a simple example.
    #include <pthread.h>
    #include <sched.h>

    pthread_attr_t attr;
    pthread_t tid;
    struct sched_param param;
    int newprio = 20;

    pthread_attr_init(&attr);
    /* read the current priority, change it, and store it back */
    pthread_attr_getschedparam(&attr, &param);
    param.sched_priority = newprio;
    pthread_attr_setschedparam(&attr, &param);
    pthread_create(&tid, &attr, myfunction, myarg);

4 Thread data processing

Compared with processes, one of the greatest advantages of threads is data sharing: all threads share the data segment inherited from the process, so data can be obtained and modified conveniently. But this also brings many problems to multithreaded programming. We must beware of several different threads accessing the same variable. Many functions are not reentrant, that is, several copies of the function cannot run at the same time (unless they use different data segments). Static variables declared in functions often cause problems, and so do function return values: if a function returns the address of statically declared storage inside it, then after one thread calls the function and obtains the address, another thread may call the function and modify the data the address points to. Variables shared within a process should be declared with the keyword volatile, which keeps the compiler from changing how they are used when optimizing (for example with gcc's -Ox options). That by itself does not protect them: we must also use semaphores, mutexes, and similar methods to make sure the variables are used correctly. Below we introduce the relevant knowledge step by step.

4.1 Thread-specific data

In a single-threaded program there are two basic kinds of data: global variables and local variables. In a multithreaded program there is a third kind: thread-specific data (TSD). It is much like a global variable: inside a thread every function can use it like a global variable, but it is invisible to the other threads. The need for such data is obvious.
Take the familiar variable errno, which returns standard error information. It obviously cannot be a local variable, since almost every function should be able to access it; but it cannot be a global variable either, otherwise thread A might well output the error information of thread B. To implement variables like this we must use thread-specific data. We create a key for each item of thread data, and the data is associated with that key. Every thread uses the key to refer to the thread data, but in different threads the key stands for different data, while within one thread it always stands for the same data.

Four functions deal with thread data: create a key, assign thread data to a key, read thread data from a key, and delete a key.

The prototype of the function that creates a key is:

    extern int pthread_key_create __P ((pthread_key_t *__key, void (*__destr_function) (void *)));

The first parameter is a pointer to a key, and the second names a destructor function. If this parameter is not NULL, then when each thread ends the system calls this function to release the memory bound to the key. This function is often used together with pthread_once(pthread_once_t *once_control, void (*init_routine) (void)), so that the key is created only once.
The function pthread_once names an initialization function: the first call to pthread_once executes that function, and later calls are ignored.

In the following example we create a key and associate it with some data. We define a function createWindow that creates a graphics window (of type Fl_Window *, a data type from the GUI toolkit FLTK). Because every thread calls this function, we use thread data.

    /* declare a key */
    pthread_key_t myWinKey;

    void createMyKey(void);
    void freeWinKey(void *win);

    /* function createWindow */
    void createWindow(void)
    {
        Fl_Window *win;
        static pthread_once_t once = PTHREAD_ONCE_INIT;

        /* call createMyKey to create the key; this happens only once */
        pthread_once(&once, createMyKey);
        /* win points to a newly created window */
        win = new Fl_Window(0, 0, 100, 100, "MyWindow");
        /* make any settings for this window, such as size, position, name */
        setWindow(win);
        /* bind the window pointer to the key myWinKey */
        pthread_setspecific(myWinKey, win);
    }

    /* function createMyKey: create the key and specify its destructor */
    void createMyKey(void)
    {
        pthread_key_create(&myWinKey, freeWinKey);
    }

    /* function freeWinKey: release the space; destructors receive a void * */
    void freeWinKey(void *win)
    {
        delete static_cast<Fl_Window *>(win);
    }

In this way, calling createWindow in different threads yields a window variable that is visible only inside each thread; the variable is retrieved with the function pthread_getspecific. In the example above we used pthread_setspecific to bind thread data to a key. The prototypes of the two functions are:

    extern int pthread_setspecific __P ((pthread_key_t __key, __const void *__pointer));
    extern void *pthread_getspecific __P ((pthread_key_t __key));

The meaning and use of their parameters are evident. Note that when pthread_setspecific assigns new thread data to a key, you must release the old thread data yourself to reclaim the space.
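The four steps (create the key once, set, get, let the destructor clean up) can also be tried without FLTK. The following plain C sketch is an illustration under assumptions: it stores a heap-allocated int per thread, and every name in it (slot_key, set_private_value, get_private_value, tsd_worker, run_tsd_demo) is hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t slot_key;
static pthread_once_t slot_once = PTHREAD_ONCE_INIT;

/* Runs once per process: creates the key with free() as destructor. */
static void make_slot_key(void)
{
    pthread_key_create(&slot_key, free);
}

/* Stores value in the calling thread's private slot. Returns 0 on success. */
static int set_private_value(int value)
{
    int *slot;

    pthread_once(&slot_once, make_slot_key);
    slot = pthread_getspecific(slot_key);
    if (slot == NULL) {                     /* first use in this thread */
        slot = malloc(sizeof *slot);
        if (slot == NULL)
            return -1;
        if (pthread_setspecific(slot_key, slot) != 0) {
            free(slot);
            return -1;
        }
    }
    *slot = value;
    return 0;
}

/* Returns the calling thread's value, or -1 if it never stored one. */
static int get_private_value(void)
{
    int *slot;

    pthread_once(&slot_once, make_slot_key);
    slot = pthread_getspecific(slot_key);
    return slot != NULL ? *slot : -1;
}

/* Each worker stores its own number, then reads it back through the key. */
static void *tsd_worker(void *arg)
{
    int *io = arg;

    if (set_private_value(*io) == 0)
        *io = get_private_value() * 10;
    return NULL;
}

/* Runs two workers with inputs 1 and 2; their results land in out[].
   Returns what the calling thread sees through the same key. */
int run_tsd_demo(int out[2])
{
    pthread_t t[2];
    int i;

    out[0] = 1;
    out[1] = 2;
    for (i = 0; i < 2; i++)
        if (pthread_create(&t[i], NULL, tsd_worker, &out[i]) != 0)
            return -2;
    for (i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return get_private_value();
}
```

Both workers use the same key, yet each reads back only its own number, and the calling thread, which never stored anything, sees no value at all.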
The function pthread_key_delete deletes a key. The memory occupied by the key is released, but note that only the key's own memory is released: the memory occupied by the thread data associated with the key is not, and the destructor function given to pthread_key_create is not triggered. Thread data must therefore be released before the key is deleted.

4.2 Mutex locks

A mutex ensures that only one thread executes a given piece of code at a time. The need is obvious: imagine every thread writing data to the same file in turn; the final result would certainly be disastrous.

Let's look at the following code. It is a read/write program that shares one buffer, and we assume the buffer can hold only one item. That is, the buffer has only two states: holding an item or empty.
    #include <pthread.h>
    #include <time.h>

    void *reader_function(void *arg);
    void writer_function(void);

    char buffer;
    int buffer_has_item = 0;
    pthread_mutex_t mutex;
    struct timespec delay;

    int main(void)
    {
        pthread_t reader;

        /* define the delay time */
        delay.tv_sec = 2;
        delay.tv_nsec = 0;
        /* initialize a mutex object with default attributes */
        pthread_mutex_init(&mutex, NULL);
        pthread_create(&reader, NULL, reader_function, NULL);
        writer_function();
        return 0;
    }

    void writer_function(void)
    {
        while (1) {
            /* lock the mutex */
            pthread_mutex_lock(&mutex);
            if (buffer_has_item == 0) {
                buffer = make_new_item();
                buffer_has_item = 1;
            }
            /* unlock the mutex */
            pthread_mutex_unlock(&mutex);
            /* pthread_delay_np is not provided by every pthread library;
               nanosleep(&delay, NULL) is a portable substitute */
            pthread_delay_np(&delay);
        }
    }

    void *reader_function(void *arg)
    {
        while (1) {
            pthread_mutex_lock(&mutex);
            if (buffer_has_item == 1) {
                consume_item(buffer);
                buffer_has_item = 0;
            }
            pthread_mutex_unlock(&mutex);
            pthread_delay_np(&delay);
        }
    }

Here we declared the mutex variable mutex. The structure pthread_mutex_t is an opaque data type that contains a system-allocated attribute object. The function pthread_mutex_init creates a mutex; the NULL argument means default attributes are used. To create a mutex with specific attributes, you must call the function pthread_mutexattr_init. The functions pthread_mutexattr_setpshared and pthread_mutexattr_settype set mutex attributes. The first sets the attribute pshared, which has two values, PTHREAD_PROCESS_PRIVATE and PTHREAD_PROCESS_SHARED. PTHREAD_PROCESS_SHARED is used to synchronize threads in different processes, and PTHREAD_PROCESS_PRIVATE to synchronize the threads of one process. In the example above we used the default, PTHREAD_PROCESS_PRIVATE. The second function sets the mutex type; the available types are PTHREAD_MUTEX_NORMAL, PTHREAD_MUTEX_ERRORCHECK, PTHREAD_MUTEX_RECURSIVE, and PTHREAD_MUTEX_DEFAULT. They define different locking and unlocking behaviour; in general the last one, the default, is chosen.
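How the mutex types differ is easiest to see when one thread locks the same mutex twice. The sketch below is an illustration only; the function relock_result is a hypothetical helper, and it should not be called with PTHREAD_MUTEX_NORMAL, for which a second lock by the owner would block forever.

```c
#define _XOPEN_SOURCE 700   /* expose the mutex type constants under strict modes */
#include <errno.h>          /* EDEADLK, for callers that inspect the result */
#include <pthread.h>

/* Locks a mutex of the given type twice from the same thread and
   returns the result of the second lock attempt, or -1 on setup failure. */
int relock_result(int type)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int rc;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    if (pthread_mutexattr_settype(&attr, type) != 0 ||
        pthread_mutex_init(&m, &attr) != 0) {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);   /* the mutex keeps its own settings */

    pthread_mutex_lock(&m);
    rc = pthread_mutex_lock(&m);        /* second lock by the owner */
    if (rc == 0)
        pthread_mutex_unlock(&m);       /* recursive type: undo the second lock */
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}
```

With PTHREAD_MUTEX_ERRORCHECK the second lock is refused with EDEADLK, while with PTHREAD_MUTEX_RECURSIVE it succeeds and must be matched by a second unlock.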
In the reader/writer program, the call to pthread_mutex_lock locks the mutex. The code from there until the call to pthread_mutex_unlock is locked, that is, it can be executed by only one thread at a time. When a thread reaches pthread_mutex_lock and the lock is held by another thread, the thread blocks: it waits until the other thread releases the mutex. In that program we used the function pthread_delay_np to let each thread sleep for a while, precisely to keep one thread from holding the lock all the time.
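The effect of the lock/unlock pair can be checked with a small counter program. This is a minimal sketch under assumptions, not the author's code: NUM_THREADS, INCREMENTS, add_many, and run_counter_demo are hypothetical names, and the mutex is set up with the static initializer PTHREAD_MUTEX_INITIALIZER instead of pthread_mutex_init.

```c
#include <pthread.h>

#define NUM_THREADS 4
#define INCREMENTS  10000

static long counter;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread adds INCREMENTS to the shared counter, one step at a time. */
static void *add_many(void *arg)
{
    int i;

    (void)arg;
    for (i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;                      /* only one thread is here at a time */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts the threads, waits for them and returns the final counter value. */
long run_counter_demo(void)
{
    pthread_t t[NUM_THREADS];
    int started = 0;
    int i;

    counter = 0;
    for (i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&t[i], NULL, add_many, NULL) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(t[i], NULL);
    return counter;
}
```

If all threads start, the result is NUM_THREADS * INCREMENTS on every run; removing the lock and unlock calls makes the increments race, and the total may then come out smaller.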
The reader/writer example is very simple and needs no further explanation. What does need pointing out is that deadlock can occur when using mutexes: two threads try to hold two resources at the same time and lock the corresponding mutexes in different orders. For example, both threads need to lock mutex 1 and mutex 2; thread A locks mutex 1 first while thread B locks mutex 2 first, and a deadlock results. In this situation we can use the function pthread_mutex_trylock, the non-blocking version of pthread_mutex_lock. When it finds that the lock cannot be acquired, it returns a corresponding error instead of blocking, and the programmer can deal with the potential deadlock accordingly. Different mutex types also handle deadlock differently, but above all the programmer must pay attention to this while programming.

4.3 Condition variables

The previous section described how to use mutexes to share data and communicate between threads. An obvious shortcoming of a mutex is that it has only two states: locked and unlocked. Condition variables make up for this by allowing a thread to block and wait for another thread to send a signal, and they are often used together with mutexes. In use, a condition variable blocks a thread: when the condition is not met, the thread typically unlocks the corresponding mutex and waits for the condition to change. Once another thread changes the condition, it notifies the corresponding condition variable to wake one or more threads blocked on it. These threads re-lock the mutex and re-test whether the condition is met. In general, condition variables are used for synchronization between threads.

The structure of a condition variable is pthread_cond_t, and the function pthread_cond_init() initializes one.
Its prototype is:

    extern int pthread_cond_init __P ((pthread_cond_t *__cond, __const pthread_condattr_t *__cond_attr));

Here cond is a pointer to a pthread_cond_t structure and cond_attr is a pointer to a pthread_condattr_t structure. pthread_condattr_t is the attribute structure of a condition variable; as with mutexes, it can be used to set whether the condition variable is usable within one process or across processes. The default is PTHREAD_PROCESS_PRIVATE, meaning the condition variable is used by the threads of one process. Note that a condition variable can be re-initialized or released only when it is not in use. The function that releases a condition variable is pthread_cond_destroy(pthread_cond_t *cond).

The function pthread_cond_wait() blocks a thread on a condition variable. Its prototype is:

    extern int pthread_cond_wait __P ((pthread_cond_t *__cond, pthread_mutex_t *__mutex));

The thread unlocks the mutex pointed to by mutex and blocks on the condition variable cond. It can be woken by the functions pthread_cond_signal and pthread_cond_broadcast. Note, however, that a condition variable only blocks and wakes threads; the actual test condition must be supplied by the user, for example whether a variable is 0, as the example below shows. After being woken, the thread re-checks whether the condition is satisfied; if it is not, the thread should generally block here again and wait for the next wake-up. This is usually implemented with a while statement.
Another function that blocks a thread is pthread_cond_timedwait(). Its prototype is:

    extern int pthread_cond_timedwait __P ((pthread_cond_t *__cond, pthread_mutex_t *__mutex, __const struct timespec *__abstime));

It takes one more parameter than pthread_cond_wait(), a time: once the absolute time abstime has passed, the block is released even if the condition variable has not been signalled.

The prototype of the function pthread_cond_signal() is:

    extern int pthread_cond_signal __P ((pthread_cond_t *__cond));

It releases one thread blocked on the condition variable cond. When several threads are blocked on the condition variable, which one is woken is decided by the thread scheduling policy. Note that this function must be called under the protection of the mutex that guards the condition variable; otherwise the signal may be sent between the test of the condition and the call to pthread_cond_wait, resulting in an endless wait. Below is a simple example that uses pthread_cond_wait() and pthread_cond_signal().

    pthread_mutex_t count_lock;
    pthread_cond_t count_nonzero;
    unsigned count;

    void decrement_count(void)
    {
        pthread_mutex_lock(&count_lock);
        /* re-test the condition after every wake-up */
        while (count == 0)
            pthread_cond_wait(&count_nonzero, &count_lock);
        count = count - 1;
        pthread_mutex_unlock(&count_lock);
    }

    void increment_count(void)
    {
        pthread_mutex_lock(&count_lock);
        if (count == 0)
            pthread_cond_signal(&count_nonzero);
        count = count + 1;
        pthread_mutex_unlock(&count_lock);
    }

When count is 0, the decrement function blocks in pthread_cond_wait and releases the mutex count_lock. When increment_count is then called, pthread_cond_signal() signals the condition variable and tells decrement_count() to stop blocking. Readers can try running these two functions in two separate threads and see what happens.
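One way to carry out that experiment is sketched below. It repeats the two functions and adds a hypothetical driver: decrement_many, increment_many, and run_count_demo are names introduced here, and the static initializers PTHREAD_MUTEX_INITIALIZER and PTHREAD_COND_INITIALIZER stand in for explicit init calls.

```c
#include <pthread.h>

static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t count_nonzero = PTHREAD_COND_INITIALIZER;
static unsigned count;

static void decrement_count(void)
{
    pthread_mutex_lock(&count_lock);
    while (count == 0)                  /* re-test after every wake-up */
        pthread_cond_wait(&count_nonzero, &count_lock);
    count = count - 1;
    pthread_mutex_unlock(&count_lock);
}

static void increment_count(void)
{
    pthread_mutex_lock(&count_lock);
    if (count == 0)
        pthread_cond_signal(&count_nonzero);
    count = count + 1;
    pthread_mutex_unlock(&count_lock);
}

/* Thread bodies: call the function above *arg times. */
static void *decrement_many(void *arg)
{
    int n = *(int *)arg;

    while (n-- > 0)
        decrement_count();
    return NULL;
}

static void *increment_many(void *arg)
{
    int n = *(int *)arg;

    while (n-- > 0)
        increment_count();
    return NULL;
}

/* Runs one decrementing and one incrementing thread, each for `rounds`
   calls, and returns the final count, or -1 if no thread could start. */
int run_count_demo(int rounds)
{
    pthread_t dec, inc;

    count = 0;
    if (pthread_create(&dec, NULL, decrement_many, &rounds) != 0)
        return -1;
    if (pthread_create(&inc, NULL, increment_many, &rounds) != 0)
        increment_many(&rounds);        /* fall back so the waiter can finish */
    else
        pthread_join(inc, NULL);
    pthread_join(dec, NULL);
    return (int)count;
}
```

Every decrement waits until a matching increment has happened, so both threads finish and the count ends where it started. The sketch uses a single waiting thread; with several waiters, signalling only when count is 0 would not be enough to wake them all.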
The function pthread_cond_broadcast(pthread_cond_t *cond) wakes all threads blocked on the condition variable cond. Once woken, these threads compete again for the corresponding mutex, so this function must be used with care.

4.4 Semaphores

A semaphore is essentially a non-negative integer counter used to control access to a shared resource. When the shared resource increases, the function sem_post() is called to increment the semaphore. The shared resource can be used only when the semaphore value is greater than 0; a thread claims it by calling the function sem_wait(), which decrements the semaphore.
The function sem_trywait() plays the same role as pthread_mutex_trylock(): it is the non-blocking version of sem_wait(). Below we introduce the semaphore functions one by one; they are all declared in the header file /usr/include/semaphore.h.

The data type of a semaphore is the structure sem_t, which is essentially a long integer. The function sem_init() initializes a semaphore. Its prototype is:

    extern int sem_init __P ((sem_t *__sem, int __pshared, unsigned int __value));

sem is a pointer to the semaphore structure. If pshared is non-zero the semaphore is shared between processes; otherwise it is shared only among the threads of the current process. value gives the initial value of the semaphore.

The function sem_post(sem_t *sem) increments the semaphore. If threads are blocked on the semaphore, calling it unblocks one of them; which one is again decided by the thread scheduling policy.

The function sem_wait(sem_t *sem) blocks the current thread until the value of the semaphore sem is greater than 0. Once unblocked it decrements sem by one, indicating that the shared resource has been reduced by use. The function sem_trywait(sem_t *sem) is the non-blocking version of sem_wait(): it decrements sem by one if its value is greater than 0 and otherwise returns an error immediately.

The function sem_destroy(sem_t *sem) releases the semaphore sem.

Let's look at an example that uses a semaphore. In it there are four threads in total: two read data from files into a shared buffer, and the other two read data from the buffer and process it differently (addition and multiplication).
    /* File sem.c */
    #include <stdio.h>
    #include <pthread.h>
    #include <semaphore.h>
    #define MAXSTACK 100

    int stack[MAXSTACK][2];
    int size = 0;
    sem_t sem;

    /* read data from file 1.dat; each read increments the semaphore */
    void *ReadData1(void *arg)
    {
        FILE *fp = fopen("1.dat", "r");
        while (!feof(fp)) {
            fscanf(fp, "%d %d", &stack[size][0], &stack[size][1]);
            sem_post(&sem);
            ++size;
        }
        fclose(fp);
        return NULL;
    }

    /* read data from file 2.dat */
    void *ReadData2(void *arg)
    {
        FILE *fp = fopen("2.dat", "r");
        while (!feof(fp)) {
            fscanf(fp, "%d %d", &stack[size][0], &stack[size][1]);
            sem_post(&sem);
            ++size;
        }
        fclose(fp);
        return NULL;
    }

    /* block until the buffer has data; process it, release the slot, wait again */
    void *HandleData1(void *arg)
    {
        while (1) {
            sem_wait(&sem);
            printf("Plus: %d + %d = %d\n", stack[size][0], stack[size][1],
                   stack[size][0] + stack[size][1]);
            --size;
        }
    }

    void *HandleData2(void *arg)
    {
        while (1) {
            sem_wait(&sem);
            printf("Multiply: %d * %d = %d\n", stack[size][0], stack[size][1],
                   stack[size][0] * stack[size][1]);
            --size;
        }
    }

    int main(void)
    {
        pthread_t t1, t2, t3, t4;

        sem_init(&sem, 0, 0);
        pthread_create(&t1, NULL, HandleData1, NULL);
        pthread_create(&t2, NULL, HandleData2, NULL);
        pthread_create(&t3, NULL, ReadData1, NULL);
        pthread_create(&t4, NULL, ReadData2, NULL);
        /* keep the program from exiting too early; wait here forever */
        pthread_join(t1, NULL);
        return 0;
    }

Under Linux, we build the executable sem with the command:

    gcc -lpthread sem.c -o sem

We prepare the data files 1.dat and 2.dat in advance, with the contents 1 2 3 4 5 6 7 8 9 10 and -1 -2 -3 -4 -5 -6 -7 -8 -9 -10 respectively. Running sem, we get the following result:

    Multiply: -1 * -2 = 2
    Plus: -1 + -2 = -3
    Multiply: 9 * 10 = 90
    Plus: -9 + -10 = -19
    Multiply: -7 * -8 = 56
    Plus: -5 + -6 = -11
    Multiply: -3 * -4 = 12
    Plus: 9 + 10 = 19
    Plus: 7 + 8 = 15
    Plus: 5 + 6 = 11

From this we can see the competition between the threads.
The values do not appear in the order we originally wrote them, because the value of size is modified arbitrarily by every thread. This is exactly the kind of problem that multithreaded programming must pay attention to.
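One way to remove that race is to guard every access to size and the buffer with a mutex, while the semaphore keeps counting the available pairs. The following is a sketch under stated assumptions rather than a fix of sem.c itself: the producers generate their pairs in memory instead of reading 1.dat and 2.dat, both consumers add instead of adding and multiplying, every thread handles a fixed number of pairs so that the program terminates, and the names produce, consume, run_stack_demo, and PAIRS_PER_PRODUCER are hypothetical.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define MAXSTACK 100
#define PAIRS_PER_PRODUCER 5

static int stack[MAXSTACK][2];
static int size;                 /* number of pairs currently on the stack */
static long total;               /* sum of every a + b the consumers computed */
static sem_t items;              /* counts pairs available to consumers */
static pthread_mutex_t stack_lock = PTHREAD_MUTEX_INITIALIZER;

/* Pushes PAIRS_PER_PRODUCER pairs (n, n+1), (n+2, n+3), ... from *arg on. */
static void *produce(void *arg)
{
    int n = *(int *)arg;
    int i;

    for (i = 0; i < PAIRS_PER_PRODUCER; i++, n += 2) {
        pthread_mutex_lock(&stack_lock);
        stack[size][0] = n;
        stack[size][1] = n + 1;
        size++;
        pthread_mutex_unlock(&stack_lock);
        sem_post(&items);        /* announce one more pair */
    }
    return NULL;
}

/* Pops PAIRS_PER_PRODUCER pairs and adds each pair's sum to total. */
static void *consume(void *arg)
{
    int i;

    (void)arg;
    for (i = 0; i < PAIRS_PER_PRODUCER; i++) {
        while (sem_wait(&items) == -1 && errno == EINTR)
            continue;            /* retry if interrupted by a signal */
        pthread_mutex_lock(&stack_lock);
        size--;
        total += stack[size][0] + stack[size][1];
        pthread_mutex_unlock(&stack_lock);
    }
    return NULL;
}

/* Two producers push 1..10 and 11..20 as pairs; two consumers pop them.
   Thread-creation errors are not handled in this sketch. */
long run_stack_demo(void)
{
    pthread_t t[4];
    int first[2] = { 1, 11 };
    int i;

    size = 0;
    total = 0;
    if (sem_init(&items, 0, 0) != 0)
        return -1;
    pthread_create(&t[0], NULL, consume, NULL);
    pthread_create(&t[1], NULL, consume, NULL);
    pthread_create(&t[2], NULL, produce, &first[0]);
    pthread_create(&t[3], NULL, produce, &first[1]);
    for (i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    sem_destroy(&items);
    return total;
}
```

The order in which pairs are consumed still varies from run to run, but each pair is now read exactly once and intact, so the total of all sums is the same every time.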