Thread pools help achieve optimal resource utilization
http://www-900.ibm.com/developerWorks/cn
Brian Goetz (brian@quiotix.com), Principal Consultant, Quiotix Corp, October 2002
Why use a thread pool? Many server applications, such as web servers, database servers, file servers, and mail servers, are oriented toward processing large numbers of short-lived tasks arriving from some remote source. A request arrives at the server in some manner, perhaps through a network protocol (such as HTTP, FTP, or POP), through a JMS queue, or perhaps by polling a database. Regardless of how the request arrives, it is common in server applications that the processing of each individual task is short and the number of requests is large.
A simplistic model for building a server application would be to create a new thread every time a request arrives and service the request in that new thread. This approach actually works fine for prototyping, but its serious deficiencies become apparent if you try to deploy a server application that works this way. One of the disadvantages of the thread-per-request approach is that creating a new thread for each request is expensive; a server that creates a new thread for every request can end up spending more time and consuming more system resources creating and destroying threads than it does actually processing user requests.
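The thread-per-request model just described can be sketched as follows. This is an illustrative toy, not production code: the "request processing" is just a counter increment, and `submitRequest()`/`serve()` are hypothetical names chosen for the sketch.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadPerTask {
    static final AtomicInteger handled = new AtomicInteger();

    // Each "request" gets its own brand-new thread, created and destroyed
    // just for that one task -- exactly the overhead the article warns about.
    static Thread submitRequest() {
        Thread t = new Thread(new Runnable() {
            public void run() {
                handled.incrementAndGet(); // stand-in for real request processing
            }
        });
        t.start();
        return t;
    }

    static int serve(int requests) throws InterruptedException {
        Thread[] workers = new Thread[requests];
        for (int i = 0; i < requests; i++) workers[i] = submitRequest();
        for (Thread t : workers) t.join();  // wait for all requests to finish
        return handled.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("handled=" + serve(100)); // prints handled=100
    }
}
```

For 100 trivial requests this creates and destroys 100 threads; with short tasks, the per-thread setup cost dominates the useful work.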
In addition to the overhead of creating and destroying threads, active threads themselves consume system resources. Creating too many threads in one JVM can cause the system to run out of memory, or to thrash due to excessive memory consumption. To prevent resource thrashing, server applications need some means of limiting how many requests are being processed at any given time.
A thread pool offers a solution to both the thread life-cycle overhead problem and the resource-thrashing problem. By reusing threads for multiple tasks, the thread-creation overhead is spread over many tasks. As a bonus, because the thread already exists when a request arrives, the delay introduced by thread creation is eliminated, so the request can be serviced immediately, making the application more responsive. Furthermore, by properly tuning the number of threads in the pool, you can prevent resource thrashing by forcing any request that arrives when the number of outstanding requests exceeds a certain threshold to wait until a thread becomes available.
Alternatives to thread pools. The thread pool is far from the only way to use multiple threads within a server application. As mentioned above, it is sometimes quite sensible to spawn a new thread for each new task. However, if tasks are created too frequently and their average processing time is too short, spawning a new thread per task will lead to performance problems.
Another common threading model is to dedicate a single background thread and task queue to a certain class of tasks. AWT and Swing use this model: there is a GUI event thread, and all work that changes the user interface must execute in that thread. However, because there is only one AWT thread, it is undesirable to execute in it any task that might take a perceptible amount of time to complete. As a result, Swing applications often need additional worker threads for long-running, UI-related tasks.
Both the thread-per-task and single-background-thread approaches work perfectly well in some situations. The thread-per-task approach works quite well with a small number of long-running tasks. The single-background-thread approach works quite well as long as scheduling predictability is not important, as is the case with low-priority background tasks. However, most server applications are geared toward processing large numbers of short-lived tasks or subtasks, and it is often desirable to have a mechanism for efficiently handling these tasks with low overhead, along with some measure of resource management and timing predictability. Thread pools offer these advantages.
Work queues. In terms of how thread pools are actually implemented, the term "thread pool" is somewhat misleading, because the "obvious" implementation of a thread pool doesn't necessarily yield the results we want in most cases. The term "thread pool" predates the Java platform, so it is probably an artifact of a less object-oriented approach. Still, the term continues in wide use. While we could easily implement a thread pool class in which a client class waits for an available thread, passes the task to that thread for execution, and returns the thread to the pool when the task is complete, this approach has several potentially undesirable effects. What happens, for example, when the pool is empty? Any caller that tried to pass a task to a pool thread would find the pool empty, and its thread would block while it waited for an available pool thread. One of the reasons we often want to use background threads is to prevent the submitting thread from blocking. Blocking the caller completely, as in the "obvious" implementation of a thread pool, can defeat the very problem we are trying to solve.
What we usually want instead is a work queue combined with a fixed group of worker threads, which uses wait() and notify() to signal waiting threads that new work has arrived. The work queue is generally implemented as some sort of linked list with an associated monitor object. Listing 1 shows an example of a simple pooled work queue. Although the Thread API imposes no special requirement to use the Runnable interface, this pattern of using a queue of Runnable objects is a common convention between schedulers and work queues.
Listing 1. Work queue with thread pool

public class WorkQueue
{
    private final int nThreads;
    private final PoolWorker[] threads;
    private final LinkedList queue;

    public WorkQueue(int nThreads)
    {
        this.nThreads = nThreads;
        queue = new LinkedList();
        threads = new PoolWorker[nThreads];

        for (int i = 0; i < nThreads; i++) {
            threads[i] = new PoolWorker();
            threads[i].start();
        }
    }

    public void execute(Runnable r) {
        synchronized (queue) {
            queue.addLast(r);
            queue.notify();
        }
    }

    private class PoolWorker extends Thread {
        public void run() {
            Runnable r;

            while (true) {
                synchronized (queue) {
                    while (queue.isEmpty()) {
                        try {
                            queue.wait();
                        }
                        catch (InterruptedException ignored) {
                        }
                    }

                    r = (Runnable) queue.removeFirst();
                }

                // If we don't catch RuntimeException,
                // the pool could leak threads
                try {
                    r.run();
                }
                catch (RuntimeException e) {
                    // You might want to log something here
                }
            }
        }
    }
}

You may have noticed that the implementation in Listing 1 uses notify() instead of notifyAll(). Most experts advise using notifyAll() instead of notify(), and with good reason: there are subtle risks associated with notify(), and it is only appropriate to use it under certain specific conditions. On the other hand, when used properly, notify() has more desirable performance characteristics than notifyAll(); in particular, notify() causes many fewer context switches, which is important in a server application.
The example work queue in Listing 1 meets the requirements for safely using notify(). So go ahead and use it in your program, but exercise great care when using notify() in other situations.
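To show the Listing 1 design in action, here is a condensed, self-contained copy with a small usage example. The generics and the java.util.concurrent helper classes used in the demo postdate the original article, and the daemon flag is added only so the demo JVM can exit; these are my additions, not part of Listing 1.

```java
import java.util.LinkedList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class WorkQueueDemo {
    // Condensed copy of the Listing 1 work queue.
    static class WorkQueue {
        private final LinkedList<Runnable> queue = new LinkedList<Runnable>();

        WorkQueue(int nThreads) {
            for (int i = 0; i < nThreads; i++) {
                PoolWorker w = new PoolWorker();
                w.setDaemon(true);   // added so the demo JVM can exit
                w.start();
            }
        }

        void execute(Runnable r) {
            synchronized (queue) {
                queue.addLast(r);
                queue.notify();      // wake one idle worker
            }
        }

        private class PoolWorker extends Thread {
            public void run() {
                while (true) {
                    Runnable r;
                    synchronized (queue) {
                        while (queue.isEmpty()) {
                            try { queue.wait(); } catch (InterruptedException ignored) { }
                        }
                        r = queue.removeFirst();
                    }
                    // Catch RuntimeException so a bad task can't leak the thread
                    try { r.run(); } catch (RuntimeException e) { /* log it */ }
                }
            }
        }
    }

    // Submits ten small tasks and waits for all of them to finish.
    static int runDemo() throws InterruptedException {
        WorkQueue pool = new WorkQueue(4);
        final AtomicInteger sum = new AtomicInteger();
        final CountDownLatch done = new CountDownLatch(10);
        for (int i = 1; i <= 10; i++) {
            final int n = i;
            pool.execute(new Runnable() {
                public void run() { sum.addAndGet(n); done.countDown(); }
            });
        }
        done.await();
        return sum.get();   // 1 + 2 + ... + 10 = 55
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("sum=" + runDemo()); // prints sum=55
    }
}
```

Note how each execute() pairs one queued item with one notify(), and how each worker rechecks isEmpty() in a loop before taking work; those two properties are what make the single notify() safe here.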
Risks of using thread pools. Although the thread pool is a powerful mechanism for structuring multithreaded applications, it is not without risk. Applications built with thread pools are subject to all the usual concurrency risks to which any other multithreaded application is subject, such as synchronization errors and deadlock, and to a few other risks specific to thread pools as well, such as pool-related deadlock, resource thrashing, and thread leakage.
Deadlock. With any multithreaded application, there is a risk of deadlock. A set of processes or threads is said to be deadlocked when each is waiting for an event that only another process or thread in the set can cause. The simplest case of deadlock is this: thread A holds an exclusive lock on object X and is waiting for the lock on object Y, while thread B holds an exclusive lock on object Y and is waiting for the lock on object X. Unless there is some way to break out of waiting for the lock, the deadlocked threads will wait forever.
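This classic lock-ordering deadlock can be sketched as follows. The sketch uses java.util.concurrent.locks, which shipped after this article, precisely because tryLock() with a timeout lets the demo detect the stall and terminate instead of waiting forever; the class and method names are mine.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderDeadlock {
    static final ReentrantLock lockX = new ReentrantLock();
    static final ReentrantLock lockY = new ReentrantLock();
    static volatile boolean sawDeadlock = false;

    // Acquires 'first', then tries to acquire 'second' with a timeout.
    static Thread contend(final ReentrantLock first, final ReentrantLock second) {
        return new Thread(new Runnable() {
            public void run() {
                first.lock();
                try {
                    Thread.sleep(100);  // give the other thread time to take its lock
                    if (second.tryLock(500, TimeUnit.MILLISECONDS)) {
                        second.unlock();
                    } else {
                        sawDeadlock = true;  // with plain lock(), this wait is forever
                    }
                } catch (InterruptedException ignored) {
                } finally {
                    first.unlock();
                }
            }
        });
    }

    static boolean runDemo() throws InterruptedException {
        sawDeadlock = false;
        Thread a = contend(lockX, lockY);   // thread A: holds X, wants Y
        Thread b = contend(lockY, lockX);   // thread B: holds Y, wants X
        a.start(); b.start();
        a.join(); b.join();
        return sawDeadlock;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlock detected: " + runDemo());
    }
}
```

The standard cure is to make every thread acquire the two locks in the same order, which makes the circular wait impossible.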
While deadlock is a risk in any multithreaded program, thread pools introduce another opportunity for deadlock: one in which all pool threads are executing tasks that are blocked waiting for the result of another task on the queue, but that task cannot run because no unoccupied thread is available. This can happen when thread pools are used to implement simulations involving many interacting objects, where the simulated objects send queries to one another that then execute as queued tasks, while the querying object waits synchronously for the response.
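This pool-induced deadlock can be reproduced with a single-threaded pool using the modern java.util.concurrent API (not available when this article was written). The bounded get() in the sketch is there only so the demo terminates; with an unbounded get(), it would hang forever.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolDeadlockDemo {
    static String runDemo() throws Exception {
        final ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            Future<String> outer = pool.submit(new Callable<String>() {
                public String call() throws Exception {
                    // This queued task can never run: the pool's only thread
                    // is the one sitting here waiting for its result.
                    Future<String> inner = pool.submit(new Callable<String>() {
                        public String call() { return "inner result"; }
                    });
                    try {
                        return inner.get(500, TimeUnit.MILLISECONDS);
                    } catch (TimeoutException e) {
                        return "deadlocked";  // bounded wait lets the demo escape
                    }
                }
            });
            return outer.get();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runDemo()); // prints deadlocked
    }
}
```

With two or more pool threads the inner task would run and the result would come back; the deadlock appears exactly when every pool thread is waiting on queued work.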
Resource thrashing. One advantage of thread pools is that they generally perform well relative to the alternative scheduling mechanisms, some of which we've already discussed. But this is only true if the thread pool size is tuned properly. Threads consume numerous resources, including memory and other system resources. Besides the memory required for the Thread object, each thread requires two execution call stacks, which can be large. In addition, the JVM will likely create a native thread for each Java thread, which will consume additional system resources. Finally, while the scheduling overhead of switching between threads is small, with many threads context switching can seriously affect program performance.
If a thread pool is too large, the resources consumed by those threads could have a significant impact on system performance. Time is wasted switching between threads, and having more threads than you need can cause resource starvation, because the pool threads are consuming resources that could be used more effectively by other tasks. In addition to the resources used by the threads themselves, the work done while servicing a request may require additional resources, such as JDBC connections, sockets, or files. These are limited resources as well, and too many concurrent requests may cause failures, such as a failure to allocate a JDBC connection.

Concurrency errors. Thread pools and other queuing mechanisms rely on the wait() and notify() methods, which can be tricky to use. If coded incorrectly, it is possible for notifications to be lost, leaving threads idle even though there is work in the queue to be processed. Great care must be taken when using these methods; even experts get them wrong. Better yet, use an existing implementation that is known to work, such as the util.concurrent package discussed below, rather than writing your own pool.
Thread leakage. A significant risk in all kinds of thread pools is thread leakage, which occurs when a thread is removed from the pool to perform a task but is not returned to the pool when the task completes. One way this happens is when the task throws a RuntimeException or an Error. If the pool class does not catch these, the thread will simply exit and the size of the thread pool will be permanently reduced by one. When this happens enough times, the thread pool will eventually be empty, and the system will grind to a halt because there are no threads available to process tasks.
Tasks that permanently stall, such as those that wait indefinitely for resources or for input from users who may have gone home for the day and are not guaranteed to respond, can also cause the equivalent of thread leakage. If a thread is permanently consumed by such a task, it has effectively been removed from the pool. Tasks like this should either be given their own thread or be made to wait only for a limited time.
Request overload. It is possible for a server simply to be overwhelmed with requests. In this case, we may not want to queue every incoming request to our work queue, because the tasks waiting in the queue may consume too many system resources and cause resource starvation. What to do in this situation is up to you; in some cases, you can simply discard the request, relying on a higher-level protocol to retry the request later, or you can refuse the request with a response indicating that the server is temporarily busy.
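One way to realize the "refuse the request" option is a queue with a hard capacity whose submit method reports rejection instead of queuing without bound. This is a sketch under assumptions of mine: the worker threads are omitted, tryExecute() is a hypothetical name, and the capacity of 2 in the demo is arbitrary.

```java
import java.util.LinkedList;

public class BoundedWorkQueue {
    private final LinkedList<Runnable> queue = new LinkedList<Runnable>();
    private final int capacity;

    public BoundedWorkQueue(int capacity) { this.capacity = capacity; }

    /** @return false if the queue is full and the request was rejected */
    public synchronized boolean tryExecute(Runnable r) {
        if (queue.size() >= capacity) {
            return false;   // caller can reply "server busy" or retry later
        }
        queue.addLast(r);
        notify();           // wake a worker thread (workers omitted in this sketch)
        return true;
    }

    public static void main(String[] args) {
        BoundedWorkQueue q = new BoundedWorkQueue(2);
        Runnable task = new Runnable() { public void run() { } };
        System.out.println(q.tryExecute(task)); // prints true
        System.out.println(q.tryExecute(task)); // prints true
        System.out.println(q.tryExecute(task)); // prints false (rejected)
    }
}
```

The key design choice is that the caller learns about overload immediately and can act on it, instead of silently piling work onto an unbounded queue.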
Guidelines for effective use of thread pools. Thread pools can be an extremely effective way to structure a server application, as long as you follow a few simple guidelines:
Don't queue tasks that wait synchronously for the results of other tasks. This can lead to the form of deadlock described above, in which all the threads are occupied by tasks that are waiting for the results of queued tasks that cannot execute because all the threads are busy.
Be careful when pooling threads for potentially long-lived operations. If the program must wait for a resource such as an I/O completion, specify a maximum wait time, and then fail or re-queue the task for execution at a later time. This guarantees that some progress will eventually be made, by freeing the thread for a task that might complete successfully.
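The bounded-wait guideline can be sketched with java.util.concurrent, which postdates this article. In this sketch, slowResource() is a hypothetical stand-in for an I/O call, and the 200 ms deadline is an arbitrary illustration of "the longest acceptable wait".

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWaitDemo {
    // Stand-in for a slow resource such as a socket read or a database call.
    static String slowResource() throws InterruptedException {
        Thread.sleep(5000);
        return "data";
    }

    static String fetchWithDeadline(long millis) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Future<String> f = pool.submit(new Callable<String>() {
            public String call() throws Exception { return slowResource(); }
        });
        try {
            return f.get(millis, TimeUnit.MILLISECONDS); // longest acceptable wait
        } catch (TimeoutException e) {
            f.cancel(true);      // give up; the task could instead be re-queued
            return "timed out";
        } finally {
            pool.shutdownNow();  // free the thread for work that might succeed
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fetchWithDeadline(200)); // prints timed out
    }
}
```

The important point is that the deadline turns an indefinite stall into a visible, recoverable failure.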
Understand your tasks. To tune the thread pool size effectively, you need to understand the tasks that are being queued and what they are doing. Are they CPU-bound? Are they I/O-bound? Your answers will affect how you tune your application. If you have different classes of tasks with radically different characteristics, it may make sense to set up multiple work queues for the different task classes, so each pool can be tuned accordingly.

Sizing the pool. Tuning the size of a thread pool is largely a matter of avoiding two mistakes: having too few threads or too many. Fortunately, for most applications the margin between too few and too many is fairly wide. Recall that there are two primary advantages to using threads in an application: allowing processing to continue while waiting for slow operations such as I/O, and exploiting multiple processors. In a compute-bound application running on an N-processor machine, adding additional threads as the thread count approaches N may improve total throughput, but adding threads beyond N will do no good. In fact, too many threads will even degrade performance because of the additional context-switching overhead.
The optimum size of the thread pool depends on the number of available processors and the nature of the tasks on the work queue. On an N-processor system, for a work queue that will hold entirely compute-bound tasks, you will generally achieve maximum CPU utilization with a thread pool of N or N+1 threads.
For tasks that may wait for I/O to complete (for example, a task that reads an HTTP request from a socket), you will want a pool larger than the number of available processors, because not all of your threads will be working at all times. Using profiling, you can estimate the ratio of wait time (WT) to service time (ST) for a typical request. If we call this ratio WT/ST, then for an N-processor system you'll want to have approximately N*(1 + WT/ST) threads to keep the processors fully utilized.
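The heuristic is plain arithmetic and can be sketched as follows; optimalSize() is an illustrative helper of mine, not a standard API, and the example WT and ST values are made up.

```java
public class PoolSize {
    // threads ~= N * (1 + WT/ST), where WT is wait time and ST is service time.
    static int optimalSize(int processors, double waitTime, double serviceTime) {
        return (int) Math.round(processors * (1 + waitTime / serviceTime));
    }

    public static void main(String[] args) {
        // Purely compute-bound tasks (WT = 0) on 4 processors -> 4 threads.
        System.out.println(optimalSize(4, 0.0, 10.0));   // prints 4
        // Tasks that wait 90 ms for I/O per 10 ms of CPU work -> 40 threads.
        System.out.println(optimalSize(4, 90.0, 10.0));  // prints 40
    }
}
```

Note how the formula degenerates to N threads for compute-bound work, consistent with the N-or-N+1 guideline above.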
Processor utilization is not the only consideration in tuning the thread pool size. As the pool grows, you may encounter limitations of the scheduler, of available memory, or of other system resources, such as the number of sockets, open file handles, or database connections.
No need to write your own pool. Doug Lea has written an excellent open source library of concurrency utilities, util.concurrent, which includes mutexes, semaphores, collection classes such as queues and hash tables that perform well under concurrent access, and several work-queue implementations. The PooledExecutor class in this package is an efficient, widely used, and correct implementation of a thread pool based on a work queue. Rather than trying to write your own, which is easy to get wrong, consider using some of the utilities in util.concurrent. See Resources for links and more information.
The util.concurrent library also inspired JSR 166, a Java Community Process (JCP) working group that is planning to develop a set of concurrency utilities for inclusion in the Java class library under the java.util.concurrent package, which should be ready for the Java Development Kit 1.5 release.
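For comparison, here is what a pooled work queue looks like with the java.util.concurrent API that JSR 166 eventually produced (it shipped in J2SE 5.0, after this article was written); a standard ExecutorService replaces hand-rolled pools like Listing 1. The sum-of-squares workload is just an illustrative stand-in for real tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorDemo {
    static int sumOfSquares(int upTo) throws Exception {
        // A fixed pool of 4 worker threads backed by an internal work queue.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> results = new ArrayList<Future<Integer>>();
            for (int i = 1; i <= upTo; i++) {
                final int n = i;
                results.add(pool.submit(new Callable<Integer>() {
                    public Integer call() { return n * n; }
                }));
            }
            int sum = 0;
            for (Future<Integer> f : results) sum += f.get(); // collect results
            return sum;
        } finally {
            pool.shutdown();   // no new tasks; workers exit when the queue drains
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sumOfSquares(10)); // prints 385
    }
}
```

Compared with Listing 1, the library version adds result handling (Future), orderly shutdown, and well-tested wait/notify internals for free.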
Conclusion. The thread pool is a useful tool for organizing server applications. It is quite simple in concept, but there are several issues to watch for when implementing and using one, such as deadlock, resource thrashing, and the complexities of wait() and notify(). If you find that your application needs a thread pool, consider using one of the Executor classes from util.concurrent, such as PooledExecutor, instead of writing one from scratch. And if you find yourself creating threads to handle short-lived tasks, you should definitely consider using a thread pool instead.

Resources
Doug Lea's Concurrent Programming in Java: Design Principles and Patterns, Second Edition is a masterful book on the subtle issues of multithreaded programming in Java applications.
Discuss Java multithreading issues in the multithreaded Java programming discussion forum, moderated by Brian Goetz.
Explore Doug Lea's util.concurrent package, which contains a wide range of useful classes for building efficient concurrent applications.
The util.concurrent package is being standardized under Java Community Process JSR 166, for inclusion in the 1.5 release of the JDK.
Allen Holub's book Taming Java Threads is an entertaining introduction to the challenges of Java thread programming.
The Java Thread API has its shortcomings; read Allen Holub's "If I were king", on what he would change about Java threading (developerWorks, October 2000).
Alex Roetter offers guidelines for writing thread-safe classes (developerWorks, February 2001).
Read all of Brian Goetz's Java theory and practice columns.
Find other Java reference materials in the developerWorks Java technology zone.
About the author. Brian Goetz is a software consultant and has been a professional software developer for the past 15 years. He is a Principal Consultant at Quiotix, a software development and consulting firm located in Los Altos, California. Look for Brian's published and upcoming articles in popular industry publications. Brian can be contacted at brian@quiotix.com.