Optimizing the performance of Java garbage collection


Contents:

Introduction
Heap management overview
Analyzing verbosegc output
Setting the heap size correctly
Avoiding heap thrashing
Mark stack overflow
Getting rid of finalizers
Avoiding very large allocations
Fragmentation and its causes
Do you need concurrent mark?
Switches to avoid
Conclusion
Resources
About the author

Related information:

"Sensible Sanitation: Understanding the IBM Java Garbage Collector, Part 1"


How to detect and solve garbage collection problems with the IBM Java Virtual Machine

Level: Introductory

Sumit Chawla (sumitc@us.ibm.com)
Technical Director, eServer Java Enablement, IBM

February 2005

Is your Java application taking full advantage of the IBM eServer hardware it runs on? In this article, the author shows how to judge whether garbage collection, the Java Virtual Machine's mechanism for reclaiming unused space, is tuned for the best results. He then offers suggestions for solving garbage collection problems.

Introduction

Garbage collection is key to the excellent performance of the IBM Java Virtual Machine (JVM). Whereas other JVMs need a fair amount of tuning to deliver optimal performance, the IBM JVM works well in most cases with its out-of-the-box default settings. In some situations, however, garbage collection performance degrades inexplicably. The result can be a server that stops responding, a screen that appears frozen, or an outright failure, often accompanied by a vague message about a severe shortage of heap space. Fortunately, in most cases the cause is easy to find and, usually, just as easy to correct.

This article shows how to identify the likely causes of such performance degradation. Because garbage collection is a very large and complex topic, the discussion builds on a group of related articles (see Resources). Although most of the suggestions here treat the Java program as a black box, a few of them are worth keeping in mind at design or coding time to avoid potential problems.

The information provided here applies to all IBM eServer platforms, with the exception of the Resettable JVM configuration (see Resources). Unless otherwise stated, the examples in the article are taken from Java 1.3.1 build ca131-20020706 running on a four-way AIX 5.1 system.

Heap management overview

The JVM allocates the heap during initialization. The size of the heap depends on the specified or default minimum and maximum sizes and on how heavily the heap is used. It may help to visualize the heap as shown in Figure 1, which marks the positions of heapbase, heaplimit, and heaptop.

Figure 1. Conceptual view of the heap

heapbase marks the bottom of the heap, and heaptop marks the absolute maximum to which the heap can grow. heaptop - heapbase is determined by the -Xmx command-line parameter. (This and the other command-line parameters are described in the developerWorks article on verbosegc and command-line parameters; see Resources.) The heaplimit pointer rises as the heap expands and falls as it shrinks. heaplimit can never exceed heaptop, nor can it fall below the initial heap size specified with -Xms. The current size of the heap is heaplimit - heapbase.
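For illustration only (the application class name and the sizes here are placeholders, not recommendations), the initial and maximum heap sizes are passed on the java command line like any other option:

    java -Xms64m -Xmx256m MyServerApp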

If the proportion of free space in the heap falls below the value specified by -Xminf (minf stands for minimum free), the heap expands. If the proportion of free space rises above the value specified by -Xmaxf (maxf stands for maximum free), the heap shrinks. The defaults for -Xminf and -Xmaxf are 0.3 and 0.6, so the JVM always tries to keep the heap's free space between 30% and 60%. The parameters -Xmine (mine stands for minimum expansion) and -Xmaxe (maxe stands for maximum expansion) control the size of each increment. None of these four parameters has any effect on a fixed-size heap (one started with equal -Xms and -Xmx values, which means heaplimit = heaptop), because a fixed-size heap neither expands nor shrinks.

When a Java thread requests storage and the JVM cannot find a large enough block of memory to satisfy the request, an allocation failure (AF) is said to have occurred. Garbage collection then becomes inevitable. Garbage collection reclaims the space occupied by all objects that are no longer reachable through any reference. It runs on the thread that made the allocation request and is a stop-the-world (STW) mechanism: while the garbage collection algorithm executes, every other thread of the Java application (except the garbage collection helper threads) is suspended.

The IBM JVM implements a garbage collection algorithm called mark-sweep-compact (MSC), named after its three distinct phases:

Mark

Find and mark all "reachable", or live, objects. This phase starts from the "roots", such as the objects on thread stacks and Java Native Interface (JNI) local and global references, and then recursively follows every reference until all reachable objects have been marked.

Sweep

Delete all objects that were allocated but are not marked, reclaiming the space they occupy.

Compact

Move the live objects together, removing the holes and fragmentation from the heap.

For details on the MSC algorithm, see Resources.
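The following sketch is a toy model of the mark phase, not the IBM implementation; it is only meant to make the idea concrete. Starting from the roots, every object reached through a chain of references gets marked, and anything still unmarked afterwards would be reclaimed by the sweep.

    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.Deque;
    import java.util.List;

    // Toy model of the mark phase of a mark-sweep-compact collector.
    class MarkPhaseSketch {
        static class Obj {
            boolean marked;
            List<Obj> references = new ArrayList<Obj>();
        }

        // Mark everything reachable from the roots (in a real JVM the roots
        // include objects on thread stacks and JNI local/global references).
        static void mark(List<Obj> roots) {
            Deque<Obj> markStack = new ArrayDeque<Obj>(roots);
            while (!markStack.isEmpty()) {
                Obj o = markStack.pop();
                if (o.marked) {
                    continue;                 // already visited
                }
                o.marked = true;
                for (Obj ref : o.references) {
                    markStack.push(ref);      // scan this object's references next
                }
            }
        }
        // Sweep would then reclaim every Obj whose marked flag is still false,
        // and compact would move the survivors together to remove the holes.
    }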

Parallel and concurrent

Although garbage collection itself is an STW mechanism, recent IBM JVM versions use multiple "helper" threads on multiprocessor machines to reduce the time spent in each phase. By default, JVM 1.3.0 uses parallel mode in the mark phase. JVM 1.3.1 uses parallel mode in both the mark and sweep phases, and also supports an optional concurrent mode in the mark phase, enabled with the command-line switch -Xgcpolicy:optavgpause. As this article was being written, the latest JVM 1.4.0 release added an incremental compaction mode, which is also parallelized. When discussing these modes, it is important to understand the difference between parallel and concurrent.

On a multiprocessor system with N CPUs, a JVM that supports parallel mode starts N-1 garbage collection helper threads at initialization. These threads stay idle while the application code runs and are called on only when a garbage collection starts. In the phases that support it, the work is divided between the thread driving the garbage collection and the helper threads, so a total of N threads run in parallel on an N-CPU machine. The only way to disable parallel mode is to use the -Xgcthreads parameter to change the number of garbage collection helper threads that are started.

With concurrent mode, the JVM starts one background thread (distinct from the garbage collection helper threads) that does part of the work while the application threads are still executing. The background thread tries to complete as much of the garbage collection work as it can concurrently with the application, which reduces the STW pause when a garbage collection does run. In some cases, however, concurrent processing can hurt performance, particularly for CPU-bound applications. The following table lists, by JVM version, how each garbage collection phase is processed.

                 Mark    Sweep   Compact
IBM JVM 1.2.2    X       X       X
IBM JVM 1.3.0    P       X       X
IBM JVM 1.3.1    P, C    P       X
IBM JVM 1.4.0    P, C    P       P

where:

X

Single-threaded operation.

P

Parallel operation (all garbage collection helper threads take part).

C

Concurrent operation (a background thread runs concurrently with the application threads).
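To experiment with the optional concurrent mode described above, the policy switch is given on the command line (the application class name here is hypothetical):

    java -Xgcpolicy:optavgpause MyServerApp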

Analyzing verbosegc output

Although profilers and other third-party tools are available, this article discusses only the analysis of verbosegc logs. These logs, generated by the JVM when the -verbosegc command-line parameter is specified, are a very reliable, platform-independent debugging tool. For the complete verbosegc syntax, see "verbosegc and command-line parameters" in Resources.

Enabling verbosegc can have some impact on the application's performance. If that impact is unacceptable, use a test system to collect the verbosegc logs. Server applications often run with verbosegc enabled at all times: it is a good way to monitor whether the JVM is healthy, and in the event of an OutOfMemory error the data it provides is invaluable.
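A minimal sketch of capturing such a log, assuming the verbosegc output goes to stderr (as it does on the JVM levels discussed here) and using hypothetical application and file names:

    java -verbosegc -Xms256m -Xmx512m MyServerApp 2> gc_verbose.log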

To analyze verbosegc records effectively, concentrate on the relevant information and filter out the "noise". Extracting the information from a long verbosegc trace with a script is not difficult, but the format of the records can vary (and usually does) between JVM versions. The examples that follow highlight the important information in bold or blue in the original article's formatting; even if the format of your records looks quite different, the same information is easy to find in any verbosegc log.

Are you up to date?

Before trying the suggestions in this article, it is strongly recommended that you upgrade to the latest JVM service refresh (SR). Each new service refresh carries a large number of fixes and improvements that can improve the JVM's performance and stability. Where the platform allows, migrate to the latest release (such as JVM 1.4.0, or 1.3.1, which offer enhanced performance characteristics). Be sure to install all OS prerequisites for the JVM (such as the required maintenance level on AIX); this information is documented in the readme file shipped with the SDK/JRE.

Setting the heap size correctly

The heap size parameters are easy to set, but they can have a big impact on the application's startup time and runtime performance. The initial and maximum heap sizes are controlled by the -Xms and -Xmx parameters. They are typically set from a rough estimate of the application's heap usage, but verbosegc can help determine the right values and take the guesswork out of it. Below is the verbosegc output of an application from startup until it completes initialization (that is, enters a "ready" state).

= 32), Weak 5, Final 237, Phantom 0>

The records above show that when the first allocation failure (AF) occurred, the heap had 0% free space (0 bytes free out of 3983128). After the first garbage collection, the free-space ratio rose to 34%, just above the -Xminf threshold (default 0.3). Depending on how the application uses the heap, allocating a larger initial heap with -Xms may well be better: the application in this example will almost certainly cause a heap expansion at the next AF, and a larger initial heap would avoid that. Once the application enters its ready state it usually does not encounter AFs in the same way, so this is also a good point at which to settle on a better initial heap size. Similarly, an -Xmx value that avoids OutOfMemory errors can be found by increasing the load on the application.

If the heap is too small, garbage collection runs frequently even when the application does not keep many objects alive for long, so there is a natural temptation to use a very large heap. However, the useful maximum heap size is limited by physical factors that vary by platform. If the heap gets paged out, performance deteriorates sharply, so the heap must not exceed the amount of physical memory installed on the system. For example, on an AIX machine with 1 GB of memory, do not give a Java application a 2 GB heap.

Even if the application runs on a p690 machine with 64 GB of memory, it is not necessarily a good idea to use -Xmx60g (with a 64-bit JVM, of course). The application may go a long time without an AF, but when one finally occurs, the STW pause can be very hard to live with. The following records, taken from a 64-bit JVM 1.3.1 (build caix64131-20021102) on a 32 GB AIX system, show the effect of a huge heap.

(3145728/3145728)>

IN 4749 MS>

= 32), Weak 0, Final 1, Phantom 0>

The garbage collection took nearly five seconds, not counting compaction! The time spent in a garbage collection cycle is roughly proportional to the size of the heap. A good rule is to size the heap according to what the application actually needs, making it neither too large nor too small.

A common performance optimization is to set the initial heap size (-Xms) equal to the maximum heap size (-Xmx). Because the heap then never expands or shrinks, this can noticeably improve performance in some cases. A large gap between the initial and maximum heap sizes is normally used when the application occasionally has to absorb bursts of allocation requests. But remember that if -Xms100m -Xmx100m is specified, the JVM holds on to 100 MB of memory for its entire lifetime, even if it never uses more than 10% of it.

On a related note, -Xinitsh can be used to allocate a larger system heap at startup and so avoid "expanded system heap" messages, but those messages can be ignored entirely. The system heap expands as needed and is never garbage collected; it contains only objects that live for the entire lifetime of the JVM instance.

Avoiding heap thrashing

If the heap size is variable (that is, -Xms and -Xmx differ), the application can get into a state in which allocation failures occur but the heap does not expand. This is heap thrashing: the heap grows just enough to get past the current allocation failure, but not enough to prevent the ones that follow. Normally the space released by a garbage collection cycle not only resolves the current allocation failure but also leaves plenty of room for future requests. When the heap is thrashing, however, each cycle frees only just enough for the current failure, so the very next allocation request drives another garbage collection cycle, and so on. A large number of objects with very short lifetimes can also produce this behavior.

One way to break this cycle is to increase the values of -Xminf and -Xmaxf. For example, with -Xminf.5 the heap is expanded until it has at least 50% free space. Raising -Xmaxf to match is also sensible: if -Xminf is 0.5 and -Xmaxf stays at its default of 0.6, the JVM tries to hold the free space between 50% and 60%, which leads to excessive expansion and shrinkage. A gap of about 0.3 between the two is a good choice, so -Xmaxf.8 pairs well with -Xminf.5.

If the log shows that the heap only reaches a stable size after several expansions, -Xmine can be changed to set a minimum expansion size that suits the application's behavior. The goal is to gain enough free space to satisfy not just the current request but many subsequent ones, and so avoid unnecessary garbage collection cycles. Together, -Xmine, -Xmaxf, and -Xminf offer a great deal of flexibility for controlling the application's memory usage characteristics.
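Putting this section's suggestions together, a tuning experiment might look like the sketch below. The values are purely illustrative and should be derived from your own verbosegc data, and the exact forms accepted for each option are documented in the SDK command-line reference mentioned in Resources.

    java -Xminf.5 -Xmaxf.8 -Xmine10m -verbosegc MyServerApp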

Mark stack overflow

The most important thing to check for in the verbosegc output is the absence of any "mark stack overflow" messages. The following records show such a message and its impact.

(9584432/10036784)>

= 32), Weak 0, Final 0, Phantom 0>

This message is produced when, during the mark phase of garbage collection, the number of object references the JVM has to track overflows its "mark stack". In the mark phase, the garbage collector pushes all known references onto this stack and recursively scans each live reference. The overflow is caused by there being too many live objects in the heap (or, more precisely, by deeply nested objects), which usually indicates a defect in the application code. Unless the number of live objects can be controlled from within the application (through some kind of object pool, for example), the problem has to be addressed in the application source code. An analysis tool is recommended for tracking down the references involved.

If a very large number of live references genuinely cannot be avoided, concurrent mark may be a viable option.

Getting rid of finalizers

The following records show an interesting situation: resolving the allocation failure took 2.78 seconds, not counting the time spent in compaction.

(3145728/3145728)>

= 32), Weak 11, Final 549708, Phantom 0>

The culprit is the number of objects that had to be finalized. Using finalizers is, in general, not a good idea. Although it is unavoidable in certain cases, a finalizer should only be a last resort for cleanup that cannot be implemented any other way, and, for example, allocation inside a finalizer should be avoided.
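As a sketch of that coding-level advice (the class and method names are hypothetical), cleanup is better exposed as an explicit method called by the owning code than deferred to a finalizer:

    // Discouraged: cleanup deferred to the collector. Every instance must be
    // finalized before its space can be reclaimed, which lengthens GC cycles.
    class PooledConnection {
        protected void finalize() {
            release();   // runs at an unpredictable time, on the finalizer thread
        }
        void release() { /* free the underlying resource */ }
    }

    // Preferred: deterministic cleanup driven by the caller. The object needs
    // no finalizer and can be reclaimed in a normal collection.
    class ManagedConnection {
        void release() { /* free the underlying resource */ }
    }

    class Caller {
        void work() {
            ManagedConnection c = new ManagedConnection();
            try {
                // ... use the resource ...
            } finally {
                c.release();   // explicit, immediate cleanup
            }
        }
    }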

Avoiding very large allocations

Sometimes an allocation failure is caused not by the state of the heap at the time, but by the size of the failing allocation request itself. For example:

= 32), Weak 0, Final 0, Phantom 0>

BYTES>

These records come from a very old JVM (ca130-20010615, to be precise), so the compaction reason (shown in red in the original) is printed as 0. But compacting a 256 MB heap took 1.5 seconds! Why did things get this bad? Look at the initial request: it asked for 912920 bytes, nearly 1 MB.

An allocated block of memory must be contiguous, and as the heap fills up it becomes harder and harder to find a large contiguous block. This is not just a Java problem; a C program using malloc runs into the same thing. The JVM reduces fragmentation by moving objects together during the compaction phase, but the cost is freezing the application for a long time. The records above show that, with the compaction phase included, the total time to allocate this one large block exceeded 2 seconds.

The following records illustrate a worst-case scenario.

= 32), Weak 0, Final 17, Phantom 0>

More bytes>

= 32), Weak 0, Final 0, Phantom 0>

= 32), Weak 0, Final 0, Phantom 0>

BYTES>

The request was for a 2 MB object (2241056 bytes). Although there were 135 MB (135487112 bytes) free in a heap of roughly 1.2 GB (1291844600 bytes), no 2 MB contiguous block could be allocated. The JVM searched everywhere it possibly could, taking 268 seconds, and still found no block large enough; it then produced the dreaded message that the JVM is severely short of memory.

The best approach, where possible, is to break such large allocation requests into smaller blocks. A larger heap may also work, but in most cases it only postpones the problem.
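As a sketch of that idea (the sizes and names are hypothetical), the same amount of data can often be held in a list of small arrays rather than in one very large array, so that no single request needs a huge contiguous free block:

    import java.util.ArrayList;
    import java.util.List;

    class ChunkedBuffer {
        // A single new byte[8 * 1024 * 1024] needs one contiguous 8 MB block.
        // Allocating the same capacity as 64 KB chunks is much easier to
        // satisfy in a fragmented heap, because each request is small.
        static final int CHUNK = 64 * 1024;

        static List<byte[]> allocate(int totalBytes) {
            List<byte[]> chunks = new ArrayList<byte[]>();
            for (int remaining = totalBytes; remaining > 0; remaining -= CHUNK) {
                chunks.add(new byte[Math.min(CHUNK, remaining)]);
            }
            return chunks;
        }
    }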

Fragmentation and its causes

Let's take another look at one of the records above:

(177302288/1291844600)>

Although there is 177 MB of free space, a 2 MB block cannot be allocated. The reason is that although the garbage collection cycle can compact away holes in the heap, some of the contents of the heap cannot be moved during compaction. For example, an application may use JNI to allocate and hold references to objects or arrays. Such allocations are pinned in memory: they are neither moved nor reclaimed until the appropriate JNI calls release them. The IBM Java service team can help identify references of this kind, and analysis tools are also very useful here.

Similarly, blocks that are referenced from outside the heap are also pinned. Even without pinned objects, large allocations generally lead to fragmentation. Fortunately, fragmentation this severe is rare.

Do you need concurrent mark?

If a Java application pauses from time to time because of garbage collection, concurrent mark can help reduce those pauses and make the application run more smoothly. Sometimes, though, concurrent mark reduces the application's throughput. The recommendation is to run the same workload with concurrent mark enabled and then disabled, measure the impact on application performance, and compare.

The verbosegc output produced while concurrent mark is enabled also provides a great deal of information about how well it is working. There is no need to analyze every field of each record that is printed; the meaningful parts are how often the concurrent mark manages to complete its scan (exhausted) rather than being aborted or halted, and how much of the work it gets done.

The following three records come from the same Java application, captured at different stages of a single run; they show three different outcomes of running with concurrent mark.

The first possible outcome is a concurrent mark that runs to completion, reported as exhausted:

23273 MS SINCE LAST CON>

Traced = 57287216 (3324474 53962742) free = 3457752>

13701 (Factor 0.142)>

= 32), Weak 0, Final 5, Phantom 0>

This shows concurrent mark working as intended. Exhausted means the background thread completed its work before the allocation failure occurred. The traced line shows 57287216 bytes traced concurrently (3324474 plus 53962742, split between the background thread and the application threads), so the concurrent work got enough CPU time to cut down the total mark time. As a result, the STW mark phase took only 51 milliseconds (ms), and the total STW pause was only 230 milliseconds, which is very good for a 512 MB heap.

Next is a concurrent mark that was aborted:

(3145728/3145728)>

= 32), Weak 0, Final 7, Phantom 0>

This is the worst outcome. The concurrent mark was aborted, usually because of a large object allocation or a call to System.gc(). If the application does this frequently, it cannot benefit from concurrent mark.

The last outcome is a halted concurrent mark:

(3145728/3145728)>

= 32), Weak 0, Final 6, Phantom 0>

In terms of benefit to the application, halted falls between exhausted and aborted: only part of the concurrent work was completed. The record above shows that the scan did not finish before the next allocation failure. In this garbage collection cycle the mark phase took 274 milliseconds, and the total pause rose to 414 milliseconds.

Ideally, most garbage collection cycles should be driven by concurrent collection (with the concurrent mark getting its work done and reporting exhausted) rather than by allocation failures. If the application calls System.gc(), the log will contain many aborted lines.

For most applications, concurrent mark can both improve performance and help with "mark stack overflow". If the mark stack overflows because of a defect, however, the only real solution is to fix the defect.

Switches to avoid

The following command-line switches should be avoided.

-Xnocompactgc

This parameter turns compaction off completely. Although it can bring a short-term performance benefit, the heap eventually becomes fragmented and the application can hit an OutOfMemory error even when there is plenty of free space in the heap.

-Xcompactgc

This parameter forces a compaction on every garbage collection, whether it is needed or not. The JVM makes a number of decisions aimed at delaying compaction until it is really necessary, and it should be left to do so.

-Xgcthreads

This parameter controls the number of garbage collection helper threads created at startup. The default for an N-processor machine is N-1 threads. These threads provide the parallelism of the parallel mark and parallel sweep modes.

Conclusion

This article has briefly introduced the garbage collection and heap management features of the IBM JVM. As the examples have shown, the verbosegc log is likely to provide the most useful information when problems arise.

To summarize the recommendations made in this article:

Upgrade to the latest JVM release whenever possible; the problem you are hitting may already have been found and fixed.
Adjust -Xms, -Xmx, and -Xminf until the verbosegc output shows an acceptable balance between the number of allocation failures and the length of the pause caused by each garbage collection.
Use a fixed-size heap to avoid shrinkage and expansion.
If possible, break larger (> 500 KB) allocations into smaller blocks.
Do not ignore "mark stack overflow" messages.
Avoid using finalizers.
Test with concurrent mark enabled and disabled, and compare.
Ask whether each call to System.gc() is really necessary, and remove it if it is not.

As you can see, this is a topic that cannot be exhausted in a few words. But a phone call or an e-mail is all it takes to get in touch with the IBM technical support team (the links in Resources are a good starting point), and they will understand your particular circumstances far better than any article can.

Resources

Sam Borman's series of articles about the IBM JVM storage component is the most detailed reference on IBM Java garbage collection:

"Sensible sanitation: Understanding the IBM Java Garbage Collector, Part 1: Object Allocation" (developerWorks, August 2002) "Sensible sanitation: Understanding the IBM Java Garbage Collector, Part 2: Garbage Collection" (developerWorks, August 2002) "SENSIBLE SANITATION: Understanding The IBM Java Garbage Collector, Part 3: Verbosegc and Command-Line Parameters" (September 2002) Please visit Java 2 on the OS / 390 and z / OS Platforms Site for reset JVM For information, see New IBM Technology Featuring Persistent Reusable Java Virtual Machines (PDF). ARTIMA.com's Inside Java 2 Virtual Machine, "Heap of Fish" made a good introduction to the Mark-Sweep-Compact algorithm. If you need help, please visit the IBM EServer Technical Support page. About the application of IBM servers, more technical articles can be found on IBM EServer Developer Domain. The DeveloperWorks Java technology area has hundreds of technical articles and tutorials.

About the author

Sumit Chawla provides Java support to ISVs in the IBM eServer department. You can contact him at sumitc@us.ibm.com.

