Garbage collector and Java programming

xiaoxiao2021-03-06  49

Ouyang Chen (Yeekee@sina.com), Zhou Xin (Zhouxin@sei.pku.edu.cn)

The garbage collector (GC) is largely transparent to Java programmers, but a good Java programmer should still understand how the GC works, how to optimize its performance, and how to interact with it in the limited ways available. Some applications, such as embedded and real-time systems, have demanding performance requirements, and only careful attention to memory management efficiency can improve the performance of the whole application. This article first briefly introduces how the GC works, then discusses several key GC issues, and finally offers some Java programming recommendations for improving program performance from the GC's point of view.

One: Basic principles of GC

Memory management in Java is really the management of objects, including their allocation and release.

For the programmer, an object is allocated with the new keyword. To release an object, the programmer simply sets every reference to it to null so that the program can no longer access it; we call such an object "unreachable". The GC is responsible for reclaiming the memory of all unreachable objects.

For the GC, once the programmer creates an object, the GC starts tracking that object's address, size, and usage. Typically, the GC records and manages all objects in the heap as a directed graph (see reference 1). This is how it determines which objects are reachable and which are unreachable. When the GC determines that some objects are unreachable, it is responsible for reclaiming their memory. However, so that the GC can be implemented on different platforms, the Java specification does not strictly define many aspects of GC behavior. For example, it does not specify which reclamation algorithm to use or when reclamation takes place. Different JVM implementations therefore often use different algorithms, which adds uncertainty for Java developers. This article examines several issues related to how GC works and tries to reduce the negative impact of this uncertainty on Java programs.
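The directed-graph view of reachability described above can be sketched in a few lines. This is only an illustration under stated assumptions, not how any particular JVM implements it: the Node class, the reachable method, and the idea of passing roots as a list are all hypothetical names introduced here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ReachabilityDemo {
    // Returns every node that can be reached from the roots by following references.
    static Set<Node> reachable(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {          // first visit: follow its outgoing references
                pending.addAll(n.refs);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);   // root -> a -> b
        c.refs.add(d);   // c and d reference each other,
        d.refs.add(c);   // but nothing reachable from the root points at them
        Set<Node> live = reachable(Collections.singletonList(a));
        for (Node n : Arrays.asList(a, b, c, d)) {
            System.out.println(n.name + (live.contains(n) ? " is reachable" : " is unreachable"));
        }
    }
}

// Hypothetical stand-in for a heap object: a name plus its outgoing references.
class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();
    Node(String name) { this.name = name; }
}
```

Note that c and d still reference each other, yet both are unreachable because no path from a root leads to them. Reachability, not the mere existence of a reference, decides what can be reclaimed.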

Two: Incremental GC

GC is usually implemented by one thread or a group of threads inside the JVM. Like the user program, it occupies heap space and consumes CPU time while running, and while the GC is running the application stops. Therefore, when a GC run is long, users notice the Java program pausing; if a GC run is too short, the object reclamation rate may be too low, meaning that many objects that should have been reclaimed are not and still occupy a large amount of memory. GC design therefore has to trade off pause time against reclamation rate. A good GC implementation lets users choose the settings they need. For example, memory-constrained devices are very sensitive to memory usage and want the GC to reclaim memory accurately, without caring much about how much it slows the program down. Other programs, such as real-time online games, cannot tolerate long interruptions. Incremental GC uses a reclamation algorithm that splits one long interruption into many short ones, reducing the GC's impact on the user program. Although incremental GC may not match ordinary GC in overall performance, it reduces the program's longest pause time.
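To get a rough feel for how much time a program spends in GC, the standard java.lang.management API (available since Java 5, so newer than the JVMs this article discusses) exposes per-collector counters. The sketch below is an assumption-laden illustration: the class name GcStats and the allocation loop are made up, and the reported numbers depend entirely on the JVM and collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStats {
    // Sums the accumulated collection time in milliseconds over all collectors.
    // A collector may report -1 when the value is undefined; those are skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalGcMillis();
        // Allocate a lot of short-lived garbage to give the collector some work.
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
        }
        long after = totalGcMillis();
        System.out.println("Approximate GC time during the loop (ms): " + (after - before));
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
    }
}
```

These counters give accumulated totals rather than individual pause lengths, so they show the overall cost of GC but not the longest single pause.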

The following figure compares incremental GC with normal GC. The gray portions indicate the time during which a thread occupies the CPU.

The HotSpot JVM shipped with the Sun JDK supports incremental GC. HotSpot does not use incremental GC by default; to enable it, add the -Xincgc option when running a Java program. HotSpot's incremental GC is implemented with the Train GC algorithm. Its basic idea is to group (layer) all objects in the heap according to when they were created and how they are used, placing objects that are used frequently and are closely related in the same group. As the program runs, the groups are continually adjusted. When GC runs, it always reclaims the oldest (recently rarely accessed) objects first; if an entire group consists of reclaimable objects, GC reclaims the whole group. In this way each GC run reclaims only a certain proportion of the unreachable objects, which keeps the program running smoothly. The Train GC algorithm is a very good algorithm; for the details, see reference 4.

Three: The finalize function in detail

finalize is a method defined in the Object class. Its access modifier is protected, and since every class is a subclass of Object, user classes can readily access it. Because finalize calls are not chained automatically, we have to chain them by hand, so the last statement of a finalize function is usually super.finalize(). This gives a bottom-up sequence of finalize calls: a class releases its own resources first and then releases those of its parent class.
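The manual chaining described above might look like the following sketch. The class names Parent, Child, and FinalizeChainDemo are hypothetical, and finalize is invoked directly here only to make the call order visible; normally the GC calls it. Note that finalize is deprecated in recent Java versions, so modern compilers will emit warnings for this code.

```java
import java.util.ArrayList;
import java.util.List;

class FinalizeChainDemo {
    // Records the order of the cleanup steps (for illustration only).
    static final List<String> log = new ArrayList<>();

    // Calls finalize directly to show the order; the GC would normally do this.
    static List<String> runChain() {
        log.clear();
        try {
            new Child().finalize();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runChain());
    }
}

class Parent {
    @Override
    protected void finalize() throws Throwable {
        try {
            FinalizeChainDemo.log.add("Parent resources released");
        } finally {
            super.finalize(); // continue up the chain to Object
        }
    }
}

class Child extends Parent {
    @Override
    protected void finalize() throws Throwable {
        try {
            FinalizeChainDemo.log.add("Child resources released");
        } finally {
            super.finalize(); // not automatic: must be called by hand, and last
        }
    }
}
```

Putting super.finalize() in a finally block is a common precaution so that the parent's cleanup still runs if the subclass's cleanup throws.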

According to the Java Language Specification, the JVM guarantees that an object is unreachable before its finalize function is called, but it does not guarantee that the function will ever be called. The specification also guarantees that finalize runs at most once per object.

Many Java beginners assume that this method is similar to a destructor in C++ and put a lot of object and resource cleanup into it. This is not a good approach, for three reasons. First, to support finalize, the GC has to do a lot of extra work for objects that override it. Second, after finalize completes, the object may have become reachable again, so the GC has to check its reachability once more; using finalize therefore reduces GC performance. Third, because the time at which the GC calls finalize is not determined, releasing resources this way is also not deterministic.

Typically, finalize is used to release resources that are hard to control and very important, such as I/O operations and data connections. Releasing these resources is critical to the whole application. In such cases, the programmer should manage (including release) these resources primarily through the program itself, with finalize as a supplementary safeguard that forms a double-insurance mechanism; the program should not rely on finalize alone to release resources.
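A minimal sketch of this double-insurance pattern follows. It assumes a Java 7 or later compiler for try-with-resources, which postdates the JVMs discussed here; ManagedConnection and its methods are hypothetical names, and the actual release of a handle is left as a comment.

```java
class ManagedConnection implements AutoCloseable {
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    // Primary path: the program releases the resource explicitly.
    @Override
    public void close() {
        if (open) {
            open = false;
            // Release the underlying handle (socket, file, connection) here.
        }
    }

    // Safety net only: runs if the program forgot to call close(),
    // and even then the JVM does not guarantee it will run.
    @Override
    protected void finalize() throws Throwable {
        try {
            close();
        } finally {
            super.finalize();
        }
    }

    public static void main(String[] args) {
        ManagedConnection conn = new ManagedConnection();
        try (ManagedConnection c = conn) {
            System.out.println("open inside the block: " + c.isOpen());
        }
        System.out.println("open after the block: " + conn.isOpen());
    }
}
```

Because close() is idempotent, it does no harm if both the program and the finalizer end up calling it.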

The following example shows that an object may still be reachable after its finalize function has been called, and also that an object's finalize runs at most once.

class MyObject {
    Test main; // Holds the Test object, used to restore reachability in finalize

    public MyObject(Test t) {
        main = t; // Save the Test object
    }

    @Override
    protected void finalize() {
        main.ref = this; // Resurrect this object by making it reachable again
        System.out.println("This is finalize"); // Shows that finalize runs only once
    }
}

class Test {
    MyObject ref;

    public static void main(String[] args) {
        Test test = new Test();
        test.ref = new MyObject(test);
        test.ref = null; // The MyObject instance is now unreachable, so finalize can be called
        System.gc();
        System.runFinalization(); // Ask the JVM to run pending finalizers (a request, not a guarantee)
        if (test.ref != null) System.out.println("MyObject is still alive");
    }
}

Run results:

This is finalize
MyObject is still alive

In this example, note that although the MyObject object becomes reachable again in finalize, finalize will not be called the next time the object becomes unreachable, because finalize is called at most once per object.

Four: How programs interact with GC

Java 2 enhanced memory management by adding the java.lang.ref package, which defines three reference classes: SoftReference, WeakReference, and PhantomReference. By using these classes, programmers can interact with the GC to some degree and improve its efficiency. The strength of these references lies between that of reachable (strongly referenced) objects and unreachable objects. From strongest to weakest, the order is: strong reference, soft reference, weak reference, phantom reference.

Creating a reference object is easy. For example, to create a soft reference object, first create an object and hold it through an ordinary reference (a reachable object); then create a SoftReference that refers to it; finally set the ordinary reference to null. The object is now referenced only through the soft reference, and we call it a soft reference object.

The main feature of a soft reference is that it is a relatively strong kind of reference. Softly referenced objects are reclaimed only when memory runs short, so they are usually not reclaimed while memory is plentiful. In addition, soft references are guaranteed to be cleared (set to null) before Java throws an OutOfMemoryError. They can be used to cache frequently used images, making maximum use of memory without causing an OutOfMemoryError. Pseudocode for using this reference type is given below:

// Create an image object (Image stands for any expensive-to-load class)
Image image = new Image();
...
// Use the image
...
// Done with the image: wrap it in a soft reference and release the strong reference
SoftReference<Image> sr = new SoftReference<>(image);
image = null;
...
// Next time the image is needed
image = sr.get();
if (image == null) {
    // The image was reclaimed because memory ran low, so it must be reloaded
    image = new Image();
    sr = new SoftReference<>(image);
}
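The same idea can be packaged as a small cache. The sketch below is only one possible arrangement: SoftImageCache, its load method, and the use of a byte array as a stand-in for image data are all assumptions made for illustration.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

class SoftImageCache {
    private final Map<String, SoftReference<byte[]>> cache = new HashMap<>();
    int loads = 0; // how many times data had to be (re)loaded

    // Hypothetical loader standing in for reading an image from disk.
    private byte[] load(String name) {
        loads++;
        return new byte[64 * 1024];
    }

    byte[] get(String name) {
        SoftReference<byte[]> ref = cache.get(name);
        byte[] data = (ref == null) ? null : ref.get();
        if (data == null) {
            // Never loaded, or cleared by the GC under memory pressure: load again.
            data = load(name);
            cache.put(name, new SoftReference<>(data));
        }
        return data;
    }

    public static void main(String[] args) {
        SoftImageCache cache = new SoftImageCache();
        byte[] first = cache.get("logo.png");
        byte[] second = cache.get("logo.png");
        System.out.println("same object: " + (first == second) + ", loads: " + cache.loads);
    }
}
```

The null check is made on the value returned by get(), not on the SoftReference itself, because the reference object survives even after its referent has been cleared.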

The biggest difference between weak reference objects and soft reference objects is that, when GC runs, it uses an algorithm to decide whether to reclaim a soft reference object, whereas it always reclaims a weak reference object. Weak reference objects are therefore reclaimed more readily and sooner. Although GC will always reclaim weak objects when it runs, groups of weak objects with complex relationships often need several GC runs to be fully reclaimed. Weak references are often used in Map structures to refer to objects holding large amounts of data; once the strong references to such an object are set to null, GC can quickly reclaim its space. An example can be found in reference 4.
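The Map usage mentioned above is available in the standard library as java.util.WeakHashMap, which holds its keys through weak references. The following is a small sketch with made-up names; because System.gc() is only a request, the entry count printed after it is not guaranteed to be zero.

```java
import java.util.Map;
import java.util.WeakHashMap;

class WeakMapDemo {
    public static void main(String[] args) {
        // Keys are held weakly: an entry can disappear once its key
        // is no longer strongly referenced anywhere else.
        Map<Object, byte[]> attachments = new WeakHashMap<>();
        Object key = new Object();
        attachments.put(key, new byte[1024 * 1024]);
        System.out.println("while the key is strongly held: " + attachments.containsKey(key));

        key = null;   // drop the only strong reference to the key
        System.gc();  // a request only; the JVM may or may not collect now
        System.out.println("entries after the GC request: " + attachments.size());
    }
}
```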

Phantom references are rarely used; they mainly assist the finalize function. A phantom reference object is an object whose finalize function has already completed and which is unreachable, but which has not yet been reclaimed by the GC. Such objects can help finalize with some later-stage cleanup work, and we can make the resource reclamation mechanism more flexible by overriding Reference's clear() method.

Five: Java coding recommendations

Based on how the GC works, we can write programs in ways that make GC run more efficiently and better match the application's requirements. Here are some programming recommendations.

1. The most basic recommendation is to release references to useless objects as early as possible. Most of the time programmers use temporary variables, whose references are dropped automatically when they leave their scope. What needs special attention are complex object graphs such as arrays, queues, trees, and graphs, where the reference relationships between objects are more complicated. The GC is generally less efficient at reclaiming such objects. If the program allows it, set references that will no longer be used to null as early as possible; this speeds up the GC's work.

2. Use the finalize function as little as possible. finalize gives the programmer an opportunity to release objects or resources, but it increases the GC's workload, so avoid reclaiming resources through finalize wherever you can.

3. If you need to keep frequently used images, use soft references. They keep the images in memory as long as possible for the program to use without causing an OutOfMemoryError.

4. Pay attention to collection data types, including arrays, trees, graphs, and linked lists; these data structures are more complicated for the GC to reclaim. Also pay attention to global variables and static variables, which can easily cause dangling references and waste memory.

5. When the program has some idle waiting time, the programmer can call System.gc() manually to ask the GC to run, but the Java Language Specification does not guarantee that the GC will actually execute.

6. Use incremental GC to shorten the pause times of a Java program.
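The first recommendation matters most inside container classes that manage their own storage. The sketch below uses a hypothetical array-backed stack, SimpleStack, to show where an obsolete reference would otherwise linger; the slotCleared method exists only for the demonstration.

```java
import java.util.Arrays;

class SimpleStack {
    private Object[] elements = new Object[4];
    private int size = 0;

    void push(Object o) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = o;
    }

    Object pop() {
        if (size == 0) {
            throw new IllegalStateException("stack is empty");
        }
        Object top = elements[--size];
        // Without this line the array would keep referencing the popped object,
        // so the GC could not reclaim it even after the caller is done with it.
        elements[size] = null;
        return top;
    }

    // Exposed for the demonstration only: reports whether a slot holds no reference.
    boolean slotCleared(int index) {
        return elements[index] == null;
    }

    public static void main(String[] args) {
        SimpleStack stack = new SimpleStack();
        stack.push("a");
        stack.push("b");
        System.out.println("popped: " + stack.pop());
        System.out.println("slot 1 cleared: " + stack.slotCleared(1));
    }
}
```

Local variables rarely need this treatment because their references disappear when the method returns; it is long-lived structures such as this array that hold objects alive unintentionally.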

References

1. Ouyang Chen, Zhou Xin, "Java and memory leak", http://www-900.ibm.com/developerworks/cn/java/l-javamemoryleak/index.shtml
2. Y. Srinivas Ramakrishna, "Automatic Memory Management in the Java HotSpot Virtual Machine"
3. Bill Venners, "Inside the Java 2 Virtual Machine", Chapter 9, http://www.artima.com/insidejvm/ed2/ch09GarbageCollectionPrint.html
4. Sun Microsystems, "Java Language Specification, Second Edition"
