The Architecture of the Java HotSpot Performance Engine. Source: Sun. Charami, November 02, 2002 23:36
The Java HotSpot Performance Engine Architecture: A White Paper on Sun's Second-Generation Performance Technology
Contents
Introduction
Overview
Architecture
Memory Model
    Handleless Objects
    Two-Word Object Headers
    Reflective Data Represented as Objects
    Native Thread Support, Including Preemption and Multiprocessing
Garbage Collection
    Background Description
    The Java HotSpot Garbage Collector
    Accuracy
    Generational Copying Collection
    Mark-Compact "Old Object" Collector
    Incremental "Pauseless" Garbage Collector
Ultra-Fast Thread Synchronization
Java HotSpot Compiler
    Background Description
    Hot Spot Detection
    Method Inlining
    Dynamic Deoptimization
    Optimizing Compiler
    Summary
    Impact on Software Reusability
Java Native Interface (JNI) Support
1. Introduction
The Java(TM) platform is becoming a mainstream vehicle for software development and deployment. Use of the Java platform is growing rapidly in many areas, from credit cards to mainframes and from web applets to large business applications. As a result, the quality, maturity, and performance of Java technology have become critical factors for every developer and user. Sun Microsystems, Inc. is focused on technology that "raises the bar" across many processors and operating systems, so that software developers can rely on Java-based applications to run efficiently and reliably regardless of the processor and operating system.
One of the main reasons for interest in the Java platform is that programs based on Java technology, unlike programs written in traditional languages, are distributed in a portable and secure form. In the past, using a portable distribution format typically meant paying a penalty in execution performance. With modern dynamic compilation techniques this penalty has been reduced, and the result can in effect be a double gain.
To give a simple but very important example: a compiler for Java technology can generate machine code "on the fly" that is optimized for the specific version of the processor it is running on. For example, although the Pentium and Pentium II processors can run the same machine code, no single form of machine code is optimal for both. Thus the bytecode distribution format of the Java programming language provides not only portability but also new opportunities for performance improvement.
This paper introduces Sun's second-generation performance technology for Java: the Java HotSpot performance engine. The Java HotSpot performance engine is innovative in almost every area of its design. It uses a wide range of techniques to improve performance, including adaptive optimization technology that detects and accelerates performance-critical code "on the fly". Java HotSpot also provides ultra-fast thread synchronization, for maximum performance of thread-safe programs based on Java technology. It provides a garbage collector (GC) that is not only extremely fast but also fully "accurate" (and therefore more reliable); in addition, state-of-the-art algorithms reduce or eliminate the garbage collection pauses that users perceive. Finally, because the Java HotSpot performance engine is written at the source level in a clean, high-level, object-oriented design style, it is also easier to maintain and extend.
2. Overview
The main architectural advantages of the Java HotSpot performance engine are listed below.
1) Better general performance
• Handleless objects: for speed, object references are implemented as direct pointers.
• Faster Java programming language thread synchronization.
• C and Java code share the same activation stack, which makes calls between C and Java code faster.
• Compared with just-in-time (JIT) compilation, greatly reduced overhead in code space and startup time.
2) Best-of-breed performance
• An optimizing native-code compiler, for true native-code performance.
• Adaptive "hot spot" detection focuses optimization on performance-critical code, which greatly reduces total compile time and the memory required for compiled code.
• Method inlining eliminates the majority of dynamic method calls.
• Faster method calls for methods that are not inlined.
3) Accurate, generational copying garbage collector
• Faster object allocation.
• Accuracy provides more reliable object reclamation; it rules out the hard-to-debug memory leaks that conservative or partially accurate collectors can cause.
• Generational collection greatly improves collection efficiency for most programs.
• For most programs, it also greatly reduces the frequency of pauses caused by collecting "old" objects.
• Generational collection also greatly improves performance scalability for applications that use large numbers of "live" objects.
• A mark-compact algorithm collects "old" objects, eliminating memory fragmentation and increasing locality.
• An incremental "pauseless" garbage collector for long-lived objects substantially eliminates user-perceptible pauses during collection, even with an extremely large number of "live" objects. This is ideal for latency-sensitive applications (such as servers) and for applications with large amounts of data.
4) Advanced high-level design
• Transparent debugging and profiling: in the Java HotSpot architecture, the generation and optimization of native code is completely transparent to the programmer. All profiling and debugging information can be presented purely in terms of bytecode, regardless of the optimizations used internally.
3. Architecture
The architecture of the Java HotSpot performance engine is the culmination of many years of research in the laboratories of Sun Microsystems. It combines a state-of-the-art memory model, garbage collector, and adaptive optimizer, and it is written in a particularly high-level, object-oriented style. The following sections introduce the important architectural structures and features of the Java HotSpot performance engine.
4. Memory model
4.1 Handleless Objects
The Java 2 Software Development Kit (SDK) uses indirect handles to represent object references. Although this makes it simpler to relocate objects during garbage collection, it creates a significant performance bottleneck, because most accesses to Java programming language objects require two levels of indirection. The Java HotSpot performance engine eliminates the concept of handles: object references are implemented as direct pointers, which provides C-speed access to instance variables. When the garbage collector relocates an object in memory, it is responsible for finding and updating, in place, all references to that object.
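The difference between the two reference schemes can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the "heap" is a plain int array, objects are runs of slots, and the class name HandleModel and all of its members are hypothetical, not part of any JVM. It shows that a handle costs one extra load per field access, while direct references shift the work to the collector, which must patch every reference when an object moves.

```java
// Toy model only: HandleModel and its members are hypothetical names.
final class HandleModel {
    static final int[] heap = new int[16];        // simulated object memory
    static final int[] handleTable = new int[4];  // handle -> heap address

    // Handle-based access: two dependent loads (table, then heap).
    static int readViaHandle(int handle, int fieldOffset) {
        int address = handleTable[handle];
        return heap[address + fieldOffset];
    }

    // Direct access: a single load relative to the object's address.
    static int readDirect(int address, int fieldOffset) {
        return heap[address + fieldOffset];
    }

    // Relocate an object of 'size' slots, as a collector would.
    // With handles, one table entry is patched. With direct pointers,
    // every reference in 'directRefs' must be found and updated.
    static void relocate(int handle, int size, int newAddress, int[] directRefs) {
        int oldAddress = handleTable[handle];
        System.arraycopy(heap, oldAddress, heap, newAddress, size);
        handleTable[handle] = newAddress;
        for (int i = 0; i < directRefs.length; i++) {
            if (directRefs[i] == oldAddress) {
                directRefs[i] = newAddress;
            }
        }
    }
}
```

In this model the per-access saving of readDirect is paid for once per relocation, which matches the trade-off described above.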
4.2 Two-Word Object Headers
The Java HotSpot performance engine uses a two-machine-word object header, unlike the Java 2 SDK. Because the average Java programming language object is small, this saves a significant amount of space (approximately 8% of the heap size). The first header word contains information such as the identity hash code and GC status; the second header word is a reference to the object's class. Only arrays have a third header field, which holds the array size.
4.3 Reflective Data Represented as Objects
Classes, methods, and other internal reflective data are represented directly as objects on the heap (although these objects may not be directly accessible to programs based on Java technology). This not only simplifies the memory model, but also allows such reflective data to be collected by the same garbage collector that is used for other Java programming language objects.
4.4 Native Thread Support, Including Preemption and Multiprocessing
The activation stack of each thread's methods is represented using the thread model of the host operating system. Java programming language methods and native methods share the same stack, which allows fast calls between the C and Java programming languages. Fully preemptive Java programming language threads are supported through the host operating system's thread scheduling mechanism.
A major advantage of using the native operating system's threads and scheduling is the ability to take advantage of native OS multiprocessing support transparently. Because the Java HotSpot performance engine is designed to be insensitive to race conditions caused by preemption and/or multiprocessing while executing Java programming language code, Java programming language threads automatically benefit from whatever scheduling and processor allocation policies the native operating system provides.
5. Garbage Collection
5.1 Background Description
A major attraction of the Java programming language for programmers is that it is the first mainstream programming language to provide built-in automatic memory management (garbage collection). In traditional languages, dynamic memory is generally managed with an explicit allocate/free model. This has proved to be not only one of the largest sources of memory leaks, program errors, and crashes in programs written in traditional languages, but also a performance bottleneck and a major obstacle to modular, reusable code: without explicit and hard-to-understand cooperation between modules, determining the point at which memory can be freed across module boundaries is sometimes nearly impossible. In the Java programming language, garbage collection is also an essential part of the "safe" execution semantics required to support the security model.
When a garbage collector can "prove" that an object is no longer accessible to the running program, it automatically "frees" the object by reclaiming it. This automatic process completely eliminates not only the memory leaks caused by freeing too little, but also the crashes and hard-to-find errors caused by freeing too much.
Traditionally, garbage collection has been considered an inefficient process that degrades performance relative to an explicit-free model. In fact, with modern garbage collection technology, performance improves so much that it can actually be considerably better than the performance provided by explicit freeing.
5.2 The Java HotSpot Garbage Collector
The Java HotSpot performance engine has an advanced garbage collector that takes full advantage of its clean, object-oriented design to provide a high-level garbage collection framework. This framework can easily be configured, used, or extended to use new collection algorithms.
The main features of the Java HotSpot garbage collector are described below. Overall, the combination of techniques used is better both for applications that need the highest possible performance and for long-running applications in which memory leaks and memory that becomes unusable through fragmentation are unacceptable. The Java HotSpot performance engine not only provides state-of-the-art garbage collector performance, but also guarantees that all inaccessible memory is reclaimed and completely eliminates memory fragmentation.
5.3 Accuracy
The Java HotSpot garbage collector is a fully accurate collector. In contrast, many garbage collectors are conservative or partially accurate. Conservative garbage collection is attractive because it is easy to add to a system that does not otherwise support garbage collection, but it has certain drawbacks. A conservative garbage collector does not know exactly where all object references are located, so it must conservatively assume that any memory word that appears to refer to an object is in fact an object reference. This means that it can make certain kinds of mistakes, such as mistaking an integer for an object pointer, and this has several negative consequences.
First, when such a mistake occurs (which in practice is not common), memory leaks can occur unpredictably, in ways that are virtually impossible to reproduce or debug (although crashes due to dangling object references are still prevented, and the program still executes correctly if enough spare memory is available).
Second, because it may have made a mistake, a conservative collector must either refer to objects indirectly through handles (reducing performance) or avoid relocating objects. Relocating handleless objects requires updating all references to them, which cannot be done when the collector does not know for certain whether an apparent reference is a real one. The inability to relocate objects causes memory fragmentation and, more importantly, prevents use of the advanced generational copying collection algorithm described below.
Because the Java HotSpot collector is fully accurate, it can make several strong design guarantees that a conservative collector cannot:
• All inaccessible object memory can be reclaimed reliably;
• All objects can be relocated, so object memory can be compacted; this eliminates fragmentation of object memory and increases memory locality.
5.4 Generational Copying Collection
The Java HotSpot performance engine uses a state-of-the-art generational copying collector, which has two main advantages:
· Compared with the Java 2 SDK, allocation speed and overall garbage collection efficiency are greatly improved for most programs (often by a factor of 5);
· The frequency of user-perceptible "pauses" is correspondingly reduced.
A generational collector takes advantage of the fact that in most programs the great majority of objects (usually 95%) are very short-lived and are used as temporary data structures. By segregating newly created objects into an object "nursery", a generational collector achieves several things. First, because new objects are allocated in the nursery in a stack-like fashion, allocation is extremely fast: it involves only updating a single pointer and performing a single check for nursery overflow. Second, by the time the nursery overflows, most of the objects in it are already "dead", so the garbage collector only needs to move the few surviving objects elsewhere and does not need to do any reclamation work for the dead objects in the nursery.
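The nursery allocation path described above (one pointer update plus one overflow check) is commonly called bump-pointer allocation. The following is a minimal sketch of that idea, assuming a nursery measured in abstract "words" and addresses expressed as offsets; the class Nursery and its methods are hypothetical and do not reflect the engine's actual internals.

```java
// Hypothetical nursery model: addresses are offsets into a fixed region.
final class Nursery {
    private final int capacity; // size of the nursery in words
    private int top = 0;        // next free word (the "bump pointer")

    Nursery(int capacity) {
        this.capacity = capacity;
    }

    // Returns the start offset of the new object, or -1 when the nursery
    // is full and a collection would be needed. Assumes sizeInWords > 0.
    int allocate(int sizeInWords) {
        if (sizeInWords > capacity - top) { // the single overflow check
            return -1;
        }
        int start = top;
        top += sizeInWords;                 // the single pointer update
        return start;
    }

    // Once survivors have been copied elsewhere, the whole nursery is
    // reused at once; dead objects need no individual work.
    void resetAfterCollection() {
        top = 0;
    }

    int used() {
        return top;
    }
}
```

Note that the cost of resetAfterCollection does not depend on how many dead objects the nursery held, which is why a high death rate among young objects makes this scheme efficient.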
5.5 Mark-Compact "Old Object" Collector
Although the generational copying collector reclaims most dead objects efficiently, longer-lived objects still accumulate steadily in the "old object" memory area. Occasionally, because of low memory or at the program's request, a garbage collection of the old objects must be performed. The Java HotSpot performance engine can use a standard mark-compact collection algorithm, which traverses the entire graph of live objects starting from the "roots", then sweeps through memory and compacts away the gaps left by dead objects. By compacting gaps in the heap (rather than collecting them into a free list), memory fragmentation is eliminated; allocation of old objects is also more streamlined, because free-list searching is eliminated.
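The two phases described above can be sketched on a small simulated heap. This is an illustrative model only, not the engine's implementation: objects are ordinary Java instances carrying a simulated address, and the names MarkCompactSketch, Obj, and collect are assumptions introduced for the example. Because references in this model are Java references rather than raw addresses, the reference-updating step of a real compactor is not shown.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of a mark-compact pass over a simulated heap.
final class MarkCompactSketch {
    static final class Obj {
        final int size;                 // size in words
        int address;                    // simulated start address
        boolean marked;                 // set during the mark phase
        final List<Obj> refs = new ArrayList<>();

        Obj(int size) {
            this.size = size;
        }
    }

    // 'heap' lists every allocated object in address order and must be
    // mutable. Returns the first free address after compaction.
    static int collect(List<Obj> heap, List<Obj> roots) {
        // Mark: trace the graph of live objects starting from the roots.
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) {
                continue;
            }
            o.marked = true;
            pending.addAll(o.refs);
        }
        // Compact: slide live objects down over the gaps left by dead ones.
        int free = 0;
        List<Obj> live = new ArrayList<>();
        for (Obj o : heap) {
            if (o.marked) {
                o.address = free;
                free += o.size;
                o.marked = false;       // reset for the next collection
                live.add(o);
            }
        }
        heap.clear();
        heap.addAll(live);
        return free;
    }
}
```

After collect returns, all free space is a single block starting at the returned address, so subsequent allocation needs no free-list search.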
5.6 Incremental "Pauseless" Garbage Collector
The mark-compact collector cannot eliminate all user-perceptible pauses. User-perceptible garbage collection pauses occur when "old" objects (objects that have "lived" for a while in machine terms) need to be collected, and these pauses are proportional to the amount of live object data that exists. This means that pauses can become arbitrarily long as more data is manipulated, which is a very undesirable property for server applications, animation, or other soft real-time applications.
The Java HotSpot performance engine provides an alternative old-space garbage collector to solve this problem. This collector is fully incremental, eliminating user-perceptible garbage collection pauses. The incremental collector scales smoothly, providing relatively constant pause times even when extremely large object data sets are being manipulated. This gives excellent behavior for:
• Server applications, especially high-availability applications;
• Applications that manipulate very large "live" object data sets;
• Applications in which noticeable pauses are undesirable, such as games, animation, or other highly interactive applications.
The pauseless collector is an incremental old-space collection scheme known academically as the "train" algorithm. The algorithm breaks old-space collection pauses into many tiny pauses (typically less than 10 milliseconds) that are spread out over time, so that to the user the program appears virtually pause-free. Because the train algorithm is not a hard real-time algorithm, it cannot guarantee an upper limit on pause times. In practice, however, long pauses are extremely rare, and they are not caused directly by large data sets.
As a welcome by-product, the pauseless collector also improves memory locality. Because the algorithm attempts to move groups of tightly "coupled" objects into adjacent memory regions, it gives those objects excellent paging and cache locality. This is also very beneficial for multithreaded applications that operate on distinct sets of object data.
6. Ultra-Fast Thread Synchronization
Another major attraction of the Java programming language is that it provides language-level thread synchronization. This makes it easy to write multithreaded programs with fine-grained locking. Unfortunately, current synchronization implementations are very inefficient relative to other micro-operations in the Java programming language, which makes fine-grained synchronization a major performance bottleneck.
The Java HotSpot performance engine incorporates a breakthrough in thread synchronization implementation that greatly improves synchronization performance. As a result, synchronization is so fast that it is not a significant performance issue for the vast majority of real-world programs.
In addition to the space benefits mentioned in the "Memory Model" section, the synchronization mechanism delivers its performance benefit by providing ultra-fast, constant-time performance for all uncontended synchronizations (which dynamically make up the great majority of synchronizations).
Java HotSpot synchronization is fully suitable for multiprocessing and should exhibit excellent multiprocessor performance.
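As a reminder of what language-level synchronization looks like in source code, here is a small sketch of a fine-grained lock around a counter. The class SyncCounter and the helper runWorkers are hypothetical names chosen for the example; the sketch demonstrates only the semantics of the synchronized keyword and makes no claim about how quickly any particular virtual machine executes it.

```java
// Hypothetical example class: a counter protected by its own monitor.
final class SyncCounter {
    private long value = 0;

    // 'synchronized' acquires this object's monitor for the whole call.
    synchronized void increment() {
        value++;
    }

    synchronized long get() {
        return value;
    }

    // Starts 'threads' workers that each increment 'perThread' times,
    // waits for all of them, and returns the final count.
    static long runWorkers(int threads, int perThread) {
        SyncCounter counter = new SyncCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.get();
    }
}
```

With a single worker every lock acquisition is uncontended, which is the case the text above identifies as by far the most frequent in practice.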
7. Java HotSpot compiler
7.1 Background Description
The Java programming language is a new programming language with unique characteristics. To date, most attempts to improve its performance have focused on applying compilation techniques developed for traditional languages. Just-in-time (JIT) compilers are essentially fast traditional compilers that translate Java bytecodes into native machine code "on the fly". A JIT compiler runs on the end user's machine, which actually executes the bytecodes, and compiles each method the first time it is executed.
JIT compilation has several problems. First, because the compiler runs on the machine in "user time", it is severely constrained in compilation speed: if compilation is not extremely fast, the user perceives a significant delay at startup or in parts of the program. This forces a trade-off that makes it much harder to perform advanced optimizations, which usually slow compilation down considerably.
Second, even if a JIT had time to perform full optimization, such optimizations are less effective for the Java programming language than for traditional languages such as C and C++. There are several reasons for this:
• The Java programming language is dynamically "safe", meaning that programs are guaranteed not to violate the language semantics or directly access unstructured memory. As a result, dynamic type tests must be performed frequently, for example when casting and when storing into object arrays.
• The Java programming language allocates all objects on the heap, whereas in C++ many objects are allocated on the stack. This means that object allocation rates are much higher for the Java programming language than for C++. In addition, because the Java programming language is garbage collected, it has very different kinds of memory allocation overhead than C++ (potentially including scavenging and write-barrier overhead).
• In the Java programming language most method calls are "virtual" (potentially polymorphic), which is much rarer in C++. This means not only that method call performance is more important, but also that static compiler optimizations (especially global optimizations such as inlining) are much harder to perform. Most traditional optimizations are most effective between calls, and the shorter distance between calls in the Java programming language can significantly reduce their effectiveness, because they have smaller sections of code to work with.
• Programs based on Java technology can change "on the fly" because of the powerful ability to load classes dynamically. This makes many kinds of global optimization especially difficult, because the compiler must not only detect when these optimizations become invalid due to dynamic loading, but must also be able to undo and/or redo them during program execution, without compromising or affecting the execution semantics of the Java technology-based program in any way (even if the optimizations involve active methods on the stack).
As a result of these problems, any attempt to achieve advanced performance for the Java programming language must seek a non-traditional solution rather than blindly applying traditional compiler techniques. The architecture of the Java HotSpot performance engine addresses the performance issues described above by using adaptive optimization technology. Adaptive optimization technology is the result of research on object-oriented languages by the Self group, a Sun research group.
7.2 Hot Spot Detection
Adaptive optimization solves the problems of JIT compilation by taking advantage of an interesting property of most programs: virtually all programs spend the vast majority of their time executing a small part of their code. Rather than compiling the whole program when it starts, the Java HotSpot performance engine runs the program immediately using an interpreter and analyzes the program as it runs to detect its critical "hot spots". It then focuses the attention of a global native-code optimizer on those hot spots. By avoiding compilation of infrequently executed code (most of the program), the Java HotSpot compiler can devote more attention to the performance-critical parts of the program without necessarily increasing overall compile time. This dynamic monitoring continues as the program runs, so performance is adapted "on the fly" to the needs of the user.
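One simple way to picture this kind of detection is a per-method invocation counter with a compile threshold. The sketch below is an assumption-laden illustration, not a description of the engine's actual profiling: the class HotSpotCounterSketch, the string method names, and the threshold parameter are all hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model: methods start out interpreted and are marked as
// compiled once they have been invoked 'threshold' times.
final class HotSpotCounterSketch {
    private final int threshold; // assumed tuning knob, not a real JVM flag
    private final Map<String, Integer> invocations = new HashMap<>();
    private final Set<String> compiled = new HashSet<>();

    HotSpotCounterSketch(int threshold) {
        this.threshold = threshold;
    }

    // Simulates one method entry. Returns true if the call ran "compiled"
    // code and false if it was "interpreted".
    boolean invoke(String method) {
        if (compiled.contains(method)) {
            return true;
        }
        int count = invocations.merge(method, 1, Integer::sum);
        if (count >= threshold) {
            compiled.add(method); // hot spot detected: compile for next time
        }
        return false;
    }

    boolean isCompiled(String method) {
        return compiled.contains(method);
    }
}
```

Methods that never reach the threshold are never compiled in this model, which is where the savings in compile time and code space come from.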
A subtle but important benefit of this approach is that by delaying compilation until after the code has already been executed for a while ("a while" in machine time, not user time!), information can be gathered about how the code is used and then applied to perform more intelligent optimization. In addition to hot spot information, other kinds of information are collected, such as data on caller-callee relationships for "virtual" method calls.
7.3 Method Inlining
As mentioned in the "Background Description", the frequency of "virtual" method calls in the Java programming language is an important optimization bottleneck. Once the Java HotSpot adaptive optimizer has gathered information about the program's hot spots during execution, it not only compiles those hot spots into native code but also performs extensive method inlining on that code.
Inlining has important benefits. It dramatically reduces the dynamic frequency of method calls, which saves the time needed to perform them. More importantly, inlining produces much larger blocks of code for the optimizer to work on. This greatly increases the effectiveness of traditional compiler optimizations, removing a major obstacle to improved Java programming language performance.
Inlining also makes other code optimizations more effective. As the Java HotSpot compiler matures further, the ability to operate on larger inlined blocks of code will make more advanced optimization techniques possible.
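The effect of inlining (replacing a call with the body of the called method) can be shown by writing the transformed code by hand. The sketch below is only an illustration: the class InliningSketch and its Point type are hypothetical, and the hand-written second method stands in for what a compiler might produce, without claiming to match the output of any particular compiler.

```java
// Hypothetical example: the same loop before and after manual inlining.
final class InliningSketch {
    // Small accessor methods of the kind an optimizer would inline.
    static class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        int getX() {
            return x;
        }

        int getY() {
            return y;
        }
    }

    // As written by the programmer: two method calls per iteration.
    static long sumWithCalls(Point[] points) {
        long sum = 0;
        for (Point p : points) {
            sum += p.getX() + p.getY();
        }
        return sum;
    }

    // Inlined by hand: the calls are replaced by the accessor bodies,
    // leaving one larger block of straight-line code to optimize.
    static long sumInlinedByHand(Point[] points) {
        long sum = 0;
        for (Point p : points) {
            sum += p.x + p.y;
        }
        return sum;
    }
}
```

Both methods compute the same result; the second simply exposes the field loads directly to whatever optimizations run afterwards.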
7.4 Dynamic Deoptimization
Although inlining as described above is a very important optimization, it has traditionally been very difficult to perform for dynamic object-oriented languages such as the Java programming language. Moreover, although detecting hot spots and inlining the methods they call is difficult enough, it is still not sufficient to provide the full semantics of the Java programming language. This is because programs written in the Java programming language can not only change their patterns of method calls "on the fly", but can also dynamically load new Java code into a running program.
Inlining is based on a form of global analysis. Dynamic loading complicates inlining considerably, because it changes the global relationships within a program. A new class may contain new methods that need to be inlined in the appropriate places. Therefore the Java HotSpot performance engine must be able to dynamically deoptimize (and then reoptimize if necessary) previously optimized hot spots, even while the "hot" code is executing. Without this capability, general inlining cannot be performed safely on Java-based programs.
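The need to back out of an optimization can be modeled with a call site that speculates on a single receiver class and abandons the speculation when a different class appears. This is a deliberately simplified stand-in: it uses an explicit type guard at the call site, which is a different and much simpler technique than deoptimizing running compiled code, and the names DeoptSketch, Shape, Triangle, and Square are hypothetical.

```java
// Hypothetical model of one speculatively optimized call site.
final class DeoptSketch {
    interface Shape {
        int sides();
    }

    static final class Triangle implements Shape {
        public int sides() {
            return 3;
        }
    }

    static final class Square implements Shape {
        public int sides() {
            return 4;
        }
    }

    // True while the site still assumes every receiver is a Triangle.
    private boolean optimized = true;
    int deoptimizations = 0;

    int call(Shape s) {
        if (optimized) {
            if (s instanceof Triangle) {
                return 3;           // inlined body of Triangle.sides()
            }
            optimized = false;      // assumption broken: drop the fast path
            deoptimizations++;
        }
        return s.sides();           // general virtual dispatch
    }
}
```

The first Square passed to call plays the role of a newly loaded class: the result stays correct, but the site falls back to ordinary dispatch from then on.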
7.5 Optimizing Compiler
Compiling only performance-critical code "buys time" that can be used for better optimization. The Java HotSpot performance engine replaces the relatively simple JIT compiler with a fully optimizing compiler. The optimizing compiler performs all the classic optimizations, for example dead code elimination, loop invariant hoisting, common subexpression elimination, and constant propagation. It also performs optimizations specific to Java technology, such as null-check and range-check elimination. The register allocator is a global graph-coloring allocator that takes full advantage of large register sets. The Java HotSpot optimizing compiler is highly portable, relying on a relatively small machine description file to describe all aspects of the target hardware. Although the compiler is slow by JIT standards, it is still much faster than traditional optimizing compilers. Moreover, the improved code quality "pays back" the compile time by reducing the execution time of the compiled code.
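Two of the classic optimizations named above, loop invariant hoisting and common subexpression elimination, can be illustrated by applying them by hand. The following sketch is an illustration only: the class ClassicOptimizationsSketch and both methods are hypothetical, and the second method shows the general shape of the transformation rather than the actual output of the Java HotSpot compiler.

```java
// Hypothetical before/after pair for two classic optimizations.
final class ClassicOptimizationsSketch {
    // As written: scale * offset appears twice and is recomputed on
    // every iteration, although it never changes inside the loop.
    static int[] naive(int[] data, int scale, int offset) {
        int[] out = new int[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = data[i] * (scale * offset) + (scale * offset);
        }
        return out;
    }

    // Transformed by hand: the repeated subexpression is computed once
    // (common subexpression elimination) and moved out of the loop
    // (loop invariant hoisting).
    static int[] optimizedByHand(int[] data, int scale, int offset) {
        int n = data.length;
        int[] out = new int[n];
        int k = scale * offset;
        for (int i = 0; i < n; i++) {
            // i always lies in [0, n), so a compiler that can prove this
            // may also be able to drop the per-access range check.
            out[i] = data[i] * k + k;
        }
        return out;
    }
}
```

Both methods return identical arrays for any input, which is the property an optimizer has to preserve.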
7.6 Summary
In summary, the effects of the Java HotSpot adaptive optimizer are as follows:
• In general, programs start faster, because much less code is compiled up front than with a JIT compiler.
• Compilation is spread out over time, so compilation pauses are shorter and go unnoticed by the user.
• Compiling only performance-critical code "buys time" that can be used for better optimization.
• Less memory is required, because less code is compiled.
• Waiting longer before compiling code allows information to be gathered for better optimizations such as inlining, a technique with far-reaching significance.
• Important code runs faster, because performance-critical code is highly optimized.
7.7 Impact on Software Reusability
A major advantage of object-oriented programming languages is that they increase development productivity by providing powerful language mechanisms for software reuse. In practice, however, this reusability is rarely achieved: because heavy use of these mechanisms can significantly hurt performance, programmers must use them sparingly. A surprising side effect of Java HotSpot technology is that it greatly reduces this performance cost. We believe this will have an important impact on how object-oriented software is developed, allowing companies to use object-oriented reusability mechanisms fully without compromising the performance of their software.
Examples of this effect are easy to find. A survey of programmers using the Java programming language would clearly show that many avoid fully "virtual" methods and write larger methods instead, because they believe that every virtual method call carries a performance penalty. Yet fine-grained use of "virtual" methods (that is, methods that are not "static" or "final" in the Java programming language) is extremely important for building highly reusable classes, because each such method acts as a "hook" that allows new subclasses to modify the behavior of the superclass.
Because the Java HotSpot performance engine can automatically inline most virtual method calls, this performance penalty is greatly reduced, and in many cases eliminated entirely.
The importance of this effect can hardly be overstated. Because it changes so dramatically the trade-off between performance and the use of important reusability mechanisms, it has the potential to fundamentally change the way object-oriented code is written. In addition, as object-oriented programming matures, there is a clear trend toward finer-grained objects and finer-grained methods. Both trends point strongly toward an increased frequency of virtual method calls in future coding styles. As these more advanced coding styles become widespread, the advantages of Java HotSpot technology will become even more pronounced.
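The "hook" role of small virtual methods can be made concrete with a short example. The classes below are hypothetical and serve only to show the coding style under discussion: a final method that is assembled from small overridable methods, each of which a subclass may replace.

```java
// Hypothetical example of virtual methods used as reuse "hooks".
final class HookSketch {
    static class Report {
        // Fixed algorithm built from small overridable steps.
        final String render(String body) {
            return header() + body + footer();
        }

        // Hooks: neither static nor final, so subclasses can override them.
        String header() {
            return "[";
        }

        String footer() {
            return "]";
        }
    }

    // A subclass reuses render() unchanged and replaces one hook.
    static class TitledReport extends Report {
        @Override
        String header() {
            return "Title: [";
        }
    }
}
```

Every call to render performs two virtual calls, which is exactly the kind of fine-grained dispatch that programmers have tended to avoid for performance reasons.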
8. Java Native Interface (JNI) Support
The Java HotSpot performance engine supports native methods through the standard Java Native Interface (JNI). Native methods previously written to JNI are upward compatible in both source and binary form. The original native method interface is not supported (JNI was introduced in part because the old interface did not provide binary compatibility for native method DLLs).