Fundamentals of the Java Virtual Machine (JVM) structure, plus JAR packaging and the jar command in detail

zhaozj, 2021-02-16

Some time ago I ran a few small experiments on the performance of String versus StringBuffer and drew some conclusions. Judging from readers' reactions, the study did not achieve its purpose. Readers offered fair criticism and pointed out several flaws in the experiments. I then compared decompiled code repeatedly, but several readers were still not convinced, and the results of the last experiment did not fully agree with the conclusions drawn from decompilation. Because my comparison of the decompiled code was made essentially at the level of statements, it was indeed not very convincing. It did point me to my next step: study the JVM instructions and the JVM structure, so that the conclusions rest on a full understanding of the decompiled code and have a chance of convincing people. This article and some that follow record what I learned from studying the JVM specification. I hope they give us a shared basis for the next round of in-depth research. Enough chatter; on to the main text.

The object the JVM works on is the familiar class file, whose layout we also call the class file format. The JVM specification defines this compiled format in great detail (although it is not required to be an actual file). Here we cover only the broad picture and leave the details for another occasion. The class file format required by the JVM is a binary format independent of hardware and operating system. It precisely defines the representation of a class or interface, down to details such as byte order, which a platform-specific object file format usually takes for granted and leaves unstated.

The data types supported by the JVM are almost the same as those of the Java language. Note the word almost! Like the Java language, the JVM has primitive types and reference types. Their values can be stored in variables, passed as arguments, returned by methods, and operated upon. Why is this not exactly the same as in the Java language specification? Because the JVM has one primitive type that the Java language does not: the returnAddress type. It is used by the jsr, ret, and jsr_w instructions. Its values are pointers to the opcodes of JVM instructions, and a running program cannot modify them.

The boolean type also deserves mention. Although it is a fully independent type in the Java language, the JVM gives it only limited support: there are no instructions dedicated to boolean values, and boolean expressions in source code are compiled to operate on int values. The JVM does support boolean arrays directly. The newarray instruction can create a boolean array, and its elements are read and written with the byte-array instructions baload and bastore. (In Sun's JDK releases 1.0 and 1.1 and in the Java 2 SDK v1.2, boolean arrays are encoded as byte arrays, 8 bits per element.) The JVM uses 1 to represent true and 0 to represent false. Wherever the compiler maps boolean values in source code to JVM int values, it must use the same encoding.

The JVM specification also devotes a lot of space to floating-point data. I have not read that part closely; it mainly discusses the relationship between JVM floating point and IEEE 754.

One more point about types is type checking. The JVM expects nearly all type checking to be done before run time (usually by the compiler) rather than by the JVM itself.
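To make the boolean handling above concrete, here is a minimal Java sketch. The class name BooleanDemo and the method countTrue are made up for illustration. The instruction names in the comments describe what javac typically emits; running javap -c on the compiled class is the way to confirm this for a given compiler.

```java
// Hypothetical demo class; the names are illustrative only.
class BooleanDemo {
    static int countTrue(boolean[] flags) {
        int n = 0;
        for (boolean f : flags) {   // element reads typically compile to baload
            if (f) {                // tested as an int: nonzero means true
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        boolean[] flags = new boolean[3]; // typically compiles to: newarray boolean
        flags[0] = true;                  // stores the int constant 1 via bastore
        flags[2] = true;
        System.out.println(countTrue(flags)); // prints 2
    }
}
```

Nothing in the source code looks like int arithmetic, yet the compiled method body should contain no boolean-specific instructions at all.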

Primitive values need not be tagged or inspected at run time to determine their types, nor do they need to be distinguished from reference values. The distinction is made by the instruction set: different instructions operate on operands of specific types. For example, iadd, ladd, fadd, and dadd all add two numeric values and produce a numeric result, but each is specialized for one type: int, long, float, and double, respectively.

The JVM has explicit support for objects. An object is either a dynamically allocated class instance or an array. A value of reference type is a reference to an object and can be thought of as a pointer to it. An object may have many references to it, and objects are always operated on, passed, and tested through references. One thing to note about reference types is null: it initially has no run-time type, but it can be cast to any type, and the JVM does not mandate any concrete value for it.

With that out of the way, we come to the part I most wanted to understand when I studied the JVM, so perk up. The JVM defines several run-time data areas: the pc register, JVM stacks, the heap, the method area, the runtime constant pool, and native method stacks. By lifetime they fall into two groups. The first group lives as long as the JVM: the heap and the method area (from which the runtime constant pools are allocated). The second group lives as long as a thread: the pc register, the JVM stack, and the native method stack. Areas in the first group are created when the JVM starts and destroyed when it exits. Areas in the second group exist once per thread; they are created when the thread is created and destroyed when the thread exits.

Because the JVM can run many threads at once, each thread needs its own pc (program counter) register. At any point, each JVM thread is executing the code of a single method, the current method for that thread. If that method is not native, the pc register holds the address of the JVM instruction currently being executed. If it is native, the value of the pc register is undefined. The pc register is wide enough to hold a returnAddress or a native pointer on the specific platform.

Each JVM thread also has a private JVM stack, which stores frames (described below). It resembles the stack of a conventional language such as C: it holds local variables and partial results and plays a part in method invocation and return. Because the JVM stack is never manipulated directly except to push and pop frames, frames may be heap allocated. If a thread's computation requires a larger JVM stack than is permitted, the JVM throws StackOverflowError. If JVM stacks can be expanded dynamically and expansion is attempted but insufficient memory is available, or if there is not enough memory to create the initial JVM stack for a new thread, the JVM throws OutOfMemoryError.

The JVM has a single heap shared by all threads; all class instances and arrays are allocated from it. Objects stored in the heap are reclaimed by an automatic storage management system, the familiar garbage collector (GC). Objects are never explicitly deallocated. The JVM assumes no particular type of automatic storage management; the technique may be chosen according to the implementor's system requirements. If a computation requires more heap than the automatic storage management system can make available, the JVM throws OutOfMemoryError.
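The StackOverflowError case described above can be provoked with unbounded recursion. The following is only a sketch: StackDepthDemo and its methods are hypothetical names, and the depth reached depends on the JVM, its settings, and the size of each frame, so no particular number should be expected.

```java
// Hypothetical demo class; the names are illustrative only.
class StackDepthDemo {
    private static int depth = 0;

    // Each call pushes one more frame onto the calling thread's JVM stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the thread's JVM stack is exhausted and
    // returns how many frames were pushed before that happened.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The JVM stack could not hold another frame.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before overflow: " + measureDepth());
    }
}
```

Catching StackOverflowError here is for demonstration only; the thread keeps running because unwinding the recursion pops the frames again.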

The JVM has a single method area shared by all threads. It is analogous to the storage area for compiled code in a conventional language, or to the text segment of a UNIX process. It stores per-class structures such as the runtime constant pool, field and method data, and the code for methods and constructors, including the special methods used in class and instance initialization and in interface type initialization (these special methods will be covered later). Although the method area is logically part of the heap, a simple JVM implementation may choose neither to garbage collect nor to compact it (as I understand it, this means classes cannot be unloaded). The latest (second) edition of the JVM specification does not mandate the location of the method area or the policies for managing compiled code. If memory in the method area cannot satisfy an allocation request, the JVM throws OutOfMemoryError.

A runtime constant pool is the per-class or per-interface run-time representation of the constant pool table in a class file. It contains several kinds of constants, ranging from numeric literals known at compile time to method and field references that must be resolved at run time. The runtime constant pool serves a function similar to that of a symbol table in a conventional programming language, although it holds a wider range of data than a typical symbol table. Each runtime constant pool is allocated from the JVM's method area and is constructed when the JVM creates the class or interface. When creating a class or interface, if constructing the runtime constant pool requires more memory than the method area can make available, the JVM throws OutOfMemoryError. A more detailed explanation of how the runtime constant pool is built may follow later.

A JVM implementation may use conventional stacks (colloquially, C stacks) to support native methods (methods written in a language other than Java). Native method stacks may also be used by an interpreter for the JVM instruction set that is itself written in a language such as C. A JVM implementation that cannot load native methods and does not itself rely on conventional stacks need not supply native method stacks. If supplied, native method stacks are typically allocated per thread when each thread is created (as I understand it, for threads that need native methods). If a thread's computation requires a larger native method stack than is permitted, the JVM throws StackOverflowError. If native method stacks can be expanded dynamically and expansion is attempted but insufficient memory is available, or if there is not enough memory to create the initial native method stack for a new thread, the JVM throws OutOfMemoryError.

For the data areas above, the JVM specification allows each to be of fixed size or to expand and contract dynamically as the computation requires. If an area is of fixed size, the size may be chosen independently when the area is created. A JVM implementation may let the programmer or user control the initial size of JVM stacks and, in the dynamic case, the maximum and minimum sizes. The memory these areas occupy does not need to be contiguous.
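One way a programmer can influence the size of a thread's stack from Java code is sketched below. The Thread constructor that accepts a stackSize argument is part of the standard library, but the JVM is free to treat the value as a hint or to ignore it, so the depths reported will differ between platforms. The class name StackSizeHintDemo and the sizes used are assumptions made for the example.

```java
// Hypothetical demo class; the names and sizes are illustrative only.
class StackSizeHintDemo {
    // Runs unbounded recursion in a new thread created with the given
    // stack-size hint and returns the depth reached before overflow.
    static int depthWithStackHint(long stackSizeBytes) {
        final int[] depth = {0};
        Runnable body = new Runnable() {
            private void recurse() {
                depth[0]++;
                recurse();
            }

            @Override
            public void run() {
                try {
                    recurse();
                } catch (StackOverflowError e) {
                    // This thread's stack limit was reached.
                }
            }
        };
        // The last argument is a suggestion to the JVM, not a guarantee.
        Thread t = new Thread(null, body, "stack-size-demo", stackSizeBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return depth[0];
    }

    public static void main(String[] args) {
        System.out.println("512 KiB hint: " + depthWithStackHint(512L * 1024));
        System.out.println("8 MiB hint:   " + depthWithStackHint(8L * 1024 * 1024));
        // The heap limit of the running JVM, in bytes.
        System.out.println("max heap:     " + Runtime.getRuntime().maxMemory());
    }
}
```

On implementations that honor the hint, the larger stack usually allows a noticeably deeper recursion.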

Next comes a closer look at the frames held in the JVM stack, which reveals some of the internals of method execution. A frame is used to store data and partial results, and also to perform dynamic linking, return values for methods, and dispatch exceptions. A new frame is created each time a method is invoked and destroyed when the invocation completes, whether normally or abruptly. Frames are allocated from the JVM stack of the thread that creates them. Each frame has its own array of local variables, its own operand stack, and a reference to the runtime constant pool of the class of the current method. The sizes of the local variable array and the operand stack are determined at compile time and supplied along with the code for the method associated with the frame. The size of the frame data structure therefore depends only on the JVM implementation, and the memory for it can be allocated at the moment the method is invoked.

Only one frame, the frame for the executing method, is active at any point in a given thread. This frame is the current frame, its method is the current method, and the class in which the current method is defined is the current class. Operations on local variables and the operand stack are typically relative to the current frame. A frame ceases to be current if its method invokes another method or if its method completes. When a method is invoked, a new frame is created and becomes current when control transfers to the new method. On method return, the current frame passes back the result of its method invocation, if any, to the previous frame; the current frame is then discarded and the previous frame becomes the current one. Note that a frame created by a thread is local to that thread and cannot be referenced by any other thread.

Each frame contains an array of variables known as its local variables. A single local variable can hold a value of type boolean, byte, char, short, int, float, reference, or returnAddress; a pair of local variables can hold a value of type long or double. Local variables are addressed by index, and the index of the first local variable is 0. An integer is a valid index into the local variable array if and only if it is between 0 and one less than the size of the array. A value of type long or double occupies two consecutive local variables and may be addressed only by the lesser index. For example, a double stored at index n actually occupies the local variables with indices n and n+1. The local variable at index n+1 cannot be loaded from; it can be stored into, but doing so invalidates the contents of local variable n. The JVM does not require n to be even, which means that long and double values need not be 64-bit aligned in the local variable array, and an implementation is free to decide how to represent such values using the two local variables reserved for them.

The JVM uses local variables to pass parameters on method invocation. On a class method (static method) invocation, the parameters are passed in consecutive local variables starting from local variable 0. On an instance method invocation, the parameters are likewise consecutive but start from local variable 1; local variable 0 holds a reference to the object on which the instance method is invoked (this).

Each frame contains a last-in-first-out (LIFO) stack known as its operand stack. The operand stack is empty when the frame is created. The JVM supplies instructions that load constants, or values from local variables or fields, onto the operand stack. Other instructions take operands from the operand stack, operate on them, and push the result back. The operand stack is also used to prepare parameters to be passed to methods and to receive method results. For example, the iadd instruction adds two int values. It requires that the two values be on top of the operand stack, pushed there by earlier instructions; it pops both, adds them, and pushes their sum back onto the operand stack. Subcomputations may be nested on the operand stack, producing values that the enclosing computation can use.

Each entry on the operand stack can hold a value of any JVM type, including long and double. Values on the operand stack must be operated on in ways appropriate to their types. It is not possible, for example, to push two int values and then treat them as a long, or to push two float values and then add them with iadd (whose operands are two int values).
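The interplay of local variables and the operand stack can be imitated in a few lines of Java. This is a toy model under stated assumptions, not how any real JVM is implemented: the class MiniFrame and its methods are invented for illustration, only int values are modeled, and the instruction sequence iload_0, iload_1, iadd, ireturn is what javac typically emits for a static method whose body is return a + b.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of a frame; hypothetical and limited to int values.
class MiniFrame {
    final int[] locals;                                    // local variable array
    final Deque<Integer> operands = new ArrayDeque<>();    // operand stack, empty at creation

    MiniFrame(int maxLocals) {
        locals = new int[maxLocals];
    }

    void iload(int index) {          // push a local variable onto the operand stack
        operands.push(locals[index]);
    }

    void iadd() {                    // pop two ints, push their sum
        int right = operands.pop();
        int left = operands.pop();
        operands.push(left + right);
    }

    int ireturn() {                  // pop the result that goes back to the caller
        return operands.pop();
    }

    // Models: static int add(int a, int b) { return a + b; }
    static int invokeAdd(int a, int b) {
        MiniFrame frame = new MiniFrame(2);
        frame.locals[0] = a;         // static method: parameters start at slot 0
        frame.locals[1] = b;
        frame.iload(0);
        frame.iload(1);
        frame.iadd();
        return frame.ireturn();
    }

    public static void main(String[] args) {
        System.out.println(invokeAdd(2, 3)); // prints 5
    }
}
```

For an instance method the same model would reserve slot 0 for the receiver and place a and b in slots 1 and 2.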

A small number of JVM instructions (such as dup and swap) operate on run-time data areas as raw values without regard to their types; these instructions are defined so that they cannot be used to modify or break up individual values. These restrictions are enforced by class file verification. At any point in time an operand stack has an associated depth, where a long or double value contributes two units and a value of any other type contributes one.

Each frame contains a reference to the runtime constant pool for the type of the current method, to support dynamic linking of the method code. The class file code for a method refers to methods to be invoked and variables to be accessed via symbolic references. Dynamic linking translates these symbolic method references into concrete method references, loading classes as necessary to resolve symbols that are not yet defined, and translates variable accesses into appropriate offsets in the storage structures associated with the run-time location of those variables. This late binding of methods and variables makes changes in other classes that a method uses less likely to break this code.

A method invocation completes normally if it does not cause an exception to be thrown, whether directly by the JVM or by an explicit throw in the code. If the invocation of the current method completes normally, a value may be returned to the invoking method. In that case the current frame is used to restore the state of the invoker, including its local variables and operand stack, with the invoker's program counter incremented appropriately to skip past the method invocation instruction. Execution then continues normally in the invoker's frame, with the returned value, if any, pushed onto that frame's operand stack.

If execution of a JVM instruction within the method causes the JVM to throw an exception, and that exception is not handled within the method, the method invocation completes abruptly. Executing an athrow instruction also causes an exception to be thrown explicitly, and if that exception is not caught by the current method, the invocation likewise completes abruptly. A method invocation that completes abruptly never returns a value to its invoker.

A frame may be extended with additional implementation-specific information, such as debugging information.
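The difference between normal and abrupt completion can be observed from the caller's side. In the sketch below, CompletionDemo, length, and describe are hypothetical names chosen for the example; the only JVM-level fact it relies on is that a Java throw statement compiles to athrow.

```java
// Hypothetical demo class; the names are illustrative only.
class CompletionDemo {
    static int length(String s) {
        if (s == null) {
            // Compiles to athrow; not caught here, so the invocation completes abruptly.
            throw new IllegalArgumentException("null input");
        }
        return s.length(); // normal completion: the value goes back to the invoker
    }

    static String describe(String s) {
        int result = -1;
        try {
            result = length(s);
        } catch (IllegalArgumentException e) {
            // No value was returned, so result was never assigned.
            return "abrupt: result still " + result;
        }
        return "normal: " + result;
    }

    public static void main(String[] args) {
        System.out.println(describe("abc")); // prints "normal: 3"
        System.out.println(describe(null));  // prints "abrupt: result still -1"
    }
}
```

In the abrupt case the assignment to result never happens, which is the source-level view of the rule that an abruptly completed invocation returns no value.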