Http://blog.9 Press .NET / Chen Sheng 913 / Archive / 2004/08/24 / 83789.aspx http://blog.9 Publishing House .NET / Chen Sheng 913 / Archive / 2004/08/24/83794 .aspx
1. Does Java have an operator like sizeof() in C?

The superficial answer is that Java provides nothing like C's sizeof(). However, it is worth asking why a Java programmer might want it. A C programmer manages most data structure memory allocations himself, and sizeof() is indispensable for knowing the size of the memory blocks to allocate. In addition, C memory allocators such as malloc() do almost nothing as far as object initialization is concerned: the programmer must set every object field that is a pointer to a further object. But when all is said and coded, C/C++ memory allocation is quite efficient.

By contrast, Java object allocation and construction are tied together (it is impossible to use an object instance that has been allocated but not initialized). If a Java class defines fields that are references to further objects, those fields are also commonly set at construction time. Allocating a Java object therefore frequently allocates many interconnected object instances: an object graph. Coupled with automatic garbage collection, this is all too convenient and can make you feel that you never have to worry about the details of Java memory allocation.

Of course, this holds only for simple Java applications. Compared with C/C++, equivalent Java data structures tend to occupy more physical memory. In enterprise software development, getting close to the maximum virtual memory available on today's 32-bit JVMs is a common scalability constraint. A Java programmer could therefore benefit from sizeof() or something similar to check whether his data structures are getting too large or contain memory bottlenecks. Fortunately, Java reflection makes such a tool quite easy to write. Before proceeding, I will dispose of several frequent but incorrect answers to this question.
Misconception 1: sizeof() is not needed because the sizes of Java's basic types are fixed. Yes, a Java int is 32 bits in all JVMs and on all platforms, but this is only a language specification requirement on the width of the data type as the programmer perceives it. Such an int is essentially an abstract data type and can be backed by, say, a 64-bit physical memory word on a 64-bit machine. The same goes for nonprimitive types: the Java language specification says nothing about how class fields should be aligned in physical memory, or that an array of booleans could not be implemented as a compact bit vector inside the JVM.

Misconception 2: You can measure an object's size by serializing it into a byte stream and looking at the length of the resulting stream. This does not work because the serialized layout is only a remote reflection of the true in-memory layout. One easy way to see this is to look at how Strings are serialized: in memory every char takes at least 2 bytes, but in serialized form Strings are UTF-8 encoded, so any ASCII content takes half as much space.

Another approach that does work: you might recall "Java Tip 130: Do You Know Your Data Size?", which described a technique based on creating a large number of identical class instances and carefully measuring the resulting increase in the JVM's used heap size. When applicable, this idea works very well, and I in fact use it in this article to bootstrap the alternative approach. Note that Java Tip 130's Sizeof class requires a quiescent JVM (so that heap activity is due only to object allocations and garbage collections requested by the measuring thread) and a large number of identical object instances. It does not work when you want to size a single large object (perhaps as part of debug trace output), and especially not when you want to examine what actually made it so large.

2. What is an object's size?

The discussion above highlights a philosophical question: given that you usually deal with object graphs, what is the definition of an object's size? Is it just the size of the object instance you are examining, or the size of the entire data graph rooted at that instance? The latter is what usually matters more in practice.
As you will see, things are not always so clear-cut, but for starters you can follow this approach:

• An object instance can be (approximately) sized by totaling all of its nonstatic data fields (including fields defined in superclasses)
• Unlike in, say, C++, class methods and their virtuality have no impact on the object size
• Class superinterfaces have no impact on the object size (see the note at the end of this list)
• The full object size can be obtained as a closure over the entire object graph rooted at the starting object

Note: Implementing a Java interface merely marks the class in question and does not add any data to its definition. In fact, the JVM does not even verify that an interface implementation provides all the methods required by the interface: in the current specifications, this is strictly the compiler's responsibility.

To bootstrap the process, for primitive data types I use the physical sizes measured by Java Tip 130's Sizeof class. As it turns out, on common 32-bit JVMs a plain java.lang.Object takes up 8 bytes, and the basic data types usually have the smallest physical size that can accommodate the language requirements (except that a boolean takes up a whole byte):

// java.lang.Object shell size in bytes:
public static final int OBJECT_SHELL_SIZE = 8;
// Size of an object reference in bytes:
public static final int OBJREF_SIZE = 4;
// Sizes of primitive fields in bytes:
public static final int LONG_FIELD_SIZE = 8;
public static final int INT_FIELD_SIZE = 4;
public static final int SHORT_FIELD_SIZE = 2;
public static final int CHAR_FIELD_SIZE = 2;
public static final int BYTE_FIELD_SIZE = 1;
public static final int BOOLEAN_FIELD_SIZE = 1;
public static final int DOUBLE_FIELD_SIZE = 8;
public static final int FLOAT_FIELD_SIZE = 4;
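The field-totaling rule described above can be sketched with plain reflection. This is only an illustrative sketch, not the ObjectProfiler code: the class name ShallowSizeSketch, its helper methods, and the Base and Point sample classes are hypothetical, and the sizes are the assumed 32-bit figures, which would have to be re-measured on any other JVM. Alignment and padding are ignored.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class ShallowSizeSketch {
    // Assumed 32-bit JVM sizes in bytes; measure these for your own JVM.
    static final int OBJECT_SHELL_SIZE = 8;
    static final int OBJREF_SIZE = 4;

    // Size of one field slot, chosen by the field's declared type.
    static int fieldSize(Class<?> type) {
        if (!type.isPrimitive()) return OBJREF_SIZE;
        if (type == long.class || type == double.class) return 8;
        if (type == int.class || type == float.class) return 4;
        if (type == short.class || type == char.class) return 2;
        return 1; // byte and boolean
    }

    // Object shell plus all nonstatic fields, walking up the superclass chain.
    static int shallowSize(Class<?> cls) {
        int size = OBJECT_SHELL_SIZE;
        for (Class<?> c = cls; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers())) {
                    size += fieldSize(f.getType());
                }
            }
        }
        return size;
    }

    // Hypothetical sample classes used only for this illustration.
    static class Base { long id; static int counter; }
    static class Point extends Base { int x; char tag; boolean visible; Object label; }

    public static void main(String[] args) {
        // 8 (shell) + 8 (long) + 4 (int) + 2 (char) + 1 (boolean) + 4 (ref) = 27
        System.out.println(shallowSize(Point.class));
    }
}
```

The static field counter is skipped, and the inherited field id is counted, which matches the first rule in the list above. Methods and interfaces never enter the calculation.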
(It is important to realize that these constants are not fixed forever: they must be measured independently for a given JVM.) Of course, naively totaling object field sizes neglects memory alignment in the JVM. Memory alignment does matter (as shown, for example, for primitive array types in Java Tip 130), but I think it is unprofitable to chase such low-level details. Not only do they depend on the JVM vendor, they are also not under the programmer's control. Our goal is to obtain a good estimate of the object's size and, hopefully, a clue as to when a class field might be redundant, when a field should be lazily populated, or when a more compact nested data structure is needed. For absolute physical precision you can always go back to the Sizeof class in Java Tip 130.

To help profile what makes up an object instance, our tool does not just compute the size but also builds a useful data structure as a byproduct: a graph made up of IObjectProfileNodes:
interface IObjectProfileNode
{
    Object object ();
    String name ();
    int size ();
    int refcount ();
    IObjectProfileNode parent ();
    IObjectProfileNode [] children ();
    IObjectProfileNode shell ();
    IObjectProfileNode [] path ();
    IObjectProfileNode root ();
    int pathlength ();
    boolean traverse (INodeFilter filter, INodeVisitor visitor);
    String dump ();
} // end of interface
IObjectProfileNodes are interconnected in almost exactly the same way as the original object graph, with IObjectProfileNode.object() returning the actual object each node represents. IObjectProfileNode.size() returns the total size (in bytes) of the object subtree rooted at that node's object instance. If an object instance links to other objects through non-null instance fields or through references contained in array fields, IObjectProfileNode.children() is the corresponding list of child graph nodes, sorted in decreasing size order. Conversely, for every node other than the starting one, IObjectProfileNode.parent() returns its parent node. The entire collection of IObjectProfileNodes thus slices up the original object and shows how data storage is partitioned within it. Furthermore, the graph node names are derived from the class fields, and examining a node's path within the graph (IObjectProfileNode.path()) lets you trace the ownership links from the original object instance to any internal piece of data.

You may have noticed that the description so far is still somewhat ambiguous. If, while traversing the object graph, you encounter the same object instance more than once (that is, more than one field somewhere in the graph points to it), how do you assign its ownership (the parent pointer)? Consider the following code snippet:
Object obj = new String [] {new String ("JavaWorld"),
                            new String ("JavaWorld")};
Each java.lang.String instance has an internal field of type char[] that holds the actual string content. Because of the way the String copy constructor works in Java 2 Platform, Standard Edition (J2SE) 1.4, both String instances in the array above share the same char[] array containing the character sequence {'J', 'a', 'v', 'a', 'W', 'o', 'r', 'l', 'd'}. Both strings own this array equally, so what should you do in a case like this?

If I always want to assign a single parent to a graph node, this problem has no universally perfect answer. In practice, however, many such object instances can be traced back to a single "natural" parent. Such a natural sequence of links is usually shorter than the other, more circuitous routes. Think of data pointed to by an instance field as belonging more to that instance than to anything else. Think of the entries in an array as belonging more to that array. Therefore, if an internal object instance can be reached by several paths, we choose the shortest one. If several paths have equal length, we simply pick the first one discovered. In the worst case, this is as good a generic strategy as any.

Thinking about graph traversals and shortest paths should ring a bell at this point: breadth-first search is a graph traversal algorithm that is guaranteed to find the shortest path from the starting node to any other reachable graph node.

After all these preliminaries, here is a textbook implementation of such a graph traversal:
public static IObjectProfileNode profile (Object obj)
{
    // Identity-based map: objects are marked as visited by reference, not by equals().
    final IdentityHashMap visited = new IdentityHashMap ();
    final ObjectProfileNode root = createProfileTree (obj, visited,
                                                      CLASS_METADATA_CACHE);
    finishProfileTree (root);
    return root;
}
private static ObjectProfileNode createProfileTree (Object obj,
                                                    IdentityHashMap visited,
                                                    Map metadataMap)
{
    final ObjectProfileNode root = new ObjectProfileNode (null, obj, null);
    final LinkedList queue = new LinkedList ();
    queue.addFirst (root);
    visited.put (obj, root);
    final ClassAccessPrivilegedAction caAction =
        new ClassAccessPrivilegedAction ();
    final FieldAccessPrivilegedAction faAction =
        new FieldAccessPrivilegedAction ();
    // Breadth-first traversal: nodes are taken from the front of the queue
    // and newly discovered children are appended at the back.
    while (! queue.isEmpty ())
    {
        final ObjectProfileNode node = (ObjectProfileNode) queue.removeFirst ();
        obj = node.m_obj;
        final Class objClass = obj.getClass ();
        if (objClass.isArray ())
        {
            final int arrayLength = Array.getLength (obj);
            final Class componentType = objClass.getComponentType ();
            // Add shell pseudo-node:
            final AbstractShellProfileNode shell =
                new ArrayShellProfileNode (node, objClass, arrayLength);
            shell.m_size = sizeofArrayShell (arrayLength, componentType);
            node.m_shell = shell;
            node.addFieldRef (shell);
            if (! componentType.isPrimitive ())
            {
                // Traverse each array slot:
                for (int i = 0; i < arrayLength; ++ i)
                {
                    final Object ref = Array.get (obj, i);
                    if (ref != null)
                    {
                        ObjectProfileNode child =
                            (ObjectProfileNode) visited.get (ref);
                        if (child != null)
                            ++ child.m_refcount; // already owned by an earlier node
                        else
                        {
                            child = new ObjectProfileNode (node, ref,
                                new ArrayIndexLink (node.m_link, i));
                            node.addFieldRef (child);
                            queue.addLast (child);
                            visited.put (ref, child);
                        }
                    }
                }
            }
        }
        else // the object is of a non-array type
        {
            final ClassMetadata metadata =
                getClassMetadata (objClass, metadataMap, caAction, faAction);
            final Field [] fields = metadata.m_refFields;
            // Add shell pseudo-node:
            final AbstractShellProfileNode shell =
                new ObjectShellProfileNode (node,
                                            metadata.m_primitiveFieldCount,
                                            metadata.m_refFields.length);
            shell.m_size = metadata.m_shellSize;
            node.m_shell = shell;
            node.addFieldRef (shell);
            // Traverse all non-null ref fields:
            for (int f = 0, fLimit = fields.length; f < fLimit; ++ f)
            {
                final Field field = fields [f];
                final Object ref;
                try // to get the field value:
                {
                    ref = field.get (obj);
                }
                catch (Exception e)
                {
                    throw new RuntimeException ("cannot get field [" +
                        field.getName () + "] of class [" +
                        field.getDeclaringClass ().getName () +
                        "]: " + e.toString ());
                }
                if (ref != null)
                {
                    ObjectProfileNode child =
                        (ObjectProfileNode) visited.get (ref);
                    if (child != null)
                        ++ child.m_refcount; // already owned by an earlier node
                    else
                    {
                        child = new ObjectProfileNode (node, ref,
                            new ClassFieldLink (field));
                        node.addFieldRef (child);
                        queue.addLast (child);
                        visited.put (ref, child);
                    }
                }
            }
        }
    }
    return root;
}

private static void finishProfileTree (ObjectProfileNode node)
{
    final LinkedList queue = new LinkedList ();
    IObjectProfileNode lastFinished = null;
    while (node != null)
    {
        // Note that an unfinished nonshell node has its child count
        // in m_size and m_children [0] is its shell node:
        if ((node.m_size == 1) || (lastFinished == node.m_children [1]))
        {
            node.finish ();
            lastFinished = node;
        }
        else
        {
            queue.addFirst (node);
            for (int i = 1; i < node.m_size; ++ i)
            {
                final IObjectProfileNode child = node.m_children [i];
                queue.addFirst (child);
            }
        }
        if (queue.isEmpty ())
            return;
        else
            node = (ObjectProfileNode) queue.removeFirst ();
    }
}

This code is a distant relative of the code in an earlier article, "Attack of the Clones." As before, it caches reflection metadata to improve performance and uses an identity hash map to mark visited objects. The profile() method starts by spanning the original object graph with a tree of IObjectProfileNodes in a breadth-first traversal, and finishes with a quick post-order traversal that totals and assigns all node sizes. profile() returns an IObjectProfileNode that is the root of the generated spanning tree, and its size() is the size of the entire graph.

Of course, the output of profile() is useful only if I have a good way to explore it. To this end, every IObjectProfileNode supports examination by a node visitor working together with a node filter:

interface IObjectProfileNode
{
    interface INodeFilter
    {
        boolean accept (IObjectProfileNode node);
    } // end of nested interface

    interface INodeVisitor
    {
        /**
         * Pre-order visit.
         */
        void previsit (IObjectProfileNode node);

        /**
         * Post-order visit.
         */
        void postvisit (IObjectProfileNode node);
    } // end of nested interface

    boolean traverse (INodeFilter filter, INodeVisitor visitor);
    ...
} // end of interface

A node visitor acts on a tree node only if the accompanying filter is null or if the filter accepts that node. For simplicity, a node's children are examined only if the node itself was examined. Both pre-order and post-order visits are supported.
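The filter and visitor rules just described can be illustrated with a small stand-alone tree. This is a hedged sketch and not the ObjectProfiler implementation: the class TraverseSketch, its Node type, and the helper methods visitOrder() and sample() are hypothetical names invented for this example, and only the traversal semantics (null filter accepts everything, rejected nodes hide their subtrees, pre-order and post-order callbacks) follow the text.

```java
import java.util.ArrayList;
import java.util.List;

class TraverseSketch {
    interface NodeFilter { boolean accept(Node node); }
    interface NodeVisitor { void previsit(Node node); void postvisit(Node node); }

    // Hypothetical minimal tree node; the real nodes also carry sizes and links.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
        Node add(Node child) { children.add(child); return this; }

        // Returns false if the filter rejected this node (its subtree is then skipped).
        boolean traverse(NodeFilter filter, NodeVisitor visitor) {
            if (filter != null && !filter.accept(this)) return false;
            visitor.previsit(this);
            for (Node child : children) child.traverse(filter, visitor);
            visitor.postvisit(this);
            return true;
        }
    }

    // Records the order in which the visitor callbacks fire.
    static List<String> visitOrder(Node root, NodeFilter filter) {
        final List<String> log = new ArrayList<>();
        root.traverse(filter, new NodeVisitor() {
            public void previsit(Node n) { log.add("pre:" + n.name); }
            public void postvisit(Node n) { log.add("post:" + n.name); }
        });
        return log;
    }

    static Node sample() {
        return new Node("root")
            .add(new Node("big").add(new Node("leaf")))
            .add(new Node("small").add(new Node("tiny")));
    }

    public static void main(String[] args) {
        // "small" is rejected, so "tiny" is never examined even though it would pass.
        System.out.println(visitOrder(sample(), n -> !n.name.equals("small")));
        // prints [pre:root, pre:big, pre:leaf, post:leaf, post:big, post:root]
    }
}
```

Pruning a subtree as soon as its root is rejected keeps filters cheap on large trees, at the cost of never reporting an interesting node that sits under an uninteresting one.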
The size contributions of the java.lang.Object shell and all primitive data fields are lumped together in a pseudo-node that is attached to every "real" node representing an object instance. Such shell nodes are accessible through IObjectProfileNode.shell() and also show up in the IObjectProfileNode.children() list: the idea is to let you write data filters and visitors that treat primitive data on an equal footing with instantiable data types.

How you implement filters and visitors is up to you. As a starting point, the ObjectProfileFilters class offers several useful stock filters that help prune large object trees based on node size, node size relative to the parent's size, node size relative to the root object, and so on.

The ObjectProfilerVisitors class contains the default visitor used by IObjectProfileNode.dump(), as well as a visitor that can create an XML dump for more advanced object browsing. It is also easy to convert a profile into a Swing TreeModel.

As an illustration, let's create a full dump of the two-string array mentioned above:

public class Main
{
    public static void main (String [] args)
    {
        Object obj = new String [] {new String ("JavaWorld"),
                                    new String ("JavaWorld")};
        IObjectProfileNode profile = ObjectProfiler.profile (obj);
        System.out.println ("obj size = " + profile.size () + " bytes");
        System.out.println (profile.dump ());
    }
} // end of class

This code produces the following output:

obj size = 106 bytes
  106 -> : String[]
    58 (54.7%) -> [0] : String
      34 (32.1%) -> String#value : char[], refcount=2
        34 (32.1%) ->
      24 (22.6%) ->
    24 (22.6%) ->
    24 (22.6%) -> [1] : String
      24 (22.6%) ->

Indeed, as explained earlier, the internal character array (accessed as java.lang.String#value) is shared by both strings. Even though ObjectProfiler.profile() assigns ownership of this array to the first string discovered, it still notices that the array is shared (as shown by the refcount=2 next to it).
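The refcount=2 entry depends on tracking objects by identity rather than by equals(). The following sketch shows the difference on a toy array; the class IdentityVisitSketch and its countRefs() helper are hypothetical and stand in for the profiler's visited map only in spirit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class IdentityVisitSketch {
    // Counts how many array slots refer to each key, using whatever
    // notion of key equality the supplied map implements.
    static Map<Object, Integer> countRefs(Object[] slots, Map<Object, Integer> counts) {
        for (Object ref : slots) {
            if (ref != null) counts.merge(ref, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> shared = new ArrayList<>();
        shared.add("JavaWorld");
        List<String> equalCopy = new ArrayList<>(shared); // equal contents, distinct instance

        Object[] slots = {shared, shared, equalCopy};

        Map<Object, Integer> byIdentity = countRefs(slots, new IdentityHashMap<>());
        Map<Object, Integer> byEquals = countRefs(slots, new HashMap<>());

        // Identity: two distinct instances, and the shared one is referenced twice.
        System.out.println(byIdentity.size() + " instances, shared refcount = " + byIdentity.get(shared));
        // equals(): both lists collapse into a single key, which would undercount memory.
        System.out.println(byEquals.size() + " key under equals()");
    }
}
```

With an equals()-based map, two separate but equal objects would be sized once, and a genuinely shared object could not be told apart from a coincidentally equal one. An identity map avoids both problems.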
Simply sizeof()

ObjectProfiler.profile() creates a node graph that is typically several times the size of the original object graph. If you need only the root object's size, you can use the faster and more efficient method ObjectProfiler.sizeof(), which is implemented as a nonrecursive depth-first traversal.

More examples

Let's apply the profile() and sizeof() machinery to a couple of examples.

Java Strings are notorious memory wasters because they are so ubiquitous and because common string usage patterns can be quite inefficient. I am sure you know that ordinary string concatenation operators typically produce noncompact Strings. The following code:

String obj = "java" + new String ("world");

produces the following profile:

obj size = 80 bytes
  80 -> : String
    56 (70%) -> String#value : char[]
      56 (70%) ->
    24 (30%) ->

The value character array has 20 chars even though it needs only 9. Compare that with the result of "java".concat ("world") or String obj = new String ("java" + new String ("world")):

obj size = 58 bytes
  58 -> : String
    34 (58.6%) -> String#value : char[]
      34 (58.6%) ->
    24 (41.4%) ->

Clearly, if you allocate many objects with string properties constructed by concatenation operators or by StringBuffer.toString() (the two cases are actually closely related), you can improve memory consumption by switching to concat() or the String copy constructor.

To push the point further, here is a slightly frivolous example. This visitor/filter examines an object and reports all the noncompact Strings inside it:

class StringInspector implements IObjectProfileNode.INodeFilter,
                                 IObjectProfileNode.INodeVisitor
{
    public boolean accept (IObjectProfileNode node)
    {
        m_node = null;
        final Object obj = node.object ();
        if ((obj != null) && (node.parent () != null))
        {
            final Object parentobj = node.parent ().object ();
            // A char[] owned by a String: compare its capacity with the string length.
            if ((obj.getClass () == char [].class) &&
                (parentobj.getClass () == String.class))
            {
                int wasted = ((char []) obj).length -
                             ((String) parentobj).length ();
                if (wasted > 0)
                {
                    m_node = node.parent ();
                    m_wasted += m_nodeWasted = wasted;
                }
            }
        }
        return true;
    }

    public void previsit (IObjectProfileNode node)
    {
        if (m_node != null)
            System.out.println (ObjectProfiler.pathName (m_node.path ()) +
                ": " + m_nodeWasted + " bytes wasted");
    }

    public void postvisit (IObjectProfileNode node)
    {
        // do nothing
    }

    int wasted ()
    {
        return 2 * m_wasted; // 2 bytes per wasted char
    }

    private IObjectProfileNode m_node;
    private int m_nodeWasted, m_wasted;
}; // end of local class

IObjectProfileNode profile = ObjectProfiler.profile (obj);
StringInspector si = new StringInspector ();
profile.traverse (si, si);
System.out.println ("wasted " + si.wasted () + " bytes (out of " +
    profile.size () + ")");

For an example of using sizeof(), let's look at LinkedList versus ArrayList. This code populates a list with 1,000 null references:

List obj = new LinkedList (); // or ArrayList
for (int i = 0; i < 1000; ++ i) obj.add (null);
IObjectProfileNode profile = ObjectProfiler.profile (obj);
System.out.println ("obj size = " + profile.size () + " bytes");

The size of the resulting structure is the storage overhead of the list implementation. For LinkedList and ArrayList, sizeof() reports 20,040 and 4,112 bytes respectively. Even though ArrayList grows its internal capacity ahead of its size (which can leave almost 50% of the capacity wasted; this is done to make the amortized cost of insertion constant), its array-based design is far more memory efficient than LinkedList's doubly linked list implementation, which creates a 20-byte node to store every value. (This is not to say that you should never use LinkedList: among other things, it guarantees unamortized constant-time insertion.)

Limitations

ObjectProfiler's approach is not perfect.
Besides the problem of ignoring memory alignment, explained earlier, another obvious problem is that Java object instances can share nonstatic data, for example when instance fields point to global singletons and other shared content.

Take DecimalFormat.getPercentInstance() as an example. Although it returns a new NumberFormat instance every time, all of those instances usually share the Locale.getDefault() singleton. So even though sizeof(DecimalFormat.getPercentInstance()) reports 1,111 bytes every time, that is an overestimate. This is really just another manifestation of the conceptual difficulty of defining a size measure for a Java object. In this case, ObjectProfiler.sizedelta(Object base, Object obj) can come in handy: this method traverses the object graph rooted at base and then profiles obj using the objects visited during the first traversal. The result is thus effectively the total size of the data owned by obj that does not appear to be owned by base. In other words, it is the amount of memory needed to instantiate obj given that base already exists (shared objects are effectively subtracted out).

sizedelta(DecimalFormat.getPercentInstance(), DecimalFormat.getPercentInstance()) reports that every subsequent format instantiation requires 741 bytes. That is within a few bytes of the more precise 752 bytes measured by Java Tip 130's Sizeof class, and much better than the original sizeof() estimate.

Another type of data that ObjectProfiler cannot see is native memory allocation. The result of java.nio.ByteBuffer.allocate(1000) is a 1,050-byte structure allocated on the JVM heap, but ByteBuffer.allocateDirect(1000) appears to take only 140 bytes, because the real storage is allocated in native memory. At this point you must give up pure Java and switch to a profiler based on the JVM Profiler Interface (JVMPI).
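The difference between the two buffer kinds can be observed from pure Java without measuring anything. The sketch below uses only standard java.nio.ByteBuffer methods; the class name BufferStorageSketch and the describe() helper are hypothetical names chosen for this example.

```java
import java.nio.ByteBuffer;

class BufferStorageSketch {
    // Summarizes where a buffer keeps its bytes, as far as the public API reveals it.
    static String describe(ByteBuffer buf) {
        return (buf.isDirect() ? "direct" : "heap")
            + ", capacity=" + buf.capacity()
            + ", hasArray=" + buf.hasArray();
    }

    public static void main(String[] args) {
        // Heap buffer: backed by a byte[] that a reflective traversal can reach.
        System.out.println(describe(ByteBuffer.allocate(1000)));
        // prints heap, capacity=1000, hasArray=true

        // Direct buffer: storage lives outside the Java heap, so no byte[] exists to count.
        System.out.println(describe(ByteBuffer.allocateDirect(1000)));
        // prints direct, capacity=1000, hasArray=false
    }
}
```

Both buffers report the same capacity, but only the heap buffer exposes a backing array. A reflection-based profiler can therefore account for the 1,000 bytes in the first case and only for the small Java-side wrapper in the second.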
A rather obscure example of the same problem is trying to size an instance of Throwable. ObjectProfiler.sizeof(new Throwable()) reports only 20 bytes, in stark contrast to the 272 bytes reported by Java Tip 130's Sizeof class. The reason is this hidden field in Throwable:

private transient Object backtrace;

The JVM treats this field in a special way: it does not show up in reflective calls, even though its definition can be seen in the JDK source files. Evidently, the JVM uses this object property to store some 250 bytes of native data that supports stack traces.
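You can check what reflection reports for Throwable on your own JVM with a few lines of code. This is a minimal sketch: the class ThrowableFieldsSketch and its instanceFieldNames() helper are hypothetical, and the exact list of fields printed depends on the JDK version and vendor, so no particular output is assumed here.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class ThrowableFieldsSketch {
    // Names of the nonstatic fields that reflection reports for a class.
    static List<String> instanceFieldNames(Class<?> cls) {
        List<String> names = new ArrayList<>();
        for (Field f : cls.getDeclaredFields()) {
            if (!Modifier.isStatic(f.getModifiers())) names.add(f.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = instanceFieldNames(Throwable.class);
        System.out.println("reflectively visible instance fields: " + names);
        // Whether "backtrace" appears is JVM-specific; the article reports that it does not.
        System.out.println("backtrace visible: " + names.contains("backtrace"));
    }
}
```

Any field missing from this list is invisible to a reflection-based profiler, so whatever the JVM stores through it cannot be included in the reported size.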