Performance Analysis and Test of Java List Objects

zhaozj2021-02-16  101

The SDK provides several implementations of an ordered collection interface java.util.List, which is mostly known for people, arraylist, and linkedList. The performance difference of these LIST classes is a problem that is often asked. In this article, I will explore is the performance difference between LinkedList and Vector / ArrayList. To comprehensively analyze the performance differences between these classes, we must know their implementation methods. Therefore, next, I first briefly introduce these types of implementation characteristics from the perspective of performance.

I. Realization of Vector and ArrayList VECTOR and ArrayList have a bottom Object [] array, this object [] array is used to save the element. When accessing the elements by indexing, simply access the internal array of elements by indexing:

Public Object Get (int index) {// First check if INDEX is legal ... This section does not display this part of the code ReturneElementData [index];

The internal array can be greater than the number of elements of the Vector / ArrayList object, the difference between the two as the remaining space to achieve fast add new elements. With the remaining space, the addition element is very simple, just save the new element to an empty position in the internal array, then add the index value for the new space:

Public Boolean Add (Object O) {ENSURECAPACITY (SIZE 1); // Description ElementData [Size ] = O; Return true; // list.add (object) return value}

Slight a bit slightly complicated in any specified location in the set of elements (not the end of the collection): All array elements above the insert point must move a location before, and then assign the value:

Public void add (int index, object element) {// First check if INDEX is legal ... This section does not display this section of code ENSURECAPACITY (Size 1); System.Arraycopy (ElementData, Index, ElementData, Index 1, SIZE - Index); ElementData [index] = Element; size ;}

When the remaining space is used, if you need to add more elements, the Vector / ArrayList object must replace its internal Object [] array with a larger new array, copy all array elements to a new array. According to the different SDK version, the new array is more than the original 50% or 100% (the code below is 100%):

public void ensureCapacity (int minCapacity) {int oldCapacity = elementData.length; if (minCapacity> oldCapacity) {Object oldData [] = elementData; int newCapacity = Math.max (oldCapacity * 2, minCapacity); elementData = new Object [newCapacity] System.ArrayCopy (OldData, 0, ElementData, 0, size);}}

The main difference between the VECTOR class and the ArrayList class is to synchronize. In addition to two methods for serialization, there is no ARRAYLIST method with the ability to perform synchronous execution; in contrast, most methods of the Vector have synchronous capabilities, or direct or indirect. Therefore, Vector is a thread safe, but ArrayList is not. This makes ArrayList to be fast than VECTOR. For some of the latest JVMs, the difference between the two classes can be ignored: strictly, for these JVM, the difference between the two classes is smaller than the time difference displayed by comparing these types of performance. By indexing access and updating elements, the implementation of Vector and ArrayList has excellent performance because there is no other overhead other than the range check. Unless the internal array spatial depletion must be expanded, otherwise add an element to the end of the list or delete an element from the end of the list, it has excellent performance. Inserting elements and deleting elements are always argument to copy (when the array must be extended first, you need to copy twice). The quantity of the replicated element is proportional to the scale, that is, the distance between the insert / delete point to the final index position in the set. For insertion operation, the performance is the worst when the element is inserted to the top (index 0), and when inserted into the last side (after the last existing element), the performance is best. As the assembly size increases, the overhead of array replication has increased rapidly because the number of elements that must be copied each time the number of elements is increased. Second, the implementation of LinkedList

LinkedList is implemented through a two-way link node list. To access the elements by an index, you must find all nodes until you find the target node:

Public Object GET (INTINDEX) {// First check if INDEX is legal ... This section does not display this part of the code entry e = header; // Start node //, look for forward or back, which one of the specific distances / / Near determination if (index index; i--) E = E.PREVIOS;} Return E;

Put the element insertion of the element: Find the node that specifies the index, then insert a new node before the node:

Public void add (int index, object element {// first check if INDEX is legal ... This section does not display this section Entry E = header; // starting node // Find before or afterwards, which one Direction distance // Nearly determine if (index index; I -) E = E.PREVIOS;} Entry, E.PREVIOUS; newentry.previous.next = newentry; newentry.next.previous = newentry; size ;}

Thread secure LinkedList and other collections If you want to get a thread secure LinkedList from Java SDK, you can use a synchronous package from Collections.SynchronizedList (List). However, using a synchronous package is equivalent to joining a indirect layer, it brings expensive performance cost. When the package passes the call to the encapsulated method, each method needs to increase an additional method to call, and the method of the synchronous package package is slower than the unpacking method. The overhead of this indirect call is not very prominent for complex operations like searching, but for relatively simple methods, such as access functions or update features, this overhead may have serious impact on performance. This means that the synchronous packaged LinkedList is a significant disadvantage in performance than VECTOR, as Vector does not need any additional indirect calls for thread safety. If you want to have a thread secure LinkedList, you can copy the LinkedList class and make several necessary methods to synchronize so you can get a faster implementation. This is also effective for all other collection classes: only List and Map have efficient thread security implementation (which is the Vector and HashTable class). Interestingly, the existence of these two efficient thread safety categories is just backward compatible, not for performance considerations. For the performance overhead implemented by index access and update elements, the performance overhead implemented by the LinkedList is slightly larger, as accessing any index requires spanning multiple nodes. In addition to the performance overhead across multiple nodes, there is another overhead in addition to the performance overhead across multiple nodes, that is, create the overhead of the node object. In terms of advantage, LinkedList implemented insertion and deletion operations There is no other overhead, so insert-deleting overhead is almost completely dependent on the insert - delete point is far from the end of the collection. Third, performance test

These classes have many different features that can be tested. The LinkedList application is more frequent because people think that it has good performance when randomly inserted and deleted operations. Therefore, the focus of the following analysis will be the performance of the insertion operation, that is, the construct collector. I tested and compared LinkedList and ArrayList, because both are asynchronous. The speed of the insertion operation is mainly determined by the size of the set and the position inserted. When the position of the insertion point is at both ends and intermediates of the collection, the worst insertion performance and the best insertion performance have the opportunity to appear. Therefore, I chose three insertions (beginning, end and middle), three typical sets: medium (100 elements), large (10,000 elements), super large (1,000,000 elements). In this paper, I use Sun JVM, Java SDK 1.2.0 and 1.3.0 series. In addition, I also tested with HotSpot JVM 2.0, which can be found in 1.3.0 SDK. In the table below, each measurement has been shown in the test time (a unit of 100% in the table) on the TDK 1.2 Vm (a unit in the table). The default JVM configuration is used during the test, which enables JIT compilation, so all JVMs, the heap space must be expanded to avoid internal assault errors. The time recorded in the table is the average time of multiple tests. In order to avoid the impact of garbage collection, I enforce a complete memory cleaning between each test (see the test source code to learn more). Disk monitoring ensures that disk paging will not appear during the test (any test, if it shows a serious disk paging operation, it is discarded). All speeds that show that the speed of the second-second response time is repeated until a significant reasonable time is recorded. Table 1: Construct a medium-size collection (100 elements). The numbers in parentheses are for a collection of predetermined sizes. 1.2 JVM1.3 JVMHOTSPOT 2.0 JVM is always inserted into the beginning of ArrayList 100% (48.0%) 184.9% (152.0%) 108.0% (66.7%) Always insert 135.5% 109.1% 85.3% in the beginning of LinkedList, Also Alit into ArrayList The intermediate 130.0% (40.6%) 187.4% (158.0%) 84.7% (46.0%) are always inserted into the middle of LinkedList 174.0% 135.0% 102.3%. The end of ArrayList is 63.3% (20.7%) 65.9% (25.0) %) 60.3% (29.3%) Always insert 106.7% 86.3% at the end of LinkedList, 80.3%

For the smaller set, ArrayList and LinkedList are very close. When the element is inserted into the end of the collection, the performance of ArrayList has a mutation when adding an element. However, additional elements are ARRAYLIST for its optimization: if you only want a fixed size static collection, the Java array (for example, Object [] has better performance than any set object. In addition to adding operation, the measured time data is not very large, which reflects the level of optimization of each JVM, not something else. For example, for the start of the set (the first two rows of Table 1), HotSpot 2.0 JVM plus LinkedList has the best performance (85.3%), which is 1.2 JVM plus ArrayList (100% ). These two results show that the simple JIT compiler in the 1.2 is highly efficient when performing simple operations such as an iterative and replication array. In the HOTSPOT, the complex JVM plus optimized compiler can improve the performance of complex operations, such as object creation (create a LinkedList node), and can utilize the advantages of code-inlining. 1.3 JVM's results seem to show that its performance has a lot of shortcomings in simple operations, which is likely to be improved in future JVM versions. Here I specially tested another advantage of ArrayList relative to LinkedList, that is, the ability to predetermine the set size. Specifically, when creating ArrayList, you can specify a specific size (for example, ArrayList can be created to have 100 elements), thereby avoiding all overheads with the increase of collections with elements. Table 1 The number in parentheses shows the level of predetermined overall performance performance. LINKEDLIST (until SDK 1.3) cannot be determined in advance. In addition, ArrayList only generates a small amount of objects that need to be garbage collection, that is, the additional internal array objects created when the element's internal array objects are used, and each ArrayList capacity needs to be extended. LINKEDLIST generates a node object for each insert operation regardless of any delete operations that may occur. Therefore, LINKEDLIST will bring considerable work to the garbage collector. Considering these factors, for any small-scale collection, I will choose to use ArrayList instead of LinkedList. Table 2: Constructing a large collection (10,000 elements) 1.2 JVM1.3 JVMHOTSPOT 2.0 JVM is always inserted into the beginning of ArrayList 7773% 7537% 7500% Always insert 100% 90.34% 65.6% of the LINKEDLIST is always inserted into arraylist The intermediate 3318% 3412% 3121% is always inserted into the middle of LinkedList 26264% 14315% 14209% Almost at the end of ArrayList 41.4% 41.2% 37.5% is always inserted into the end of LinkedList 66.4% 73.9% 61.7%

Table 2 shows the test results of large-scale collections. It can be seen that when we have a large-scale insertion operation, we have begun to encounter severe performance punishment. As the result of the achievement of our previous analysis, the worst case for LinkedList occurs when the element is inserted into the middle of the set. In addition, we can also see that inserting the elements into the middle of the integrated intermediate performance than using LinkedList while inserting the elements into the large-specific performance of the set. Compared with the worst cases of these two properties, the performance of the element inserted into the arraylist is obviously much better. In summary, ArrayList once again exhibits better performance, including inserting the element into a random position based on the index. If you always insert an element into a collection, LinkedList has better performance; however, you can use a reverse ArrayList to get better performance, that is, use a dedicated implementation, or The position of the [size -index] mapping flip index in the collection. Table 3: Constructing a large collection (1,000,000 elements) 1.2 JVM1.3 JVMHOTSPOT 2.0 JVM Always inserting the beginning of ArrayList is too long too long. Too long. 100% 179.5% 144.1% is always inserted into arraylist. The middle too long too long is always inserted into the middle of LinkedList. It is always inserted into the end of ArrayList. 38.3% 47.7% 42.9% is always inserted into the end of LinkedList 65.1% 161.5% 139.9%.

Table 3 shows the test results of the large collection, which can be obtained from the table 2 very similar to Table 2. However, Table 3 emphasizes that large collection requires data, collection type, and data processing algorithms just right between the data processing algorithm; otherwise, you will get a real unacceptable performance performance. As for performance optimization, you can construct a dedicated collection class for this issue. For large collection, in order to obtain acceptable performance, constructive collection class is often necessary.

Fourth, the performance of the query

The performance of the query is the highest during the internal implementation of the query. For querying these lists, the time required for iterative all elements is a limiting factor. The query implemented in the ArrayList / Vector class will be iterated for the elements of the class. The following example calculates the total number of empty elements:

INT count = 0; for (int i = 0; i

Table 4 shows that the performance of ArrayList has surpassed LinkedList, which once again shows that ArrayList should be our preferred class. Table 5 shows the time required to use the Listiterator object from List.Listiterator (int) iteration, if the query mechanism cannot be implemented within the List, these iterator is required. ArrayList once again showed a higher performance, but the difference in performance is not a conceivable as shown in Table 4. Note that the absolute time displayed in Table 5 corresponds to 10 times the absolute time in Table 4, ie, ArrayList is more than 10 times faster than ArrayList uses Listiterator iteration.

Table 4: Internal access iteration collection All elements 1.2 JVM1.3 JVMHOTSPOT 2.0 JVMARRAYLIST internal search 100% 106% 197% LinkedList Search 470% 493% 448% Table 5: Table 5: Table 5: All Elements 1.2 JVM1 in the collection of Listiterator traverses. 3 JVMHOTSPOT 2.0 JVM uses Listiterator iteration ArrayList100% 118% 75.2% Util Listiterator iteration ListedList117% 186% 156%

Conclude

The actual measurement and other factors we consider clearly show that ArrayList and Vector have better performance than LinkedList and LINKEDLIST after synchronous packaging. Even if you think LinkedList may provide higher performance, you can also strive to better performance from ArrayList, such as the order of flip collection elements. In some cases, LinkedList will have better performance, for example, when a large number of elements need to be added to the beginning and end of the large collection. However, in general, I recommend that you use the ArrayList / Vector class to use LinkedList only when they have a significant performance problem and LinkedList can improve performance.

转载请注明原文地址:https://www.9cbs.com/read-12196.html

New Post(0)