Java Theory and Practice: Hash

xiaoxiao2021-03-06  132

Every Java object has a hashCode() and an equals() method. Many classes override the default implementations of these methods to provide a deeper semantic comparability between object instances. In this installment of Java theory and practice, Java developer Brian Goetz introduces the rules and guidelines a Java class should follow to define hashCode() and equals() effectively and correctly.

Define the equality of objects

The Object class has two methods for making inferences about an object's identity: equals() and hashCode(). In general, if you override one of them, you must override both, because there is an important relationship between them that must be maintained. In particular, if two objects are equal according to equals(), they must have the same hashCode() value (although the converse is not generally true).

The semantics of equals() for a given class are left to the implementer; deciding what equality means for a class is part of its design work. The default implementation provided by Object is simply reference equality:

public boolean equals(Object obj) { return (this == obj); }

Under this default implementation, two references are equal only if they refer to the exact same object. Similarly, the default implementation of hashCode() provided by Object is typically derived by mapping the object's memory address to an integer value, although the specification does not require this technique. Because on some architectures the address space is larger than the range of int values, two distinct objects may have the same hashCode(). If you override hashCode(), you can still use the System.identityHashCode() method to access this default value.
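The default behavior can be observed with a small sketch. The class names Token and IdentityDefaultsDemo are hypothetical and exist only for this illustration; Token overrides nothing, so it inherits both defaults from Object.

```java
// Hypothetical class that overrides nothing, so it inherits Object's defaults.
final class Token {
    final String label;

    Token(String label) {
        this.label = label;
    }
}

final class IdentityDefaultsDemo {
    // With the inherited equals(), this is true only when both
    // references point at the very same object.
    static boolean sameByDefault(Token a, Token b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Token first = new Token("x");
        Token second = new Token("x"); // same label, different object
        Token alias = first;           // second reference to the same object

        System.out.println(sameByDefault(first, second));
        System.out.println(sameByDefault(first, alias));
        // With no override, hashCode() and identityHashCode() agree.
        System.out.println(first.hashCode() == System.identityHashCode(first));
    }
}
```

Two tokens with the same label are still unequal here, because nothing in Token tells equals() to look at the label.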

Overriding equals(): a simple example

An identity-based implementation of equals() and hashCode() is a sensible default, but for some classes it is desirable to relax the definition of equality somewhat. For example, the Integer class defines equals() similarly to this:

public boolean equals(Object obj) {
    return (obj instanceof Integer
            && intValue() == ((Integer) obj).intValue());
}

Under this definition, two Integer objects are equal only if they contain the same integer value. This, together with the fact that Integer is immutable, makes it practical to use an Integer as a key in a HashMap. This value-based approach to equality is used by all the primitive wrapper classes in the Java class library, such as Integer, Float, Character, and Boolean, as well as String (two String objects are equal if they contain the same sequence of characters). Because these classes are immutable and implement hashCode() and equals() sensibly, they all make good hash keys.
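As a minimal sketch of what value-based equality buys, the lookup below uses one Integer object for put() and a separately obtained one for get(). The class name IntegerKeyDemo and its lookup method are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

final class IntegerKeyDemo {
    // Stores one entry, then looks it up with a separately obtained Integer.
    static String lookup(int key) {
        Map<Integer, String> names = new HashMap<>();
        names.put(Integer.valueOf(1000), "one thousand");
        // Integer.equals() and Integer.hashCode() depend only on the value,
        // so any Integer holding 1000 finds the entry.
        return names.get(Integer.valueOf(key));
    }

    public static void main(String[] args) {
        System.out.println(lookup(1000));
        System.out.println(lookup(7));
    }
}
```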

Why override equals() and hashCode()?

What would happen if Integer did not override equals() and hashCode()? Nothing, if we never used an Integer as a key in a HashMap or other hash-based collection. However, if we used such an Integer object as a key in a HashMap, we could not reliably retrieve the associated value unless the get() call used the exact same Integer instance as the put() call. That would require using a single Integer instance for each particular integer value throughout the whole program. Needless to say, this approach would be inconvenient and error-prone.
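The failure mode described above can be sketched with a wrapper that keeps Object's identity-based defaults. IdentityInt and IdentityKeyDemo are hypothetical names used only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical integer wrapper that does NOT override equals() or hashCode().
final class IdentityInt {
    final int value;

    IdentityInt(int value) {
        this.value = value;
    }
}

final class IdentityKeyDemo {
    // Looks up the entry with a new instance holding the same value.
    static boolean foundWithNewInstance() {
        Map<IdentityInt, String> map = new HashMap<>();
        map.put(new IdentityInt(42), "answer");
        return map.containsKey(new IdentityInt(42));
    }

    // Looks up the entry with the very instance that was used in put().
    static boolean foundWithSameInstance() {
        Map<IdentityInt, String> map = new HashMap<>();
        IdentityInt key = new IdentityInt(42);
        map.put(key, "answer");
        return map.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println(foundWithNewInstance());
        System.out.println(foundWithSameInstance());
    }
}
```

Only the lookup that reuses the original instance succeeds, which is exactly the single-instance discipline the text calls inconvenient.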

The interface contract for Object requires that if two objects are equal according to equals(), they must have the same hashCode() value. Why does the root object class need hashCode() when its discriminating ability is entirely subsumed by that of equals()? The hashCode() method exists purely for efficiency. The Java platform architects anticipated the importance of hash-based collection classes such as Hashtable, HashMap, and HashSet in typical Java applications, and comparing against many objects with equals() can be computationally expensive. Having every Java object support hashCode() allows for efficient storage and retrieval in hash-based collections.

Requirements for implementing equals() and hashCode()

Some restrictions on the behavior of equals() and hashCode() are enumerated in the documentation for Object. In particular, the equals() method must exhibit the following properties:

Symmetry: For two references, a and b, a.equals(b) if and only if b.equals(a)

Reflexivity: For all non-null references, a.equals(a)

Transitivity: If a.equals(b) and b.equals(c), then a.equals(c)

Consistency with hashCode(): Two equal objects must have the same hashCode() value
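The four properties can be spot-checked for a few sample objects, as in the sketch below. EqualsContractCheck and its methods are hypothetical helpers; passing such a check for some samples does not prove that a class honors the contract, it only catches obvious violations.

```java
final class EqualsContractCheck {
    // Symmetry and hashCode() consistency for one pair of references.
    static boolean pairOk(Object a, Object b) {
        boolean symmetric = a.equals(b) == b.equals(a);
        boolean hashConsistent = !a.equals(b) || a.hashCode() == b.hashCode();
        return symmetric && hashConsistent;
    }

    // Spot-checks all four listed properties for three non-null references.
    static boolean holdsFor(Object a, Object b, Object c) {
        boolean reflexive = a.equals(a) && b.equals(b) && c.equals(c);
        boolean transitive = !(a.equals(b) && b.equals(c)) || a.equals(c);
        return reflexive && transitive
                && pairOk(a, b) && pairOk(b, c) && pairOk(a, c);
    }

    public static void main(String[] args) {
        // Three String references with equal contents.
        System.out.println(holdsFor("key", new String("key"), "key"));
    }
}
```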

The specification for Object offers only a vague guideline that equals() and hashCode() be consistent: their results should be the same for subsequent invocations, provided that "no information used in equals comparisons on the object is modified." This sounds a bit like "the result of the calculation shouldn't change, unless it does." The vague statement is generally interpreted to mean that equality and hash value calculations should be a deterministic function of an object's state and nothing else.

What should equality mean?

The requirements that the Object class specification imposes on equals() and hashCode() are fairly easy to meet. Deciding whether, and how, to override equals() requires a little more judgment. For simple immutable value classes such as Integer (and in fact for nearly all immutable classes), the choice is fairly obvious: equality should be based on the equality of the underlying object state. In the case of Integer, the object's only state is the underlying integer value.

For mutable objects, the answer is not always so clear. Should equals() and hashCode() be based on the object's identity (like the default implementation) or on the object's state (like Integer and String)? There is no easy answer; it depends on the intended use of the class. For containers such as List and Map, a reasonable argument could be made either way. Most classes in the Java class library, including the container classes, err on the side of providing equals() and hashCode() implementations based on object state.

If an object's hashCode() value can change based on its state, we must be careful when using such objects as keys in hash-based collections to ensure that their state does not change while they are in use as hash keys. All hash-based collections assume that an object's hash value does not change while it is in use as a key in the collection. If a key's hash code changes while the key is in a collection, unpredictable and confusing consequences can follow. This is usually not a problem in practice, because it is uncommon to use a mutable object such as a List as a key in a HashMap.
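A minimal sketch of that hazard, using an ArrayList as a HashMap key and then mutating it. MutableKeyDemo is a made-up name, and the exact symptom depends on the collection's implementation; this is what typically happens with java.util.HashMap.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MutableKeyDemo {
    // Returns whether the map still finds the key after the key was mutated.
    static boolean stillFoundAfterMutation() {
        Map<List<Integer>, String> map = new HashMap<>();
        List<Integer> key = new ArrayList<>();
        key.add(1);
        map.put(key, "value"); // stored under the hash of [1]

        key.add(2);            // the key is now [1, 2], with a different hash

        // The entry is still in the map, but it is filed under the old hash.
        return map.containsKey(key);
    }

    public static void main(String[] args) {
        System.out.println(stillFoundAfterMutation());
    }
}
```

The entry is not lost in the sense of being removed; the map still reports one entry, but a lookup with the mutated key searches in the wrong place.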

An example of a simple mutable class that defines equals() and hashCode() based on state is Point. Two Point objects are equal if they refer to the same (x, y) coordinates, and the hash value of a Point is derived from the IEEE 754 bit representation of the x and y coordinate values.
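The idea can be sketched with a small mutable point class. SketchPoint is hypothetical and is not the library's Point implementation; it only illustrates state-based equality and a hash derived from the IEEE 754 bits via Double.doubleToLongBits.

```java
// Hypothetical mutable point with state-based equals() and hashCode().
final class SketchPoint {
    private double x;
    private double y;

    SketchPoint(double x, double y) {
        this.x = x;
        this.y = y;
    }

    void move(double newX, double newY) {
        this.x = newX;
        this.y = newY;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof SketchPoint)) {
            return false;
        }
        SketchPoint p = (SketchPoint) other;
        // Compare the bit patterns so equals() stays consistent with hashCode().
        return Double.doubleToLongBits(x) == Double.doubleToLongBits(p.x)
                && Double.doubleToLongBits(y) == Double.doubleToLongBits(p.y);
    }

    @Override
    public int hashCode() {
        // Combine the IEEE 754 bits of both coordinates, then fold 64 bits to 32.
        long bits = Double.doubleToLongBits(x);
        bits = 31 * bits + Double.doubleToLongBits(y);
        return (int) (bits ^ (bits >>> 32));
    }
}
```

Because the class is mutable, calling move() on a point changes both its equality and its hash value, so such a point should not be moved while it serves as a hash key.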

For more complex classes, the behavior of equals() and hashCode() may even be imposed by the specification of a superclass or interface. For example, the List interface requires that a List object be equal to another object if and only if the other object is also a List and both contain the same elements (as defined by Object.equals() on the elements) in the same order. The requirements for hashCode() are even more specific: the hashCode() value of a list must conform to the following calculation:

hashCode = 1;
Iterator i = list.iterator();
while (i.hasNext()) {
    Object obj = i.next();
    hashCode = 31 * hashCode + (obj == null ? 0 : obj.hashCode());
}
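The calculation above can be checked against the library with a short sketch. ListHashDemo and its method names are made up for this example; the String variant follows the formula documented for String.hashCode(), which starts from 0 rather than 1.

```java
import java.util.Arrays;
import java.util.List;

final class ListHashDemo {
    // Recomputes a list's hash code using the calculation the List contract specifies.
    static int manualListHash(List<?> list) {
        int hashCode = 1;
        for (Object obj : list) {
            hashCode = 31 * hashCode + (obj == null ? 0 : obj.hashCode());
        }
        return hashCode;
    }

    // Recomputes a string's hash code: the same multiply-by-31 scheme over its chars.
    static int manualStringHash(String s) {
        int hash = 0;
        for (int i = 0; i < s.length(); i++) {
            hash = 31 * hash + s.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        List<Object> sample = Arrays.asList("a", null, 3);
        System.out.println(manualListHash(sample) == sample.hashCode());
        System.out.println(manualStringHash("hash") == "hash".hashCode());
    }
}
```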

Not only does the hash value depend on the contents of the list, but the specific algorithm for combining the hash values of the individual elements is specified as well. (The String class specifies a similar algorithm for computing the hash value of a String.)

Writing your own equals() and hashCode() methods

Overriding the default equals() method is fairly easy, but overriding an equals() method that has already been overridden is extremely tricky to do without violating either the symmetry or the transitivity requirement. When overriding equals(), you should always include some Javadoc comments on equals() to help those who might want to extend your class do so correctly.

As a simple example, consider the following class:

class A {
    final B someNonNullField;
    C someOtherField;
    int someNonStateField;
}

How would we write the equals() method for this class? The following approach is suitable for many cases:

public boolean equals(Object other) {
    // Not strictly necessary, but often a good optimization
    if (this == other)
        return true;
    if (!(other instanceof A))
        return false;
    A otherA = (A) other;
    return
        (someNonNullField.equals(otherA.someNonNullField))
        && ((someOtherField == null)
            ? otherA.someOtherField == null
            : someOtherField.equals(otherA.someOtherField));
}

Now that we have defined equals(), we have to define hashCode() in a compatible manner. One compatible, but not very useful, way to define hashCode() is this:

public int hashCode() { return 0; }

This approach yields horrible performance for HashMaps with a large number of entries, but it does conform to the specification. A more sensible implementation of hashCode() for A would look like this:

public int hashCode() {
    int hash = 1;
    hash = hash * 31 + someNonNullField.hashCode();
    hash = hash * 31
        + (someOtherField == null ? 0 : someOtherField.hashCode());
    return hash;
}
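Putting the two methods together gives a compilable sketch. The article leaves the types B and C undefined, so this version assumes String and Integer as stand-ins; SketchA and SketchADemo are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of class A, with String and Integer standing in for types B and C.
final class SketchA {
    final String someNonNullField; // stands in for B; must not be null
    Integer someOtherField;        // stands in for C; may be null
    int someNonStateField;         // deliberately ignored by equals() and hashCode()

    SketchA(String nonNull, Integer other, int nonState) {
        this.someNonNullField = nonNull;
        this.someOtherField = other;
        this.someNonStateField = nonState;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof SketchA)) {
            return false;
        }
        SketchA otherA = (SketchA) other;
        return someNonNullField.equals(otherA.someNonNullField)
                && (someOtherField == null
                        ? otherA.someOtherField == null
                        : someOtherField.equals(otherA.someOtherField));
    }

    @Override
    public int hashCode() {
        int hash = 1;
        hash = hash * 31 + someNonNullField.hashCode();
        hash = hash * 31 + (someOtherField == null ? 0 : someOtherField.hashCode());
        return hash;
    }
}

final class SketchADemo {
    // Adds three instances to a HashSet and reports how many are distinct.
    static int distinctCount() {
        Set<SketchA> set = new HashSet<>();
        set.add(new SketchA("id", 7, 1));
        set.add(new SketchA("id", 7, 2));    // equal to the first: only the non-state field differs
        set.add(new SketchA("id", null, 3)); // different: the nullable field is null
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount());
    }
}
```

The field left out of equals() is also left out of hashCode(); including it in only one of the two methods would break the requirement that equal objects have equal hash values.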

Note that both of these implementations delegate part of the computation to the equals() or hashCode() methods of the class's state fields. Depending on your class, you may also want to delegate part of the computation to the equals() or hashCode() method of the superclass. For primitive fields, the associated wrapper classes contain helper functions that can help in creating hash values, such as Float.floatToIntBits.

Writing an equals() method is not without pitfalls. In general, it is impractical to cleanly override equals() when extending an instantiable class that itself overrides equals(), and writing an equals() method that is intended to be overridden (such as in an abstract class) is done differently from writing an equals() method for a concrete class. For examples and more details, see Effective Java Programming Language Guide, Item 7.
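One common form of this pitfall is a symmetry violation, sketched below. Base2D and Colored2D are hypothetical classes invented for the illustration, and the subclass's equals() is written in the seemingly obvious way on purpose.

```java
// Hypothetical instantiable base class with state-based equals().
class Base2D {
    final int x;
    final int y;

    Base2D(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Base2D)) {
            return false;
        }
        Base2D p = (Base2D) other;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}

// Subclass that adds state and overrides equals() in the "obvious" way.
class Colored2D extends Base2D {
    final String color;

    Colored2D(int x, int y, String color) {
        super(x, y);
        this.color = color;
    }

    @Override
    public boolean equals(Object other) {
        // Only another Colored2D can be equal, which the base class does not know.
        if (!(other instanceof Colored2D)) {
            return false;
        }
        return super.equals(other) && color.equals(((Colored2D) other).color);
    }
    // hashCode() is inherited; equal Colored2D objects still share x and y.
}
```

Given a Base2D p at (1, 2) and a Colored2D c at (1, 2) in red, p.equals(c) is true while c.equals(p) is false, so the pair violates symmetry.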

Room for improvement?

Building hashing into the root object class of the Java class library was a very sensible design compromise, because it makes hash-based containers easy and efficient to use. However, several criticisms have been made of the approach to, and implementation of, hashing and equality in the Java class library. The hash-based containers in java.util are very convenient and easy to use, but they may not be suitable for applications that require very high performance. While most of these points will never change, they are worth keeping in mind when you design applications that rely heavily on the efficiency of hash-based containers. They include:

Too small a hash range. Using int instead of long as the return type of hashCode() increases the probability of hash collisions.

Poor distribution of hash values. The hash values of short strings and small integers are themselves small integers, close to the hash values of other "nearby" objects. A more well-behaved hash function would distribute hash values more evenly across the hash range.

No defined hashing operations. While some classes, such as String and List, define a hash algorithm for combining the hash values of their constituent elements into a single hash value, the language specification does not define any approved means of combining the hash values of multiple objects into a new hash value. The trick used by List, String, and the example class A discussed earlier in Writing your own equals() and hashCode() methods is simple, but far from mathematically ideal. Nor, at the time of writing, did the class libraries offer convenience implementations of any hashing algorithm that would simplify the creation of more sophisticated hashCode() implementations.

Difficulty writing equals() when extending an instantiable class that already overrides equals(). The "obvious" ways to define equals() when extending such a class all fail to meet the symmetry or transitivity requirements of the equals() method. This means that you must understand the structure and implementation details of the class you are extending when you override equals(), and you may even need to expose private fields of the base class as protected to do so, which violates the principles of good object-oriented design.
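The point about poor distribution is easy to observe directly. The sketch below, with the made-up class name HashSpreadDemo, prints the hash values of a few small integers and short strings.

```java
final class HashSpreadDemo {
    // Hash codes of the Integers 0 .. count-1.
    static int[] hashesOfSmallIntegers(int count) {
        int[] hashes = new int[count];
        for (int i = 0; i < count; i++) {
            hashes[i] = Integer.valueOf(i).hashCode();
        }
        return hashes;
    }

    public static void main(String[] args) {
        // An Integer's hash code is the int value itself.
        for (int hash : hashesOfSmallIntegers(4)) {
            System.out.println(hash); // prints 0, 1, 2, 3 on separate lines
        }
        // One-character strings hash to their character code.
        System.out.println("a".hashCode());  // prints 97
        System.out.println("b".hashCode());  // prints 98
        System.out.println("ab".hashCode()); // prints 3105 (31 * 97 + 98)
    }
}
```

Neighboring inputs produce neighboring hash values, so the values cluster in a tiny part of the int range instead of spreading across it.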

Conclusion

By defining equals() and hashCode() consistently, you can improve the usability of your classes as keys in hash-based collections. There are two approaches to defining equality and hash value: identity-based, which is the default provided by Object, and state-based, which requires overriding both equals() and hashCode(). If an object's hash value can change when its state changes, be sure you do not allow its state to change while it is being used as a hash key.

