Sunday, January 15, 2012

Overriding hashCode()


In the earlier post I have written about overriding equals().  A major contract of overriding equals() is, when you override equals() you have to override hashCode().  It is like they both complete each other.  It is of no using having one and not the other.  This may lead to inappropriate behaviour of the code.  The contract says:
1.      During the execution of an application the hashCode() should return consistent value no matter how many times it is called.  The integer may not remain same from one execution to another.
2.      If two objects are equal according to equals() then their hashCode() should also be same.
3.      It is not required that two different objects which are not equal according to equals(), should return different values.  Though returning different result may lead to better performance of hash tables.
So the second important point here is equal object must have equal hash codes.  Let us take an example which overrides equals() but no hashCode():
public final class PhoneNumber {
    private final int areaCode;
    private final int prefix;
    private final int lineNumber;

    public PhoneNumber(int areaCode, int prefix, int lineNumber) {
        super();
        this.areaCode = areaCode;
        this.prefix = prefix;
        this.lineNumber = lineNumber;
    }

    @Override
    public boolean equals(Object obj) {
        if (obj == this) {
            return true;
        }
        if (!(obj instanceof PhoneNumber)) {
            return false;
        }
        PhoneNumber pn = (PhoneNumber) obj;
        return pn.areaCode == this.areaCode && pn.prefix == this.prefix && pn.lineNumber == this.lineNumber;
    }
}
Now if the calling application is something like this:
Map<PhoneNumber, String> m = new HashMap<PhoneNumber, String>();
m.put(new PhoneNumber(100, 200, 300), "Radha");
m.get(new PhoneNumber(100, 200, 300));
As you can see there are 2 objects here.  1st one is creating the object in put() and the second is getting the created object.  Functionally the get() should be able to get the object, but it won’t.  It will return null.  The reason to this is we have not overridden hashCode().  What happens here is when we are calling put() the hash code is calculated and put in some bucket.  Now when get() is called it will return a different hash code which may be in different bucket.  So as the hash codes are not matching, it won’t further go ahead and match equals() and will return null.
In order to fix this let us add a hashCode() to this class.
@Override
public int hashCode() {
    return 42;
}
As you can see we are returning a constant value, which is not a good practice, but this will solve the issue as the hash code will always be same.
*Note: Please avoid returning constant.  It is just an example to show how it works
Let us discuss on the best recipe for writing a hashCode():
1.     Have a constant non zero value.  It can be anything, for e.g. 17.  Store it as result.
2.     For each field f of your object(which are used in equals()) do the following:
a. Compute an int hash code c for the field:
i. If the field is boolean, compute (f ? 1 : 0)
ii. If the field is byte, char, short or int compute (int)f
iii.   If the field is long compute (int)(f ^ (f >>>32))
iv. If the field is float compute Float.floatToIntBits(f)
v. If the field is double compute Double.doubleToLongBits(f)
vi. If the field is an object reference and this class equals()compares field by recursively invoking equals() then recursively call the hashCode() on the field by above method.  If a more complex comparison is needed, compute a canonical representation for this field and call hash code on the canonical representation.  If the value of the field is null return 0(you can return any other constant, 0 is tradition)
vii. If the field is an array, then invoke recursively all the elements applying the above rules.
b. Combine the hash code c computed in step 2.a into result as,
Result = 31* result + c
Here 31 is used because it is an odd prime.  If 2 might have been chosen then the information might have been lost because 2 is equivalent to shifting.  So it is better to choose an odd prime.
3.   Return result
So let’s have an example with this recipe:
@Override
    public int hashCode() {
        int result = 17;
        result = 31 * result + areaCode;
        result = 31 * result + prefix;
        result = 31 * result + lineNumber;
        return result;
}
This will give desired result for the PhoneNumber class. 
We can also write a hash code with lazily initialize variable if required.  That means the hash code is created and stored when the first time the object is created.  Let us see an example:
private volatile int hashCode;

    @Override
    public int hashCode() {
        int result = hashCode;
        if (result == 0) {
            result = 17;
            result = 31 * result + areaCode;
            result = 31 * result + prefix;
            result = 31 * result + lineNumber;
            hashCode = result;
        }
        return result;
    }
So this was how to override hashCode().  A cautionary item here is, do not exclude significant parts of an object from the hash code to improve performance, because although this may result in running the hash function faster but it may degrade hash table’s performance. 

*Note: You can alternatively use HashCodeBuilder from org.apache.commons.lang.builder.HashCodeBuilder. On how to use it see http://commons.apache.org/lang/api-2.5/org/apache/commons/lang/builder/HashCodeBuilder.html
{Courtesy: Effective Java - Joshua Bloch} 




No comments:

Post a Comment