Should hashCode() only use the subset of immutable fields of those used in equals()?

1.7k views Asked by At

Situation

I needed to overwrite equals() and as it is recommended I also overwrote the hashCode() method using the same fields. Then, when I was looking at a set, that contained only the one object I got the frustrating result of

set.contains(object)
=> false

while

set.stream().findFirst().get().equals(object)
=> true

I understand now, that this is due to changes that were made to object after it was added to set which again changed its hashCode. contains then looks at the wrong key and can't find the object.

My requirements for the implementation are

  • mutable fields are needed to correctly implement equals()
  • use these objects safely in hash-based Collections or Maps such ash HashSet even if they are prone to changes.

which conflicts with the convention that

Question

Are there any dangers to using only a subset of fields which are used in equals() to calculate hashCode() instead of using all?

More specifically this would mean: equals() uses a number of fields of the object whereas hashCode() only uses those fields that are used in equals() and that are immutable.

I think this should be okay, because

  • the contract is fullfilled: equal objects will produce the same hashCode, while the same hashCode does not necesairly mean that the objects are the same.
  • The hashCode of an object stays the same, even if an object is exposed to changes and therefore will be found in a HashSet before and after those changes.

Related posts that helped me understand my problem but not how to solve it: What issues should be considered when overriding equals and hashCode in Java? and Different fields for equals and hashcode

4

There are 4 answers

0
Eugene On

That's perfectly valid to me. Suppose you have a Person:

 final int name; // used in hashcode
 int income; // name + income used in equals

name decides where the entry will go (think HashMap) or which bucket will be chosen.

You put a Person as a Key inside HashMap : according to hashcode it goes to some bucket, second for example. You upgrade the income and search for that Person in the map. According to hashcode it must be in the second bucket, but according to equals it's not there:

 static class Person {
    private final String name;

    private int income;

    public Person(String name) {
        super();
        this.name = name;
    }

    public int getIncome() {
        return income;
    }

    public void setIncome(int income) {
        this.income = income;
    }

    public String getName() {
        return name;
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        Person right = (Person) other;

        return getIncome() == right.getIncome() && getName().equals(right.getName());
    }

}

And a test:

    HashSet<Person> set = new HashSet<>();
    Person bob = new Person("bob");
    bob.setIncome(100);
    set.add(bob);

    Person sameBob = new Person("bob");
    sameBob.setIncome(200);

    System.out.println(set.contains(sameBob)); // false

What you are missing I think is the fact that hashcode decides a bucket where an entry goes (there could be many entries in that bucket) and that's the first step, but equals decides if that is well, an equal entry.

The example that you provide is perfectly legal; but the one you linked is the other way around - it uses more fields in hashcode making it thus incorrect.

If you understand these details that first hashcode is used to understand where and Entry might reside and only later all of them (from the subset or bucket) are tried to be found via equal - your example would make sense.

0
Ralf Kleberhoff On

For HashSet and similar collections/maps, it's a valid solution to have hashCode() use only a subset of the fields from the equals() method. Of course, you have to think about how useful the hash code is to reduce collisions in the map.

But be aware that the problem comes back if you want to use ordered collections like TreeSet. Then you need a comparator that never gives collisions (returns zero) for "different" objects, meaning that the set can only contain one of the colliding elements. Your equals() description implies that multiple objects will exist that differ only in the mutable fields, and then you lose:

  • Including the mutable fields in the compareTo() method can change the comparison sign, so that the object needs to move to a different branch in the tree.
  • Excluding the mutable fields in the compareTo() method limits you to have maximum one colliding element in the TreeSet.

So I'd strongly recommend to think about your object class'es concept of equality and mutability again.

0
DodgyCodeException On

It's ok for hashCode() to use a subset of the fields that equals() uses, although it may possibly give you a slight performance drop.

Your problem seems to be caused by modifying the object, while still inside the set, in a way that alters the functioning of hashCode() and/or equals(). Whenever you add an object to a HashSet (or as the key in a HashMap), you must not subsequently modify any fields of that object that are used by equals() and/or hashCode(). Ideally, all fields used by equals() should be final. If they can't be, you must treat them as though they are final whilst the object is in the set.

The same goes for TreeSet/TreeMap, too, but applies to fields used by compareTo().

If you really need to modify the fields that are used by equals() (or by compareTo() in the case of a TreeSet/TreeMap), you must:

  1. First, remove that object from the set;
  2. Then modify the object;
  3. And finally add it back to the set.
0
ggkekas On

The contract would indeed be fulfilled. The contract imposes that .equal() objects have ALWAYS the same .hashCode(). The opposite doesn't have to be true and I wonder with the obsession of some people and IDEs to apply exactly that practice. If this was possible for all possible combinations, then you would discover the perfect hash function.

BTW, IntelliJ offers a nice wizard when generating hashCode and equals by treating those two methods separately and allowing to differentiate your selection. Obviously, the opposite, aka offering more fields in the hashCode() and less fields in the equals() would violate the contract.