Java HashMap, hashCode() equals() - how to be consistent with multiple keys?


I have an ID class that identifies a unique person. There are two forms of ID: username and user number. If either the username or the user number matches, the IDs refer to the same person. I want to use the ID class as a key in a HashMap for looking up persons. I overrode the equals method so that two IDs are equal when they refer to the same person. I assume I also need to override hashCode(), because I want a HashMap of < key: ID, value: Person >, but I don't know how. How do I make the ID class an acceptable key for my HashMap?

public final class ID {

    // These are all unique values. No 2 people can have the same username or id value.
    final String username_;
    final long idNumber_;

    // Note: MACaddress is accepted but never stored or compared.
    public ID(String username, Byte[] MACaddress, long idNumber) {
        username_ = username;
        idNumber_ = idNumber;
    }

    /**
     * Checks to see if the values match. 
     * If either username or idNumber matches, then both ID's refer to the same person.
     */
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ID)) {
            return false;
        }
        ID other = (ID) o;
        if (this.username_ != null && other.username_ != null) {
            if (this.username_.equals(other.username_)) {
                return true;
            }
        }
        if (this.idNumber_ > 0 && other.idNumber_ > 0) {
            if (this.idNumber_ == other.idNumber_) {
                return true;
            }
        }
        return false;
    }
}

Followup: What if I wanted to add a third field for unique social security number? How would that change my person lookup hashmap and the ID class?

Note that the Wikipedia page for hashCode says:

The general contract for overridden implementations of this method is that they behave in a way consistent with the same object's equals() method: that a given object must consistently report the same hash value (unless it is changed so that the new version is no longer considered "equal" to the old), and that two objects which equals() says are equal must report the same hash value.

Update

Best solution thus far: Make a separate HashMap for each immutable unique identifier and then wrap all the HashMaps in a wrapper object.


There is 1 answer

Jon Skeet (best answer)

I think your question really goes a long way beyond plain implementation of hashCode - it's really about the design of when two IDs should be considered equal.

If you have multiple fields all of which are unique, you're actually in a tricky situation. Ignoring the MAC address part, there's nothing in the ID code itself to stop:

ID a = new ID("a", 0);
ID b = new ID("b", 0);
ID c = new ID("a", 1);

Now by your rules, those three IDs shouldn't all be able to co-exist - because they're all equal in one way or another. (a and b share the same number, and a and c share the same name.)
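To see why this makes hashCode awkward, here is a minimal sketch of the "either field matches" rule. LooseId is a hypothetical stand-in for the question's ID class (MAC address dropped), and the numbers 1 and 2 are used instead of 0 and 1 because the question's equals treats non-positive numbers as unset. The relation is not transitive, so the only hash that stays consistent with it is a constant one, which turns a HashMap into a linear search.

```java
// Hypothetical stand-in for the question's ID class, keeping its "either field matches" equals.
final class LooseId {
    final String username;
    final long idNumber;

    LooseId(String username, long idNumber) {
        this.username = username;
        this.idNumber = idNumber;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof LooseId)) {
            return false;
        }
        LooseId other = (LooseId) o;
        if (username != null && username.equals(other.username)) {
            return true;
        }
        return idNumber > 0 && idNumber == other.idNumber;
    }

    @Override
    public int hashCode() {
        // Any two IDs can be "equal" via some third ID, so every ID must hash alike.
        return 0;
    }
}

final class LooseIdDemo {
    public static void main(String[] args) {
        LooseId a = new LooseId("a", 1);
        LooseId b = new LooseId("b", 1);
        LooseId c = new LooseId("a", 2);
        System.out.println(b.equals(a)); // true: same number
        System.out.println(a.equals(c)); // true: same name
        System.out.println(b.equals(c)); // false: equals is not transitive
    }
}
```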

Additionally, it looks like the number and name are both effectively optional - that's likely to end up being painful, in my experience.

It may still make sense to keep your class as it is, but the rules around "No two entities can have the same name" and "No two entities can have the same number" aren't really the same as "no two entities can have the same ID", so you'd have to enforce that separately.

It's entirely possible that you'll want to keep a HashMap<String, Entity> for "entities by name" and a HashMap<Long, Entity> for "entities by number" (Long rather than Integer, since the number in your ID class is a long). I would then implement equals and hashCode in ID to check for complete equality - so that after looking up an ID by just one part of the ID, you can check that the ID you've found is actually the right one.
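A rough sketch of that two-map approach follows. FullId, Entity, and EntityIndex are hypothetical names, not part of the question's code, and the sketch assumes both the name and the number are always present. The number map is keyed by Long because the number in the question's class is a long. Uniqueness of each field is enforced by the index, while FullId itself compares all fields.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical ID with complete equality: both fields must match.
final class FullId {
    final String username;
    final long idNumber;

    FullId(String username, long idNumber) {
        this.username = username;
        this.idNumber = idNumber;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof FullId)) {
            return false;
        }
        FullId other = (FullId) o;
        return idNumber == other.idNumber && Objects.equals(username, other.username);
    }

    @Override
    public int hashCode() {
        return Objects.hash(username, idNumber);
    }
}

// Hypothetical entity type that simply carries its ID.
final class Entity {
    final FullId id;

    Entity(FullId id) {
        this.id = id;
    }
}

// One map per unique field; the wrapper enforces "no two entities share a name or a number".
final class EntityIndex {
    private final Map<String, Entity> byName = new HashMap<>();
    private final Map<Long, Entity> byNumber = new HashMap<>();

    // Returns false (and stores nothing) if the name or the number is already taken.
    boolean add(Entity e) {
        if (byName.containsKey(e.id.username) || byNumber.containsKey(e.id.idNumber)) {
            return false;
        }
        byName.put(e.id.username, e);
        byNumber.put(e.id.idNumber, e);
        return true;
    }

    // Looks up by name, then verifies the complete ID; returns null on any mismatch.
    Entity find(FullId id) {
        Entity e = byName.get(id.username);
        return (e != null && e.id.equals(id)) ? e : null;
    }

    // Looks up by number alone; returns null if the number is unknown.
    Entity findByNumber(long idNumber) {
        return byNumber.get(idNumber);
    }
}
```

With this shape, adding ("alice", 1) succeeds, a later ("alice", 2) or ("bob", 1) is rejected, and find only returns an entity when every part of the ID agrees.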

Alternatively, change your identity scheme to only have a single part, so that you never need to worry about this. Even if you have a "secondary identifier" of some kind, it's worth deciding on one of them being your primary identifier which you normally use to look entities up, store them etc - and then you'd just have a secondary mapping (whether that's in memory or in a database etc) between the primary and secondary identifiers. If all your "real" identifiers are optional, you may even want to have a surrogate identifier which is guaranteed to exist, and then all the mappings between that and the secondary identifiers can be optional. (They may even be mutable - you may be able to "discover" and record a social security number later against an existing entity, for example.)
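The surrogate-identifier idea might look roughly like the sketch below. PersonRegistry and its method names are hypothetical, the person is represented by a plain display name for brevity, and the social security number from the question's followup is shown as one optional secondary identifier that can be attached later.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical registry: a generated surrogate key is the only primary identifier.
final class PersonRegistry {
    private long nextKey = 1;
    private final Map<Long, String> people = new HashMap<>();        // surrogate key -> person
    private final Map<String, Long> keyByUsername = new HashMap<>(); // optional secondary mapping
    private final Map<String, Long> keyBySsn = new HashMap<>();      // optional secondary mapping

    // Creates a person and returns the surrogate key, which always exists.
    long register(String displayName) {
        long key = nextKey++;
        people.put(key, displayName);
        return key;
    }

    // Attaches a username; fails if the key is unknown or the username belongs to someone else.
    boolean linkUsername(long key, String username) {
        return link(keyByUsername, key, username);
    }

    // Attaches an SSN, possibly long after registration; same rules as linkUsername.
    boolean linkSsn(long key, String ssn) {
        return link(keyBySsn, key, ssn);
    }

    Optional<String> findByUsername(String username) {
        return Optional.ofNullable(keyByUsername.get(username)).map(people::get);
    }

    Optional<String> findBySsn(String ssn) {
        return Optional.ofNullable(keyBySsn.get(ssn)).map(people::get);
    }

    private boolean link(Map<String, Long> index, long key, String secondaryId) {
        if (!people.containsKey(key)) {
            return false;
        }
        Long existing = index.putIfAbsent(secondaryId, key);
        return existing == null || existing == key;
    }
}
```

Adding another unique field then means adding one more secondary map rather than changing how equality or hashing works for the primary key.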