Best practices for Entity Framework entities override Equals and GetHashCode

6.3k views Asked by At

I want to check equality between two entities with one-to-many relationships inside them.

So obviously I overrode the Object.Equals method, but then I get the CS0659 compiler warning: 'class' overrides Object.Equals(object o) but does not override Object.GetHashCode().

I overrode the Object.GetHashCode, but then, Resharper told me that the GetHashCode method should return the same result for all object life cycle, and will used in mutable objects. (docs)

public class Computer
{
    public long Id { get; set; }
    public ICollection<GPU> GPUs { get; set; } = new List<GPU>();

    public override bool Equals(object obj)
    {
        return obj is Computer computer &&
               GPUs.All(computer.GPUs.Contains);
    }

    public override int GetHashCode()
    {
        return HashCode.Combine(GPUs);
    }
}

public class GPU
{
    public long Id { get; set; }
    public int? Cores { get; set; } = null;

    public override bool Equals(object obj)
    {
        return obj is GPU gpu &&
               Cores == gpu.Cores;
    }

    public override int GetHashCode()
    {
        return HashCode.Combine(Cores);
    }
}

I don't know what should I prefer:

  • Overriding the Equals method without overriding GetHashCode, or
  • Overriding the GetHashCode with immutable data?
1

There are 1 answers

1
Harald Coppoolse On BEST ANSWER

Entity Framework uses its own smart methods to detect object equality. This is for instance used if you call SaveChanges: the values of fetched objects are matched with the values of updated objects to detect whether a SQL update is needed or not.

I'm not sure whether your definitions of equality would mess with this equality checking, causing some unchanged items to be updated in the database, or even worse, some changed data not to be updated in the database.

Database equality

Keep in mind that your entity classes (the classes that you put in the DbSet<...>) represent the tables in your database and the relations between the tables.

When should two items extracted from your database considered to represent the same object? Is it when they have same values? Can't we have two Persons named "John Doe", born on the 4th of July in one database?

The only way you can use to detect that two extracted Persons from the database represent the same Person is by checking the Id. The fact that some non-primary key values differ only tells you that the changed data is not updated in the database, not that it is a different Person.

Override Equals vs Create EqualityComparer

My advice would be, to keep your table representations as simple as possible: only the columns of the table (non-virtual properties) and the relations between the tables (virtual properties). No members, no Methods, nothing.

If you need extra functionality, create extension functions of the classes. If you need non-standard equality comparison methods, create a separate equality comparer. Users of your class can decide whether they want to use the default comparison method or your special comparison method.

This is all comparable as the various kinds of String Comparers: StringComparer.OrdinalIgnorCase, StringComparer.InvariantCulture, etc.

Back to your question

It seems to me that you want a Gpu comparer that does not check the value of Id: two items that have different Id, but same values for other properties are considered equal.

class GpuComparer : EqualityComparer<Gpu>
{
    public static IEqualityComparer<Gpu> IgnoreIdComparer {get;} = new GpuComparer()

    public override bool Equals(Gpu x, Gpu y)
    {
        if (x == null) return y == null; // true if both null, false if x null but y not
        if (y == null) return false;     // because x not null
        if (Object.ReferenceEquals(x, y)) return true;
        if (x.GetType() != y.GetType()) return false;

        // if here, we know x and y both not null, and of same type.
        // compare all properties for equality
        return x.Cores == y.Cores;
    }
    public override int GetHasCode(Gpu x)
    {
        if (x == null) throw new ArgumentNullException(nameof(x));

         // note: I want a different Hash for x.Cores == null than x.Cores == 0!

         return (x.Cores.HasValue) ? return x.Cores.Value.GetHashCode() : -78546;
         // -78546 is just a value I expect that is not used often as Cores;
    }
}

Note that I added the test for same type, because if y is a derived class of Gpu, and you would ignore that they are not the same type, then maybe Equals(x, y), but not Equals(y, x), which is one of the prerequisites of equality functions

Usage:

IEqualityComparer<Gpu> gpuIgnoreIdComparer = GpuComparer.IgnoreIdComparer;
Gpu x = new Gpu {Id = 0, Cores = null}
Gpu y = new Gpu {Id = 1, Cores = null}

bool sameExceptForId = gpuIgnoreIdComparer.Equals(x, y);

x and y will be considered equal

HashSet<Gpu> hashSetIgnoringIds = new HashSet<Gpu>(GpuComparer.IgnoreIdComparer);
hashSetIgnoringIds.Add(x);
bool containsY = hashSetIgnoringIds.Contains(y); // expect true

A comparer for Computer will be similar. Apart that you forgot to check for null and types, I see some other problems in the way you want to do the equality checking:

  • it is possible to assign null to your collection of Gpus. You have to solve this that it does not throw an exception. Is a Computer with null Gpus equal to a Computer with zero Gpus?
  • Apparently the order of the Gpus is not important to you: [1, 3] is equal to [3, 1]
  • Apparently the number of times that a certain GPU appears is not important: [1, 1, 3] is equal to [1, 3, 3]?

.

class IgnoreIdComputerComparer : EqualityComparer<Computer>
{
    public static IEqualityComparer NoIdComparer {get} = new IgnoreIdComputerCompare();


    public override bool (Computer x, Computer y)
    {
        if (x == null) return y == null;not null
        if (y == null) return false;
        if (Object.ReferenceEquals(x, y)) return true;
        if (x.GetType() != y.GetType())  return false;

        // equal if both GPU collections null or empty,
        // or any element in X.Gpu is also in Y.Gpu ignoring duplicates
        // using the Gpu IgnoreIdComparer
        if (x.Gpus == null || x.Gpus.Count == 0)
            return y.Gpus == null || y.Gpus.Count == 0;

        // equal if same elements, ignoring duplicates:
        HashSet<Gpu> xGpus = new HashSet<Gpu>(x, GpuComparer.IgnoreIdComparer);
        return xGpush.EqualSet(y);
    }

    public override int GetHashCode(Computer x)
    {
        if (x == null) throw new ArgumentNullException(nameof(x));

        if (x.Gpus == null || x.Gpus.Count == 0) return -784120;

         HashSet<Gpu> xGpus = new HashSet<Gpu>(x, GpuComparer.IgnoreIdComparer);
         return xGpus.Sum(gpu => gpu);
    }
}

TODO: if you will be using large collections of Gpus, consider a smarter GetHashCode