Using IEqualityComparer GetHashCode with a tolerance


I am trying to implement an IEqualityComparer that has a tolerance on a date comparison. I have also looked into this question. The problem is that I can't use a workaround because I am using the IEqualityComparer in a LINQ .GroupJoin(). I have tried a few implementations that allow for tolerance. I can get the Equals() to work because I have both objects but I can't figure out how to implement GetHashCode().

My best attempt looks something like this:

using System;
using System.Collections.Generic;

public class ThingWithDateComparer : IEqualityComparer<IThingWithDate>
{
    private readonly int _daysToAdd;

    public ThingWithDateComparer(int daysToAdd)
    {
        _daysToAdd = daysToAdd;
    }

    // Attempt at a tolerant hash: shift the date by the tolerance before hashing.
    public int GetHashCode(IThingWithDate obj)
    {
        unchecked
        {
            var hash = 17;
            hash = hash * 23 + obj.BirthDate.AddDays(_daysToAdd).GetHashCode();
            return hash;
        }
    }

    // Omitted here; comparing two dates within a tolerance is the part that works.
    public bool Equals(IThingWithDate x, IThingWithDate y)
    {
        throw new NotImplementedException();
    }
}

public interface IThingWithDate
{
    DateTime BirthDate { get; set; }
}

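Because .GroupJoin() builds a hash table from the GetHashCode() results, the days to add get applied to every object on both sides. All the dates shift by the same amount, so nearby dates still produce different hash codes. This doesn't work.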
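A small sketch of why the shift has no effect, assuming nothing beyond DateTime.AddDays (GapAfterShift is a made-up helper name used only for illustration): adding the same offset to two dates leaves the gap between them unchanged, so dates that were different before hashing are still different afterwards.

```csharp
using System;

public static class ShiftDemo
{
    // Hypothetical helper: the gap between two dates after both are
    // shifted by the same number of days.
    public static TimeSpan GapAfterShift(DateTime a, DateTime b, int daysToAdd)
    {
        return b.AddDays(daysToAdd) - a.AddDays(daysToAdd);
    }
}
```

For two dates one day apart, GapAfterShift returns one day regardless of daysToAdd, which is why the two objects still land under different hash codes.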


There are 2 answers

Servy On BEST ANSWER

The problem is conceptually impossible. You're trying to compare objects using a notion of "equality" that lacks a property the operations you want to perform depend on. For example, GroupJoin depends on the assumption that if A is equal to B, and B is equal to C, then A is equal to C (transitivity), but in your situation that's not true. A and B may be "close enough" together for you to want to group them, and so may B and C, but A and C may not be.

You're going to need to not implement IEqualityComparer at all, because you cannot fulfill the contract that it requires. If you want to map each item in one collection to all of the items in another collection that are "close enough" to it, then you're going to need to write that algorithm yourself rather than use GroupJoin, because GroupJoin is not capable of performing that operation. Doing so efficiently is likely to be hard, but doing so inefficiently shouldn't be that difficult.
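A minimal sketch of the inefficient version, assuming a plain nested scan is acceptable. GroupJoinWithin and its parameters are hypothetical names, not part of LINQ; it compares every outer item against every inner item, so it runs in O(n * m).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ToleranceJoin
{
    // Hypothetical helper (not a LINQ method): pairs each outer item with
    // every inner item whose date lies within `tolerance` of its own date.
    public static IEnumerable<(TOuter Outer, List<TInner> Matches)> GroupJoinWithin<TOuter, TInner>(
        IEnumerable<TOuter> outer,
        IEnumerable<TInner> inner,
        Func<TOuter, DateTime> outerDate,
        Func<TInner, DateTime> innerDate,
        TimeSpan tolerance)
    {
        // Materialize once so the inner sequence is not re-enumerated per outer item.
        var innerList = inner.ToList();

        foreach (var o in outer)
        {
            var date = outerDate(o);

            // Duration() gives the absolute difference, so order does not matter.
            var matches = innerList
                .Where(i => (innerDate(i) - date).Duration() <= tolerance)
                .ToList();

            yield return (o, matches);
        }
    }
}
```

With the question's types, a call might look like GroupJoinWithin(left, right, l => l.BirthDate, r => r.BirthDate, TimeSpan.FromDays(5)). Note that the same inner item can show up under several outer items, which is exactly the behavior a hash-based GroupJoin cannot express.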

Bradley Uffner On

I can't see any way to generate a logical hash code for your given criteria.
The hash code is used to determine whether 2 dates should group together. If they should group together, then they must return the same hash code.

If your "float" is 5 days, that means that 1/1/2000 must generate the same hash code as 1/4/2000, and 1/4/2000 must generate the same hash code as 1/8/2000 (since the dates in each pair are within 5 days of each other). That implies that 1/1/2000 has the same hash code as 1/8/2000 (since if a=b and b=c, then a=c).

But 1/1/2000 and 1/8/2000 are 7 days apart, which is outside the 5 day "float".
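The same argument as a quick check in code, reading the dates as month/day/year. Within and BreaksTransitivity are made-up names for this sketch only.

```csharp
using System;

public static class ToleranceCheck
{
    // Hypothetical helper: true when the two dates are at most `days` apart.
    public static bool Within(DateTime x, DateTime y, int days)
    {
        return (x - y).Duration() <= TimeSpan.FromDays(days);
    }

    // True when a~b and b~c hold but a~c does not, i.e. the "equality"
    // defined by Within is not transitive for these three dates.
    public static bool BreaksTransitivity(DateTime a, DateTime b, DateTime c, int days)
    {
        return Within(a, b, days) && Within(b, c, days) && !Within(a, c, days);
    }
}
```

Calling BreaksTransitivity with January 1, January 4 and January 8 of 2000 and a 5 day tolerance returns true, so no single hash code can be right for all three dates.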