Is there a way to derive IEqualityComparer from IComparer?

1.7k views Asked by At

TL;DR I'm looking for a way to obtain IEqualityComparer<T> from IComparer<T>, no matter which datatype is T, including case-insensitive options if T is string. Or I need a different solution for this problem.

Here's full story: I'm implementing simple, generic cache with LFU policy. Requirement is that it must be possible to select whether the cache will be case sensitive or case insensitive -- if string happens to be the datatype for cache keys (which is not necessary). In the solution I primarily develop the cache for, I expect hundreds of billions of cache lookups, and cache sizes of max 100.000 entries. Because of that numbers I immediately resigned from using any string manipulation that causes allocations (such as .ToLower().GetHashCode() etc.) and instead opted to use IComparer and IEqualityComparer, as they are standard BCL features. User of this cache can pass the comparers to constructor. Here are relevant fragments of the code:

public class LFUCache<TKey,TValue>
{
    private readonly Dictionary<TKey,CacheItem> entries;
    private readonly SortedSet<CacheItem> lfuList;

    private class CacheItem
    {
        public TKey Key;
        public TValue Value;
        public int UseCount;
    }

    private class CacheItemComparer : IComparer<CacheItem>
    {
        private readonly IComparer<TKey> cacheKeyComparer;

        public CacheItemComparer(IComparer<TKey> cacheKeyComparer)
        {
            this.cacheKeyComparer = cacheKeyComparer;
            if (cacheKeyComparer == null)
                this.cacheKeyComparer = Comparer<TKey>.Default;
        }

        public int Compare(CacheItem x, CacheItem y)
        {
            int UseCount = x.UseCount - y.UseCount;
            if (UseCount != 0) return UseCount;
            return cacheKeyComparer.Compare(x.Key, y.Key);
        }
    }

    public LFUCache(int capacity, IEqualityComparer<TKey> keyEqualityComparer,
                  IComparer<TKey> keyComparer)  // <- here's my problem
    {
        // ...
        entries = new Dictionary<TKey, CacheItem>(keyEqualityComparer);
        lfuList = new SortedSet<CacheItem>(new CacheItemComparer(keyComparer));
    }
    // ...
}

The keyEqualityComparer is used to manage cache entries (so e.g. the key "ABC" and "abc" are equal if user wants to). The keyComparer is used to manage cache entries sorted by UseCount so that it's easy to select the least frequently used one (implemented in CacheItemComparer class).

Example correct usage with custom comparison:

var cache = new LFUCache<string, int>(10000,
    StringComparer.InvariantCultureIgnoreCase,
    StringComparer.InvariantCultureIgnoreCase);

(That looks stupid, but StringComparer implements both IComparer<string> and IEqualityComparer<string>.) The problem is that if user gives incompatible comparers (i.e. case insensitive keyEqualityComparer and case sensitive keyComparer), then the most likely outcome is invalid LFU statistics, and thus lower cache hits at best. The other scenario is also less than desired. Also if the key is more sophisticated (I'll have something resembling Tuple<string,DateTime,DateTime>), it's possible to mess it up more severely.

That's why I'd like to only have a single comparer argument in constructor, but that doesn't seem to work. I'm able to create IEqualityComparer<T>.Equals() with help of IComparer<T>.Compare(), but I'm stuck at IEqualityComparer<T>.GetHashCode() -- which is very important, as you know. If I had got access to private properties of the comparer to check if it's case sensitive or not, I would have used CompareInfo to get hash code.

I like this approach with 2 different data structures, because it gives me acceptable performance and controllable memory consumption -- on my laptop around 500.000 cache additions/sec with cache size 10.000 elements. Dictionary<TKey,TValue> is just used to find data in O(1), and SortedSet<CacheItem> inserts data in O(log n), find element to remove by calling lfuList.Min in O(log n), and find the entry to increment use count also in O(log n).

Any suggestions on how to solve this are welcome. I'll appreciate any ideas, including different designs.

4

There are 4 answers

2
Mike Zboray On BEST ANSWER

As I alluded to in my comment, you could add a helper method that might make things a little simpler for a basic use case:

public class LFUCache<TKey,TValue>
{
    public static LFUCache<TKey, TValue> Create<TComp>(int capacity, TComp comparer) where TComp : IEqualityComparer<TKey>, IComparer<TKey>
    {
        return new LFUCache<TKey, TValue>(capacity, comparer, comparer);
    }
}

and you'd use it like this:

var cache = LFUCache<string, int>.Create(10000, StringComparer.InvariantCultureIgnoreCase);
7
Sebastian Schumann On

Okay next try. Here is an implementation for Add and Touch for LFU:

public class LfuCache<TKey, TValue>
{
    private readonly Dictionary<TKey, LfuItem> _items;

    private readonly int _limit;

    private LfuItem _first, _last;

    public LfuCache(int limit, IEqualityComparer<TKey> keyComparer = null)
    {
        this._limit = limit;
        this._items = new Dictionary<TKey,LfuItem>(keyComparer);
    }

    public void Add(TKey key, TValue value)
    {
        if (this._items.Count == this._limit)
        {
            this.RemoveLast();
        }

        var lfuItem = new LfuItem { Key = key, Value = value, Prev = this._last };
        this._items.Add(key, lfuItem);

        if (this._last != null)
        {
            this._last.Next = lfuItem;
            lfuItem.Prev = this._last;
        }

        this._last = lfuItem;

        if (this._first == null)
        {
            this._first = lfuItem;
        }
    }

    public TValue this[TKey key]
    {
        get
        {
            var lfuItem = this._items[key];
            ++lfuItem.UseCount;

            this.TryMoveUp(lfuItem);

            return lfuItem.Value;
        }
    }

    private void TryMoveUp(LfuItem lfuItem)
    {
        if (lfuItem.Prev == null || lfuItem.Prev.UseCount >= lfuItem.UseCount) // maybe > if you want LRU and LFU
        {
            return;
        }

        var prev = lfuItem.Prev;
        prev.Next = lfuItem.Next;
        lfuItem.Prev = prev.Prev;
        prev.Prev = lfuItem;

        if (lfuItem.Prev == null)
        {
            this._first = lfuItem;
        }
    }

    private void RemoveLast()
    {
        if (this._items.Remove(this._last.Key))
        {
            this._last = this._last.Prev;
            if (this._last != null)
            {
                this._last.Next = null;
            }
        }
    }

    private class LfuItem
    {
        public TKey Key { get; set; }

        public TValue Value { get; set; }

        public long UseCount { get; set; }

        public LfuItem Prev { get; set; }

        public LfuItem Next { get; set; }
    }
}

In my opinion it looks like that Add and Touch is in O(1), isn't it?

Currently I don't see any use case for _first but maybe anyone else need it. To remove an item _last should be enough.

EDIT A single linked list will also do if you don't need a MoveDown operation. EDIT No a single linked list will not work because MoveUp need the Next pointer to change it's Prev pointer.

18
Servy On

It's not possible to implement an IComparer from an IEqualityComparer as you have no way of knowing whether an unequal item is greater than or less than the other item.

It's not possible to implement an IEqualityComparer from an IComparer as there's no way for you to generate a hash code that is in line with the IComparer's identity.

That said, there's no need for you to have both types of comparers in your case. When computing LRU you're comparing the time since an item was used as the primary comparer and then comparing based on a passed in comparer as a tiebreaker. Just remove that last part; don't have a tiebreaker. Let it be undefined which item leaves the cache when there is a tie for the least recently used. When you do that you only need to accept an IEqualityComparer, not an IComparer.

0
Ama On

Instead of taking an IEqualityComparer and an IComparer in your constructor, you could try taking an IComparer and a lambda which defines GetHashCode(). Then build an IEqualityComparer based on if(IComparer==0) and GetHashCode() = lambda.

Although I would say it is small, you still have the risk of getting HashCode mismatches when IComparer returns 0. If you want to make it super clear to the user of your code, you could always extend the strategy by taking two lambdas in the constructor: Func<T,T,int> used for both IComparer and IEqualityComparer, and Func<T,int> for GetHashCode.