The code below filters one IEnumerable<T> against another that is used as a blacklist. The filtered collection iterates over content that is fetched remotely and lazily loaded (YouTube API).
IEnumerable<CustomType> contentThatCanBeHuge = this.FetchContentThatCanBeHuge();
IEnumerable<string> blackListContent = this.FetchBlackListContent();
return contentThatCanBeHuge.Where(x => !blackListContent.Contains(x.Id));
The Enumerable.Contains method is O(n) in time complexity, so the Enumerable.Where call could take a while.
On the other hand, HashSet<T>.Contains is O(1). Instantiating a HashSet<T> from an IEnumerable<T> seems to be O(n).
If the blacklist is going to be used multiple times, and without taking space complexity into account, is it a good approach to turn it into a HashSet<T> before using it, or is this just premature optimization?
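For reference, the HashSet variant I have in mind would look roughly like the following minimal sketch. CustomType, ContentFilter and the Fetch* stubs are placeholders standing in for the real lazy YouTube API calls; ToHashSet() would work as well as the HashSet<string> constructor on newer framework versions.

using System.Collections.Generic;
using System.Linq;

public class CustomType
{
    public string Id { get; set; }
}

public class ContentFilter
{
    // Hypothetical stand-ins for the remote, lazily loaded fetches described above.
    private IEnumerable<CustomType> FetchContentThatCanBeHuge() => Enumerable.Empty<CustomType>();
    private IEnumerable<string> FetchBlackListContent() => Enumerable.Empty<string>();

    public IEnumerable<CustomType> GetFilteredContent()
    {
        // Materialize the blacklist once into a set.
        var blackList = new HashSet<string>(this.FetchBlackListContent());

        // Where stays lazy; each Contains lookup against the set is a hash lookup.
        return this.FetchContentThatCanBeHuge().Where(x => !blackList.Contains(x.Id));
    }
}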
Let the size of blackListContent be m and the size of contentThatCanBeHuge be n.

If we don't use a HashSet, time complexity is O(n * O(m)) = O(n * m) and space complexity is O(1): for each item in contentThatCanBeHuge we scan the entire blackListContent.

If we use a HashSet, time complexity is O(m) + O(n * O(1)) = O(n + m) and space complexity is O(m): building the HashSet takes O(m) time and O(m) space, and checking each item of contentThatCanBeHuge against the HashSet takes O(n * O(1)) time.

So far so good: the HashSet makes the code faster, but we consume more memory.
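To make the trade-off concrete, here is a minimal sketch of both variants with the complexities above annotated. BlackListFiltering, FilterLinear and FilterHashed are placeholder names, and the ids are modeled as plain strings rather than the CustomType from the question.

using System.Collections.Generic;
using System.Linq;

public static class BlackListFiltering
{
    // Without a HashSet: O(n * m) time, O(1) extra space.
    // Every id in source triggers a linear scan of blackList.
    public static IEnumerable<string> FilterLinear(
        IEnumerable<string> source, IEnumerable<string> blackList)
    {
        return source.Where(id => !blackList.Contains(id));
    }

    // With a HashSet: O(m) once to build the set, then O(1) average per lookup,
    // so O(n + m) time overall, at the cost of O(m) extra space.
    public static IEnumerable<string> FilterHashed(
        IEnumerable<string> source, IEnumerable<string> blackList)
    {
        var blackListSet = new HashSet<string>(blackList);
        return source.Where(id => !blackListSet.Contains(id));
    }
}

Note that FilterHashed builds the set eagerly when it is called, while the Where over source stays deferred, so the O(m) build cost is paid once per call even if the result is enumerated later.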