Why do std::ranges::set_difference, std::ranges::set_intersection do not work on std::unordered_set?

106 views Asked by At

I know the reason is that those algorithms are specified to require sorted ranges as input, so unordered_set will not work, but I wonder why those algorithms were not specified to understand unordered set containers.

In other words I understand that if I just give a range(2 iterators) to those algorithms they can not know how to efficiently look up elements in that unsorted range, but when I provide it a container that has member .contains() it seems like they could do it(with terrible worst case complexity, but with same complexity as the regular set operations in happy cases).

My guess is that those algorithms already have gigantic list of requirements, and that work to handle this case(for example making sure they do not work on multiset containers) was not deemed worthy of effort for such rarely used algorithms.

But it is possible I am missing some other reason why this is not possible.

2

There are 2 answers

1
Marshall Clow On

You mentioned the answer in "terrible worst case complexity".

using sorted ranges you can calculate the difference/intersection in linear complexity (M+N), but with unordered collections you have quadratic complexity, because even if the lookups in the unordered container are constant time (which are not a given), then you have to compare each element of the first container to each element of the second container, so you have M*N comparisons.

Where M and 'N' are the sizes the containers.

0
Jan Schultke On

My guess is that those algorithms already have gigantic list of requirements

That's not the case. The problem is the opposite: algorithms in <algorithm> are designed to work with iterators, with no awareness of the container. The requirements are quite minimal, though the iterators need to point to an ordered range to make the algorithm efficient.

but when I provide it a container that has member .contains() it seems like they could do it

They could not do it because the iterators don't have any way to access the container. The iterators cannot use .contains() at all. You would need a completely new overload for std::set_difference which works with a std::unordered_set directly, similar to std::erase_if(std::unordered_set)

Alternatively, member functions like .difference(...) could be added to std::unordered_set. In the meantime, you can easily make your function.

std::unordered_set<int> a, b;
// ...

// a can be turned into the intersection of a and b via:
std::erase_if(a, [&b](int e) {
    return !b.contains(e);
});

See also: C++ library method for intersection of two unordered_set