Performance of element-compare in java collections

393 views Asked by At

Problem:

Given two Collection<?>s, check if both contain the same elements.

  • Assuming that the actual implementation of the collection is unknown
  • Assuming that the elements do not occur in the same order
  • Assuming that no element does occur twice in the same collection

Solution 1:

boolean equals = c1.containsAll(c2) && c2.containsAll(c1);

Solution 2:

boolean equals = new HashSet<?>(c1).equals(new HashSet<?>(c2));

I would assume Solution 2 to be more efficient (O(n)) than Solution 1 (O(n^2)).

Am I correct or did I miss something?

2

There are 2 answers

4
dimo414 On BEST ANSWER

The Big O complexity of these is correct, Solution 1 involves iterating over one list (O (n)) for each item in the other list, which is O (n^2). Solution 2 involves two O (n) copies and iterating over one set (O (n)) and doing O (1) .contains() checks on the other set. All told, that's just O (n).

But depending on your constraints you can do better (not asymptotically better, just a better implementation).

  • Since we're assuming no duplicate elements, there's no need to do the second .containsAll() check. Simply check they're the same size (which might be O (n), but it's still better than duplicating an O (n^2) check) then do the .containsAll().

  • There's no need to convert c2 into a Set, since it will be iterated over anyways; just convert c1 and use .containsAll().

  • You can use instanceof Set to test if c1 or c2 is already a Set, and use that object's .containsAll() method; this will run in O (n) time even if the other object is not a set, and avoids the copying overhead that Solution 2 has.

0
maaartinus On

As stated by dimo414, in general, yes. But you can always do better (not asymptotically better, just faster). And it gets more complicated:

if (c1.size() != c2.size()) {
    return false;
} else if (c1 instanceof Set) {
    return c1.containsAll(c2);
} else if (c2 instanceof Set) {
    return c2.containsAll(c1);
} else {
    return new HashSet<>(c1).containsAll(c2);
}

And there are collections having slow size, which you may want to handle specially...