Iterables difference and common elements

Given 2 iterables/lists, the goal is to extract the common and different elements from both lists.

For example, given:

x, y = [1,1,5,2,2,3,4,5,5], [2,3,4,5]

The goal is to achieve:

common = [2,3,4,5]
x_only = [1,1,5,2,5]
y_only = []

Explanation:

  • For the element 2, x has 2 occurrences and y has 1, so one [2] goes in common and the other [2] goes in x_only.
  • Similarly for 5, x has 3 occurrences and y has 1, so one [5] goes in common and the other [5,5] go in x_only.
  • The order of the elements in common, x_only and y_only is not important.
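Since order does not matter, the three results can be checked as multisets. The snippet below is only a sanity check of the example above, not a solution: it relies on the fact that comparing two Counter objects ignores order, and it asserts that common plus each "only" list adds back up to the original input.

```python
from collections import Counter

x, y = [1, 1, 5, 2, 2, 3, 4, 5, 5], [2, 3, 4, 5]

# Expected results from the example above.
common = [2, 3, 4, 5]
x_only = [1, 1, 5, 2, 5]
y_only = []

# Counter equality ignores order, so the lists are compared as multisets.
assert Counter(common) + Counter(x_only) == Counter(x)
assert Counter(common) + Counter(y_only) == Counter(y)

# No element may be left over on both sides at once.
assert not (Counter(x_only) & Counter(y_only))
```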

I've tried:

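from collections import Counter

# Sample input (a shorter x than in the example above)
x, y = [1,1,5,2,3,4,5], [2,3,4,5]

# Counter subtraction keeps only positive counts; & takes the minimum count per element
x_only = list((Counter(x) - Counter(y)).elements())
y_only = list((Counter(y) - Counter(x)).elements())
common = list((Counter(x) & Counter(y)).elements())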

The above attempt achieves the objective, but it seems redundant to rebuild the same Counter objects for each subtraction and intersection. It will work well for small iterables but not for a large list, e.g. 1-10 billion items.
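On the size concern, a rough sketch under two assumptions: the items are hashable, and the number of distinct values (not the number of items) fits in memory. Counter accepts any iterable, so the input never has to exist as a list, and elements() is a lazy iterator, so the output does not have to be a list either. The generator read_items is a hypothetical stand-in for a real data source.

```python
from collections import Counter

def read_items(n):
    # Hypothetical stand-in for a large source (file, database cursor, ...).
    for i in range(n):
        yield i % 7

# Counter consumes the generator one item at a time; memory use grows
# with the number of distinct values, not with the number of items.
cx = Counter(read_items(1_000))
cy = Counter(read_items(10))

common = cx & cy

# elements() is lazy, so results can be streamed instead of stored.
total_common = sum(1 for _ in common.elements())
```

If the distinct values themselves do not fit in memory, this approach no longer applies and some form of external sorting or partitioning would be needed.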

There is 1 answer

ShadowRanger On

Your solution is fine; just remove the redundancy by constructing each unique Counter only once. Making the intersection first means less work is needed to compute the unique results, and it lets you do most of the work in place (so only three Counter objects are constructed, not nine):

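from collections import Counter

x, y = [1,1,5,2,3,4,5], [2,3,4,5]

# Count each input exactly once
cx = Counter(x)
cy = Counter(y)
# Intersection: minimum count of each element across both inputs
both = cx & cy
# In-place subtraction leaves only the surplus on each side
cx -= both
cy -= both
x_only = list(cx.elements())
y_only = list(cy.elements())
common = list(both.elements())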

It's more lines, but less work is done, and each line is far simpler.
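For reuse, the same steps can be wrapped in a small function. This is just a sketch: the name split_common is made up for illustration, and the input is the longer example from the question.

```python
from collections import Counter

def split_common(x, y):
    """Return (common, x_only, y_only) as lists, treating x and y as multisets."""
    cx, cy = Counter(x), Counter(y)
    both = cx & cy   # minimum count of each element
    cx -= both       # in place: surplus that exists only in x
    cy -= both       # in place: surplus that exists only in y
    return list(both.elements()), list(cx.elements()), list(cy.elements())

common, x_only, y_only = split_common([1, 1, 5, 2, 2, 3, 4, 5, 5], [2, 3, 4, 5])
print(sorted(common), sorted(x_only), sorted(y_only))
# prints: [2, 3, 4, 5] [1, 1, 2, 5, 5] []
```

The results are sorted only for display, since the order of the output is not part of the requirement.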