When working on an AoC puzzle, I found I wanted to subtract lists (preserving ordering):
    def bag_sub(list_big, sublist):
        result = list_big[:]
        for n in sublist:
            result.remove(n)
        return result
I didn't like that the list.remove call (which is itself O(n)) is contained within the loop; that seems needlessly inefficient. So I tried to rewrite it to avoid that:
    from collections import Counter

    def bag_sub(list_big, sublist):
        c = Counter(sublist)
        result = []
        for k in list_big:
            if k in c:
                c -= Counter({k: 1})
            else:
                result.append(k)
        return result
Is this now O(n), or does the Counter.__isub__ usage still screw things up?

This approach requires that elements be hashable, a restriction the original didn't have. Is there an O(n) solution which avoids creating this additional restriction?

Does Python have any better "bag" datatype than collections.Counter?
You can assume sublist is half the length of list_big.
This would be expected-case O(n), except that when Counter.__isub__ discards nonpositive values, it goes through every key to do so. You're better off just subtracting 1 from the count the "usual" way and checking c[k] instead of k in c. (c[k] is 0 for k not in c, so you don't need an in check.)

As for an O(n) solution without the hashability restriction: only if the inputs are sorted, in which case a standard variant of a mergesort merge can do it.
collections.Counter is Python's bag.
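The mergesort-merge variant for sorted inputs might look like the following sketch (sorted_bag_sub is an illustrative name, and it assumes both inputs are sorted and sublist is a sub-bag of list_big). It needs only comparison, not hashing, and runs in a single O(n) pass:

```python
def sorted_bag_sub(list_big, sublist):
    # Both inputs must be sorted; elements need only support `==`,
    # not hashing. Assumes every element of sublist occurs in list_big.
    result = []
    i = 0
    for x in list_big:
        if i < len(sublist) and sublist[i] == x:
            i += 1  # matched: consume one occurrence from sublist
        else:
            result.append(x)
    return result
```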