The goal is to find the intersection of 2 arrays. I know you're supposed to use sets to get a faster runtime. However, in my notebook I was surprised to find that my runtime is not faster with the sets method. (89 ms (array) vs. 220 ms (sets)). Is there something wrong with my code or something I'm misunderstanding?
A = range(50,1000000)
B = range(-1000000,60)
%%time
# O(m*n)?
def intersection(A, B):
array = []
for i in A:
if i in B:
array.append(i)
return array
intersection(A,B)
%%time
# O(1)?
def intersection(A, B):
set_a = set(A)
set_b = set(B)
return [i for i in set_a if i in set_b]
# return set_a.intersection(set_b)
intersection(A,B)
You are not comparing arrays to sets.
You are comparing ranges to sets.
Checking the membership in a range is almost instantaneous (two comparisons: one with the lower, one with the upper bound).
Checking the membership in an array would take millions of iterations in this case, which would be substantially longer than membership checks for sets.
If you convert the
ranges tolists first, you see a huge difference:It prints something like
There is no point trying it with
1000000- it would take hours.Furthermore, in order to measure something vaguely realistic, you should shuffle the numbers, otherwise branch prediction might distort the results.