Edit 2: It was suggested that this is a copy of a similar question. I'd disagree since my question focuses on speed, while the other question asks what is more "readable" or "better" (without defining better). While the questions are similar, there is a big difference in the discussion/answers given.
EDIT: I realise from the questions that I could have been clearer. Sorry for code typos, yes it should be using the proper python operator for addition.
Regarding the input data, I just chose a list of random numbers since that's a common sample. In my case I'm using a dict where I expect a lot of keyerrors, probably 95% of the keys will not exist, and the few that exist will contain clusters of data.
I'm interested in a general discussion though, regardless of the input data set, but of course samples with running times are interesting.
My standard approach would be like so many other posts to write something like
list = (100 random numbers)
d = {}
for x in list:
if x in d:
d[x]+=1
else:
d[x]=1
But I just came to think of this being faster, since we dont have to check if the dictionary contains the key. We just assume it does, and if not, we handle that. Is there any difference or is Python smarter than I am?
list = (100 random numbers)
d = {}
for x in list:
try:
d[x]+=1
except KeyError:
d[x] = 1
The same approach with indexes in an array, out of bounds, negative indexes etc.
Your claim
is absolutely falsedepends on the input.If you have a diverse set of keys, and hits the
except
block often, the performance is not good. If thetry
block is dominant thetry/except
idiom can be performant on smaller lists.Here is a benchmark showing several ways to do the same thing:
I have included a small benchmark function based on
timeit
that prints the functions from Slowest to Fastest with a percent difference between them.Here is the results for Python 3:
And Python 2:
So -- it depends.
Conclusions:
TheCounter
method is almost always among the slowestCounter
method is among the slowest on Python 2 but by far the fastest on Python 3.4try/except
version is usually among the slowestif key in dict
version is predictably one of the best/fastest regardless of the size or key count{}.fromkeys(tgt, 0)
is very predictabledefaultdict
version is fastest on larger lists. Smaller lists the longer setup time is amortized over too few elements.