Why do dictionaries in Python check for a key-existing during assignment?

155 views Asked by At

So I heard that dict instances in Python will check if a key already exists before assigning a value. This seems very redundant to me, and I will explain why with an example.
EXAMPLE:

if(key in dictionary):
   dictionary[key] = dictionary[key] + 1
else:
   dictionary[key] = 1

So you can see in this example that, there would be no need to check if a key already exists for dictionary[key] = dictionary[key] + 1. Because the dictionary[key] in the right hand side of the assignment (dictionary[key] + 1) will retrieve the key if it is present, or raise a KeyError if the key does not already exist in the dict instance (note that the check to see if the key is present on the right hand side is not related to assignment). Whether a key already existed or not in the check on the left hand side (dictionary[key] =), will do nothing to the outcome. If no check was done, and you just calculated the hash-value for the key and assigned it that value. It would not matter if the key existed previously or not. ALSO NOTE: That with or without the if(key in dictionary): a redundant check will be performed.

For the second portion of the example dictionary[key] = 1. The presence of a key or not in the check from dictionary[key] does not change the outcome, whatever value was at that previous key (if it was present) will just be overwritten by the 1. So what is the point of the check before the set.

Maybe there is a reason for the implicit check during ASSIGNMENT before the set. It seems that there is a seperate check performed if the check is for a dict instance on the left or right hand side of the assignment. Because a missing key does not throw a KeyError on the left hand side, a new key is just made instead. I can't see an example of the need for the check prior to a set, but if you do! I would love to hear.

1

There are 1 answers

5
wim On BEST ANSWER

It is actually true that Python will check whether a key is already existing in the dict when setting a new value. However, this check happens at the C level, not in Python code like shown in the question. The check is done by a helper function _Py_dict_lookup in CPython.

There are good reasons why the implementation must check if key is already existing:

  • When replacing a key, the reference count for the old value must be decremented (no memory leaks)
  • When replacing a key, CPython guarantees a backing storage resize will not be triggered (user can modify an item during dict iteration).
  • To distinguish "key exists" from "hash collision", an equality check may be necessary between new key and old key (collision handling)

When writing Python code, you don't need to do this kind of check. It would be more normal to use something like:

dictionary[key] = dictionary.get(key, 0) + 1

Or, if you're just counting keys, use a dict subclass such as collections.Counter instead.