TL;DR
How can I get superkeys to be autovivified in a Python dict when assigning values to subkeys, without also getting them autovivified when checking for subkeys?
Background: Normally in Python, setting values in a nested dictionary requires manually ensuring that higher-level keys exist before assigning to their sub-keys. That is,
my_dict[1][2] = 3
will raise a KeyError unless my_dict[1] already exists, so you first need something like
if 1 not in my_dict:
    my_dict[1] = {}
Now, it is possible to set up a kind of autovivification by making my_dict an instance of a class that overrides __missing__, as shown e.g. in https://stackoverflow.com/a/19829714/6670909.
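For reference, here is a minimal sketch of the kind of class meant here. It assumes the linked answer uses the usual __missing__ pattern; the exact code there may differ.

```python
class Vividict(dict):
    """dict subclass that creates nested Vividicts on missing-key lookups."""

    def __missing__(self, key):
        # dict.__getitem__ calls __missing__ when the key is absent, so any
        # read such as vd[1] stores and returns a fresh nested Vividict.
        value = self[key] = type(self)()
        return value


vd = Vividict()
vd[1][2] = 3  # vd[1] is created on the fly by __missing__
print(vd)     # {1: {2: 3}}
```

Because __missing__ is triggered by every failed lookup, it cannot tell a read from the first step of an assignment.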
Question: However, that solution silently autovivifies higher-level keys if you check for the existence of a sub-key in such a nested dict. That leads to the following unfortunateness:
>>> vd = Vividict()
>>> 1 in vd
False
>>> 2 in vd[1]
False
>>> 1 in vd
True
How can I avoid that misleading result? In Perl, by the way, I can get the desired behavior by doing
no autovivification qw/exists/;
And basically I'd like to replicate that behavior in Python if possible.
This is not an easy problem to solve, because in your example, my_dict[1][2] = 3, the my_dict[1] part results in a __getitem__ call on the dictionary. There is no way at that point to know that an assignment is being made. Only the last [] in the sequence is a __setitem__ call, and it can't succeed unless my_dict[1] exists, because otherwise, what object are you assigning into?
So don't use autovivification. You can use setdefault() instead, with a regular dict, e.g. my_dict.setdefault(1, {})[2] = 3.
Now that's not exactly pretty, especially when you are nesting more deeply, so you might write a helper method.
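A sketch of both ideas, assuming a plain dict. The helper name nested_set is made up for illustration, and it is written as a free function rather than a method to keep the example short.

```python
my_dict = {}

# setdefault returns the existing value for the key, or stores and returns
# the default, so the intermediate dict is only created on assignment.
my_dict.setdefault(1, {})[2] = 3

# Checking for a sub-key with .get() never inserts anything.
print(5 in my_dict.get(4, {}))  # False
print(4 in my_dict)             # False


def nested_set(d, keys, value):
    """Assign value at d[keys[0]][keys[1]]..., creating dicts as needed."""
    *parents, last = keys
    for key in parents:
        d = d.setdefault(key, {})
    d[last] = value


nested_set(my_dict, (1, 7, 8), "deep")
print(my_dict)  # {1: {2: 3, 7: {8: 'deep'}}}
```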
But even better is to wrap this into a new __setitem__ that takes all the indexes at once, as in my_dict[1, 2] = 3, instead of requiring the intermediate __getitem__ calls that induce the autovivification. This way, we know from the beginning that we're doing an assignment and can proceed without relying on autovivification.
For consistency, you could also provide a __getitem__ that accepts keys in a tuple.
The only downside I can think of is that we can't use tuples as dictionary keys as easily: we have to write that as, e.g., my_dict[(1, 2),].
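The tuple-key __setitem__ and __getitem__ described above could be sketched roughly as follows. The class name TupleKeyDict is a placeholder, intermediate levels are assumed to be plain dicts, and an empty tuple key is not handled.

```python
class TupleKeyDict(dict):
    """dict whose tuple keys are treated as a path through nested dicts."""

    def __setitem__(self, keys, value):
        if not isinstance(keys, tuple):
            return super().__setitem__(keys, value)
        node = self
        for key in keys[:-1]:
            # setdefault creates intermediate levels only during assignment.
            node = node.setdefault(key, {})
        dict.__setitem__(node, keys[-1], value)

    def __getitem__(self, keys):
        if not isinstance(keys, tuple):
            return super().__getitem__(keys)
        node = self
        for key in keys:
            # Plain lookup: a missing key raises KeyError, nothing is created.
            node = dict.__getitem__(node, key)
        return node


my_dict = TupleKeyDict()
my_dict[1, 2] = 3
print(my_dict[1, 2])     # 3
print(2 in my_dict[1])   # True

try:
    my_dict[5, 6]        # failed lookup of a nested path
except KeyError:
    pass
print(5 in my_dict)      # False: the lookup did not create my_dict[5]

my_dict[(1, 2),] = "tuple key"   # a real tuple key needs the extra comma
print(my_dict[(1, 2),])  # tuple key
```

Note that in this sketch a lookup on a missing path raises KeyError rather than returning False, so existence checks still need a try/except or a chain of .get() calls.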