How to implement autovivification for nested dictionary ONLY when assigning values?

Question

Asked by J. Lerman on 08 February 2017 at 20:03 · 1k views

TL;DR
How can I get superkeys to be autovivified in a Python dict when assigning values to subkeys, without also getting them autovivified when checking for subkeys?

Background: Normally in Python, setting values in a nested dictionary requires manually ensuring that higher-level keys exist before assigning to their sub-keys. That is,

my_dict[1][2] = 3

raises a KeyError when my_dict is a plain dict that has no key 1 yet, unless you first do something like

if 1 not in my_dict:
    my_dict[1] = {}

Now, it is possible to set up a kind of autovivification by making my_dict an instance of a class that overrides __missing__, as shown e.g. in https://stackoverflow.com/a/19829714/6670909.
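
For reference, here is a minimal sketch of such a class. It assumes the class looks roughly like the one in the linked answer; the name Vividict is taken from the session shown further down in this question, and the exact implementation there may differ.

```python
class Vividict(dict):
    """dict subclass that creates missing keys on lookup."""

    def __missing__(self, key):
        # dict.__getitem__ calls __missing__ when the key is absent;
        # store a fresh nested Vividict under the key and return it.
        value = self[key] = type(self)()
        return value


my_dict = Vividict()
my_dict[1][2] = 3          # works without creating my_dict[1] first
print(my_dict)             # {1: {2: 3}}
```

Because __missing__ fires on every failed lookup, it cannot tell a read from the first half of a nested assignment.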

Question: However, that solution silently autovivifies higher-level keys whenever you check for the existence of a sub-key in such a nested dict. That leads to the following unfortunate behavior:

>>> vd = Vividict()
>>> 1 in vd
False
>>> 2 in vd[1]
False
>>> 1 in vd
True

How can I avoid that misleading result? In Perl, by the way, I can get the desired behavior by doing

no autovivification qw/exists/;

And basically I'd like to replicate that behavior in Python if possible.

Original Q&A

There are 2 answers

**kindall** · Answer 1 · 2017-02-08T20:07:42+00:00

This is not an easy problem to solve, because in your example:

my_dict[1][2] = 3

my_dict[1] results in a __getitem__ call on the dictionary. There is no way at that point to know that an assignment is being made. Only the last [] in the sequence is a __setitem__ call, and it can't succeed unless my_dict[1] exists, because otherwise, what object are you assigning into?
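
To see that call sequence for yourself, here is a small sketch that records which special methods run during a nested assignment. TracingDict is a hypothetical helper written only for this illustration.

```python
class TracingDict(dict):
    """Hypothetical helper: logs every __getitem__ / __setitem__ call."""

    calls = []  # shared log across all instances

    def __getitem__(self, key):
        TracingDict.calls.append(("__getitem__", key))
        return super().__getitem__(key)

    def __setitem__(self, key, value):
        TracingDict.calls.append(("__setitem__", key, value))
        super().__setitem__(key, value)


outer = TracingDict()
inner = TracingDict()
dict.__setitem__(outer, 1, inner)   # set up outer[1] without logging it
TracingDict.calls.clear()

outer[1][2] = 3
print(TracingDict.calls)
# [('__getitem__', 1), ('__setitem__', 2, 3)]
```

The outer dictionary only ever sees a plain lookup of key 1; the assignment is delivered to whatever object that lookup returned.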

So don't use autovivification. You can use setdefault() instead, with a regular dict.

my_dict.setdefault(1, {})[2] = 3

Now that's not exactly pretty, especially when you are nesting more deeply, so you might write a helper method:

class MyDict(dict):
    def nest(self, keys, value):
        # Walk down to the innermost dict, creating levels as needed.
        node = self
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value

my_dict = MyDict()
my_dict.nest((1, 2), 3)       # my_dict[1][2] = 3

But even better is to wrap this into a new __setitem__ that takes all the indexes at once, instead of requiring the intermediate __getitem__ calls that induce the autovivification. This way, we know from the beginning that we're doing an assignment and can proceed without relying on autovivification.

class MyDict(dict):
    def __setitem__(self, keys, value):
        if not isinstance(keys, tuple):
            return dict.__setitem__(self, keys, value)
        node = self
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        dict.__setitem__(node, keys[-1], value)

my_dict = MyDict()
my_dict[1, 2] = 3

For consistency, you could also provide __getitem__ that accepts keys in a tuple as follows:

def __getitem__(self, keys):
    if not isinstance(keys, tuple):
        return dict.__getitem__(self, keys)
    node = self
    for key in keys:
        node = dict.__getitem__(node, key)
    return node

The only downside I can think of is that we can't use tuples as dictionary keys as easily: we have to write that as, e.g. my_dict[(1, 2),].
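
Putting the two methods together, here is a self-contained sketch of how the tuple-index approach behaves, including the tuple-as-key workaround. It uses a local variable named node instead of rebinding self, which is purely a stylistic choice of this sketch.

```python
class MyDict(dict):
    def __setitem__(self, keys, value):
        if not isinstance(keys, tuple):
            return dict.__setitem__(self, keys, value)
        node = self
        for key in keys[:-1]:
            # Intermediate levels are created only on the write path.
            node = node.setdefault(key, {})
        dict.__setitem__(node, keys[-1], value)

    def __getitem__(self, keys):
        if not isinstance(keys, tuple):
            return dict.__getitem__(self, keys)
        node = self
        for key in keys:
            node = dict.__getitem__(node, key)
        return node


d = MyDict()
d[1, 2] = 3                   # nested assignment: d[1][2] = 3
d[(1, 2),] = "tuple key"      # one-element path whose only key is (1, 2)

print(d[1, 2])                # 3
print(d[(1, 2),])             # tuple key

# Reading a missing path raises KeyError and creates nothing.
try:
    d[7, 8]
except KeyError:
    pass
print(7 in d)                 # False
```

Note that the intermediate levels created by setdefault are plain dicts, so the tuple syntax is only understood at the top level.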

**dhke** · Answer 2 · 2017-02-08T21:24:23+00:00

The proper answer is: don't do this in Python, since explicit is better than implicit.

But if you really want autovivification that does not keep empty sub-dictionaries, you can emulate the behavior in Python.

try:
    from collections.abc import MutableMapping  # Python 3.3+
except ImportError:
    from collections import MutableMapping      # Python 2


class AutoDict(MutableMapping, object):
    def __init__(self, *args, **kwargs):
        super(AutoDict, self).__init__()
        self.data = dict(*args, **kwargs)

    def __getitem__(self, key):
        if key in self.data:
            return self.data.__getitem__(key)
        else:
            return ChildAutoDict(parent=self, parent_key=key)

    def __setitem__(self, key, value):
        return self.data.__setitem__(key, value)

    def __delitem__(self, key):
        return self.data.__delitem__(key)

    def __iter__(self):
        return self.data.__iter__()

    def __len__(self):
        return self.data.__len__()

    def keys(self):
        return self.data.keys()

    def __contains__(self, key):
        return self.data.__contains__(key)

    def __str__(self):
        return str(self.data)

    def __unicode__(self):
        return unicode(self.data)

    def __repr__(self):
        return repr(self.data)

class ChildAutoDict(AutoDict):
    def __init__(self, parent, parent_key):
        super(ChildAutoDict, self).__init__()
        self.parent = parent
        self.parent_key = parent_key

    def __setitem__(self, key, value):
        if self.parent is not None:
            if self.parent_key not in self.parent:
                # first assignment: add ourselves to the parent; going
                # through its __setitem__ lets the parent attach itself too
                self.parent[self.parent_key] = self
            elif self.parent.data[self.parent_key] is not self:
                # the parent got a different value for our key in the
                # meantime, so don't add ourselves
                self.parent = None
        return self.data.__setitem__(key, value)

    def __delitem__(self, key):
        ret = self.data.__delitem__(key)
        # only remove ourselves from the parent if we are now empty
        # and still occupying our slot.
        if (not self and self.parent is not None
                and self.parent.data.get(self.parent_key) is self):
            del self.parent[self.parent_key]
        return ret

What you get back from the __getitem__() is essentially a dictionary facade that adds itself to the parent dictionary only if itself is not empty and removes itself once it becomes empty.

All of this, of course, stops working once you assign a "normal" dictionary somewhere in the middle, e.g. d[2] = {} or d[2][3] = {}: a plain dict neither autovivifies on lookup nor removes itself from its parent when it becomes empty.

I have not really tested this thoroughly, so beware of more pitfalls.

>>> d = AutoDict()

>>> print(1 in d)
False
>>> print(d)
{}

>>> print(d[2][3])
{}
>>> print(d[2])
{}
>>> print(d)
{}

>>> d[2][3] = 1
>>> print(d)
{2: {3: 1}}

>>> del d[2][3]
>>> print(d)
{}
