Python dictionary size vs object size efficiency


Could someone explain what happens to memory behind the scenes when working with dictionaries and objects in the following example:

In [52]: class O(object):
....:         var1 = 'asdfasdfasfasdfasdfasdfasdf'
....:         var2 = 255
....: 

In [53]: dt = {'var1': 'asdfasdfasfasdfasdfasdfasdf', 'var2': 255}

In [55]: o = O()

In [57]: sys.getsizeof(o)
Out[57]: 64

In [58]: sys.getsizeof(dt)
Out[58]: 280

The next results seem odd given the values above:

In [68]: sys.getsizeof(o.var1)
Out[68]: 64

In [69]: sys.getsizeof(o.var2)
Out[69]: 24

In [70]: sys.getsizeof(dt['var1'])
Out[70]: 64

In [71]: sys.getsizeof(dt['var2'])
Out[71]: 24

The values stored in both data structures are the same size, but the difference between the two containers makes me wonder what is happening behind the scenes.

Does the example make objects more efficient than dictionaries?

I use Ubuntu 14.04 and Python 2.7.6


There are 2 answers

jonrsharpe (best answer)

Note that sys.getsizeof gives you the size of the object itself, but that's not the whole story. An object has various attributes that also contribute to the overall memory footprint. For example, an instance of a class has a __dict__, which holds the values of its attributes:

>>> o = O()
>>> o.__dict__
{}
>>> sys.getsizeof(o.__dict__)
140

Note three interesting things:

  1. This is also a dictionary - this data structure is used a lot under the hood in Python, and is extremely well-optimised as a result;
  2. There is nothing in o.__dict__, because var1 and var2 are class attributes, stored on O, not instance attributes; and
  3. Even though there's nothing in o.__dict__, it's still the same size as dt, because dictionaries are initialised with (IIRC) eight slots so that they don't need resizing as soon as you add a few items to them (for more information on the implementation of dictionaries, see "The Mighty Dictionary").
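The second point is easy to check directly. The following is a minimal sketch that reuses the class from the question; the variable names instance_attrs_before, on_class and instance_attrs_after are only there for illustration. It shows that the names live on the class until something is assigned through the instance:

```python
class O(object):
    var1 = 'asdfasdfasfasdfasdfasdfasdf'  # class attribute, stored on O
    var2 = 255                            # class attribute, stored on O

o = O()

# Nothing has been assigned on the instance yet, so its __dict__ is empty.
instance_attrs_before = dict(o.__dict__)

# The names are found in the class namespace instead.
on_class = 'var1' in O.__dict__ and 'var2' in O.__dict__

# Assigning through the instance creates an instance attribute that
# shadows the class attribute; the class attribute itself is unchanged.
o.var1 = 'changed'
instance_attrs_after = dict(o.__dict__)

print(instance_attrs_before)  # {}
print(on_class)               # True
print(instance_attrs_after)   # {'var1': 'changed'}
print(O.var1)                 # asdfasdfasfasdfasdfasdfasdf
```

So sys.getsizeof(o) in the question measures an instance that holds no data of its own, which is part of why it looks so small next to dt.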

Note also that if we compare the size of the instance plus the class in both cases, which is a fairer comparison, the gap narrows:

>>> sys.getsizeof(o) + sys.getsizeof(O)
484
>>> sys.getsizeof(dt) + sys.getsizeof(dict)
576
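In the same spirit, a rough footprint can be estimated by adding the shallow size of a container to the sizes of the objects it references directly. The sketch below is only an approximation: rough_footprint and the class P are hypothetical names made up for this example, the helper goes just one level deep, and it counts shared objects (such as small integers or interned strings) every time it sees them. The actual numbers depend on the Python version and platform, so none are shown.

```python
import sys

def rough_footprint(obj):
    """Hypothetical helper: shallow size of obj plus what it references directly."""
    total = sys.getsizeof(obj)
    if isinstance(obj, dict):
        # Add the shallow size of every key and value held by the dictionary.
        for key, value in obj.items():
            total += sys.getsizeof(key) + sys.getsizeof(value)
    elif hasattr(obj, '__dict__'):
        # For an instance, also count its attribute dictionary and contents.
        total += rough_footprint(vars(obj))
    return total

class P(object):
    def __init__(self):
        # Instance attributes this time, so the data lives in p.__dict__.
        self.var1 = 'asdfasdfasfasdfasdfasdfasdf'
        self.var2 = 255

dt = {'var1': 'asdfasdfasfasdfasdfasdfasdf', 'var2': 255}
p = P()

dict_total = rough_footprint(dt)
instance_total = rough_footprint(p)

print('dict:', sys.getsizeof(dt), 'shallow,', dict_total, 'rough total')
print('instance:', sys.getsizeof(p), 'shallow,', instance_total, 'rough total')
```

With the data stored on the instance rather than on the class, the instance's rough total includes a dictionary of its own, so the comparison with dt is closer to like for like.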

Does the example make objects more efficient than dictionaries?

Not at all. For one thing, as I've shown, objects are generally implemented using dictionaries (you can avoid creating a __dict__ for each instance by defining __slots__ with a fixed set of attribute names on the class, but I won't get into that here), and dictionaries are themselves objects (although the built-in types are slightly different, for reasons I won't dwell on)!
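For readers curious about the __slots__ option mentioned in passing, here is a minimal sketch. The class names WithDict and WithSlots are invented for this example, and it makes no claim about how much memory is saved, since that varies by Python version and platform.

```python
class WithDict(object):
    def __init__(self):
        self.var1 = 'asdfasdfasfasdfasdfasdfasdf'
        self.var2 = 255

class WithSlots(object):
    # Only these attribute names are allowed; no per-instance __dict__ is created.
    __slots__ = ('var1', 'var2')

    def __init__(self):
        self.var1 = 'asdfasdfasfasdfasdfasdfasdf'
        self.var2 = 255

a = WithDict()
b = WithSlots()

has_dict_a = hasattr(a, '__dict__')
has_dict_b = hasattr(b, '__dict__')

# The trade-off: a slotted instance cannot grow new attributes.
try:
    b.var3 = 'new'
    slots_rejects_new_attr = False
except AttributeError:
    slots_rejects_new_attr = True

print(has_dict_a)              # True
print(has_dict_b)              # False
print(slots_rejects_new_attr)  # True
```

The flexibility lost here is the price of dropping the per-instance dictionary, which is why it is usually only worth doing when you create a very large number of instances.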

In general, don't worry about the memory details unless it becomes a problem - define a class if you need state and behaviour (attributes and methods) and use a dictionary if you only need the state.

matino

A simple comparison of the two indicates that a dictionary is a far more complex object than an instance of a simple class:

>>> dt = {}
>>> dir(dt)
['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', '__doc__',
 '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__',
 '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__ne__',
 '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__',
 '__sizeof__', '__str__', '__subclasshook__', 'clear', 'copy', 'fromkeys', 'get',
 'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 'popitem',
 'setdefault', 'update', 'values', 'viewitems', 'viewkeys', 'viewvalues']

against:

>>> class O(object):
...   var1 = 'asdfasdfasfasdfasdfasdfasdf'
...   var2 = 255
... 
>>> o = O()
>>> dir(o)
['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
 '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__',
 '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
 '__str__', '__subclasshook__', '__weakref__', 'var1', 'var2']