How to add a custom type to dill's pickleable types


I'm trying to serialize some code that I did not write and cannot modify, and it needs to be pickled/dilled. The script contains a MongoDB collection object. It isn't actually used later, but dilling it fails with this error:

Collection object is not callable.  If you meant to call __getnewargs__ method on a 'Database' object it is failing because no such method exists.

I found code that enumerates the accepted types at https://github.com/uqfoundation/dill/blob/master/dill/_objects.py (lines 132-190), and I suspect this is where I would change something to allow a new type.

However, it's not clear to me what the intended interface is for adding a custom type. Alternatively, is it possible (or easier) to pickle everything except that one object?

Answer by matsjoyce (best answer):

No, the dill._objects module is just a list of types dill can and cannot pickle. Adding to it will only make dill think it can do more; its actual behavior stays the same.

If you want to add a pickler, use dill.register (usually as a decorator). It takes the type to handle and wraps a function that breaks an instance down into picklable parts. For example, given an unpicklable class:

class A:
    def __init__(self, a):
        self.a = a
    def __reduce__(self):
        raise GoAwayError()

Trying to pickle an instance of A will give you:

Traceback (most recent call last):
  File "d.py", line 9, in <module>
    dill.dumps(A(1))
  File "/home/matthew/GitHub/dill/dill/dill.py", line 192, in dumps
    dump(obj, file, protocol, byref, fmode)#, strictio)
  File "/home/matthew/GitHub/dill/dill/dill.py", line 182, in dump
    pik.dump(obj)
  File "/usr/lib/python3.4/pickle.py", line 410, in dump
    self.save(obj)
  File "/usr/lib/python3.4/pickle.py", line 497, in save
    rv = reduce(self.proto)
  File "d.py", line 7, in __reduce__
    raise GoAwayError()
NameError: name 'GoAwayError' is not defined

You can define a pickler like:

def recreate_A(a):
    return A(a)

@dill.register(A)
def save_A(pickler, obj):
    pickler.save_reduce(recreate_A, (obj.a,), obj=obj)

recreate_A is the function used for reconstruction, and (obj.a,) is a tuple of arguments that will be passed to that reconstruction function when loading.
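
Putting the pieces above together, here is a minimal end-to-end sketch of a round trip. It assumes dill is installed. It also defines GoAwayError explicitly so the script runs (the class above leaves it undefined, which is why the traceback ends in a NameError); that definition is an addition for illustration, not part of the original example.

```python
import dill


class GoAwayError(Exception):
    """Added for this sketch so that __reduce__ raises a defined exception."""


class A:
    def __init__(self, a):
        self.a = a

    def __reduce__(self):
        # Default pickling of A always fails.
        raise GoAwayError()


def recreate_A(a):
    # Called at load time with the args tuple given to save_reduce.
    return A(a)


@dill.register(A)
def save_A(pickler, obj):
    # Registered for type A, so dill uses this instead of A.__reduce__.
    pickler.save_reduce(recreate_A, (obj.a,), obj=obj)


restored = dill.loads(dill.dumps(A(1)))
print(type(restored).__name__, restored.a)  # A 1
```

The registration is keyed on the exact type, so for the MongoDB case in the question you would presumably register the collection class itself rather than A.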

This is probably the most flexible way of doing it, since any callable can serve as recreate_A, including the class A itself (which runs A.__init__). Because you are trying to pickle a more complex type, you may need to do some pre- or post-processing in these functions.

Built-in support for skipping objects is still in the works, so you'll have to wait if you want to do it that way. Until then, you can get the same effect by defining recreate_A to take no arguments and return None, and passing an empty args tuple to save_reduce.
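
As a rough sketch of that "replace it with None" workaround: the class Unpicklable and the names recreate_as_none and save_unpicklable below are hypothetical stand-ins, not part of dill. In the question's situation the registered type would presumably be the MongoDB collection class instead.

```python
import dill


class Unpicklable:
    """Hypothetical stand-in for a type that refuses to be pickled."""

    def __reduce__(self):
        raise TypeError("cannot pickle Unpicklable")


def recreate_as_none():
    # Takes no arguments and rebuilds nothing: the object becomes None on load.
    return None


@dill.register(Unpicklable)
def save_unpicklable(pickler, obj):
    # An empty args tuple means recreate_as_none() is called with no arguments.
    pickler.save_reduce(recreate_as_none, (), obj=obj)


state = {"name": "job-1", "handle": Unpicklable()}
restored = dill.loads(dill.dumps(state))
print(restored)  # {'name': 'job-1', 'handle': None}
```

Everything else in the containing object survives the round trip; only the registered type is dropped, so any code that later touches that attribute has to tolerate None.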