What Python modules are available to save and load data?


There are many scattered posts on StackOverflow about Python modules used to save and load data.

I am familiar with json and pickle, and I have heard of pytables too. There are probably more out there. Each module also seems to fit a certain purpose and has its own limits (e.g. loading a large list or dictionary with pickle takes ages, if it works at all). Hence it would be nice to have a proper overview of the possibilities.

Could you help provide a comprehensive list of modules used to save and load data, describing for each module:

  • what the general purpose of the module is,
  • its limits,
  • why you would choose this module over others?

There are 2 answers

qiao

marshal

  • Pros:

    • Reads and writes Python values in a binary format, and is often faster than pickle for the types it supports. (pickle is text-based only under protocol 0; its higher protocols are binary as well.)
  • Cons:

    • Not all Python object types are supported. According to the documentation at the time, some unsupported types, such as subclasses of builtins, will appear to marshal and unmarshal correctly, but in fact their type changes and any additional subclass functionality and instance attributes are lost.
    • It is not intended to be secure against erroneous or maliciously constructed data.
    • The Python maintainers reserve the right to modify the marshal format in backward-incompatible ways should the need arise.
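
As a rough illustration of the points above, here is a minimal sketch of a marshal round trip with built-in types, plus a check of what happens with an instance of a user-defined class. The `Point` class and the `can_marshal` helper are hypothetical names made up for this example; only `marshal.dumps` and `marshal.loads` come from the standard library.

```python
import marshal

# Plain built-in types (dict, list, str, int, float, bool, None) round-trip.
record = {"name": "example", "values": [1, 2.5, None], "flag": True}
blob = marshal.dumps(record)      # serialise to a bytes object
restored = marshal.loads(blob)    # rebuild an equal object from the bytes


class Point:
    """Hypothetical user-defined class, used only to show the limitation."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def can_marshal(obj):
    """Hypothetical helper: report whether marshal accepts obj."""
    try:
        marshal.dumps(obj)
        return True
    except ValueError:
        # marshal raises ValueError for object types it does not support.
        return False


point_supported = can_marshal(Point(1, 2))
```

If instances of your own classes need to be stored, pickle (or shelve, which builds on it) is probably the more suitable choice.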

shelve

  • Pros:

    • Values in a shelf can be essentially arbitrary Python objects (anything that pickle can handle); keys are ordinary strings.
  • Cons:

    • Does not support concurrent read/write access to shelved objects.
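
A minimal sketch of how a shelf might be used, assuming Python 3.4 or later (where `shelve.open` works as a context manager). The file name `example_shelf` and the stored keys are made up for illustration, and a temporary directory is used so the example cleans up after itself.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical file name; the dbm backend may add its own extension.
    path = os.path.join(tmpdir, "example_shelf")

    # Write: a shelf behaves like a dict whose keys are strings and
    # whose values are pickled behind the scenes.
    with shelve.open(path) as shelf:
        shelf["config"] = {"retries": 3, "hosts": ["a", "b"]}
        shelf["scores"] = [0.5, 0.75]

    # Read: reopen the same file and get the objects back.
    with shelve.open(path) as shelf:
        keys = sorted(shelf.keys())
        config = shelf["config"]
        scores = shelf["scores"]
```

Because the values are pickled, the usual pickle caveat applies: do not open a shelf that comes from an untrusted source.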

ZODB (suggested by @Duncan)

  • Pros:

    • Transparent persistence
    • Full transaction support
    • Pluggable storage
    • Scalable architecture
  • Cons:

    • Not part of the standard library.
    • Data cannot easily be reloaded unless the original Python object model used to persist it is available (consider versioning difficulties and data portability).
Gandaro

The Python documentation has an overview of the standard library's data persistence modules.