Use integer keys in Berkeley DB with python (using bsddb3)


I want to use BDB as a time-series data store, and I plan to use microseconds since the epoch as the keys. I am using BTREE as the database type.

However, when I try to store integer keys, bsddb3 gives an error saying TypeError: Integer keys only allowed for Recno and Queue DB's.

What is the best workaround? I could store them as strings, but that will probably make it unnecessarily slow.

Given that BDB itself can handle any kind of data, why is there a restriction? Can I hack the bsddb3 implementation? Has anyone used any other methods?


There are 2 answers

0
xcorat On BEST ANSWER

Well, there's no way to pass integers directly as BTREE keys, but you can use one of two approaches:

  1. Store the integers as strings using str or repr. If the ints are big, you can even use string formatting.

  2. Use the cPickle/pickle module to store and retrieve data. This is a good option if you have data types other than the basic ones. For basic ints and floats, it is actually slower and takes more space than just storing strings.
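The two approaches above can be sketched as follows. This is only an illustration: the helper names are hypothetical, and zero-padding to a fixed width of 20 digits is an assumption about what "string formatting" could mean here (it is enough digits for any unsigned 64-bit value). The keys are encoded to bytes, as Python 3 would need.

```python
import pickle

KEY_WIDTH = 20  # assumption: enough digits for any unsigned 64-bit integer


def int_to_padded_key(n, width=KEY_WIDTH):
    """Encode a non-negative int as a fixed-width, zero-padded ASCII key."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    return str(n).zfill(width).encode("ascii")


def padded_key_to_int(key):
    """Decode a key produced by int_to_padded_key back to an int."""
    return int(key.decode("ascii"))


ts = 1700000000123456  # example: microseconds since the epoch

string_key = int_to_padded_key(ts)   # approach 1: string key
pickled_key = pickle.dumps(ts)       # approach 2: pickled key

# Both forms round-trip; print the sizes to compare them on your interpreter.
print(len(string_key), len(pickled_key))
```

Either result is a bytes object that can be passed where the integer key was rejected.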

0
amirouche On

You can't store integers directly, since bsddb doesn't know how to represent an integer as a key or which kind of integer it is.

If you convert your integers to plain decimal strings, the lexicographic ordering of keys in bsddb will no longer match numeric order: 10 > 2, but as strings "10" < "2".
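A quick way to see the ordering problem without a database is to sort the string forms the way a byte-wise comparison would. The sample values are made up for illustration.

```python
# Hypothetical sample keys, converted to plain decimal strings (as bytes).
timestamps = [2, 10, 100, 9]
string_keys = [str(t).encode("ascii") for t in timestamps]

# Byte-wise (lexicographic) order, which is how the keys would be compared.
lexicographic_order = [int(k) for k in sorted(string_keys)]

# Numeric order, which is what a time series needs.
numeric_order = sorted(timestamps)

print(lexicographic_order)
print(numeric_order)
```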

You have to use Python's struct module to convert your integers into strings (or, in Python 3, into bytes) and then store them in bsddb. You have to use big-endian packing, or the ordering will not be correct.
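A minimal sketch of the packing step, assuming the timestamps fit in an unsigned 64-bit integer (the `>Q` format). The helper names `pack_key` and `unpack_key` are hypothetical. It also packs the same values little-endian to show why the byte order matters.

```python
import struct


def pack_key(ts):
    """Pack a non-negative int into 8 big-endian bytes."""
    return struct.pack(">Q", ts)


def unpack_key(key):
    """Recover the int from a key produced by pack_key."""
    return struct.unpack(">Q", key)[0]


values = [2, 10, 2**40, 255, 256]

# Big-endian: sorting the packed bytes gives numeric order.
big_endian_order = [unpack_key(k) for k in sorted(pack_key(v) for v in values)]

# Little-endian: sorting the packed bytes scrambles the numeric order.
little_endian_keys = sorted(struct.pack("<Q", v) for v in values)
little_endian_order = [struct.unpack("<Q", k)[0] for k in little_endian_keys]

print(big_endian_order)
print(little_endian_order)
```

Note that `>Q` only covers non-negative values; negative integers would need a different encoding to sort correctly.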

Then you can use bsddb's Cursor.set_range(key) to query for information in a given slice of time.

For instance, Cursor.set_range(struct.pack('>Q', 123456789)) will set the cursor at the key of the event happening at 123456789, or at the first event that happens after it.
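The lookup described above can be mimicked without a database. The sketch below does not call bsddb3 at all: it uses a sorted list and `bisect` as a stand-in for the BTREE, and the functions `set_range` and `scan` are hypothetical helpers that imitate "first key greater than or equal to the requested one" and a time-slice scan.

```python
import bisect
import struct


def pack_key(ts):
    """Pack a non-negative int into 8 big-endian bytes."""
    return struct.pack(">Q", ts)


def unpack_key(key):
    return struct.unpack(">Q", key)[0]


# Stand-in for the BTREE: (key, value) pairs kept sorted by packed key.
events = sorted((pack_key(t), v) for t, v in [(300, b"c"), (100, b"a"), (200, b"b")])
keys = [k for k, _ in events]


def set_range(ts):
    """Return (timestamp, value) for the first key >= ts, or None if there is none."""
    i = bisect.bisect_left(keys, pack_key(ts))
    if i == len(keys):
        return None
    key, value = events[i]
    return unpack_key(key), value


def scan(start, stop):
    """Return all (timestamp, value) pairs with start <= timestamp < stop."""
    i = bisect.bisect_left(keys, pack_key(start))
    stop_key = pack_key(stop)
    result = []
    while i < len(keys) and keys[i] < stop_key:
        result.append((unpack_key(keys[i]), events[i][1]))
        i += 1
    return result
```

With a real cursor, the same slice would be read by positioning with set_range on the packed start key and then stepping forward until a key at or past the packed stop key appears.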