Single-file, persistent, sorted key-value store for Java (alternative to Berkeley DB)

Berkeley DB (JE) licensing may be a deal killer. I have a Java application going to a small set of customers but as it is a desktop application, my price cannot support individual instance licensing.

Is there a recommended Java alternative to Berkeley DB, commercial or otherwise? (Good key-value store implementations can get non-trivial, so I prefer to defer maintenance elsewhere.) I need more than just a hash store: I'll need to iterate through subsequent key subsets, which on a basic hash store would make that search O(m*n), and I expect the store to be ~50-60GiB on a desktop machine. As an added benefit, can anyone recommend one that keeps its backing store in a single file?
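To make the access pattern concrete, here is a minimal sketch of the kind of range scan in question. It uses the JDK's in-memory `TreeMap` purely as a stand-in for a persistent sorted store; the class name, method names, and sample keys are hypothetical and not taken from any library discussed here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration class; TreeMap stands in for an on-disk sorted store.
class SortedRangeScanSketch {

    // Returns the values whose keys lie in [fromKey, toKey), in key order.
    // On a sorted map this is one seek followed by a sequential walk,
    // rather than a probe of every key in the store.
    static List<String> rangeScan(NavigableMap<String, String> store,
                                  String fromKey, String toKey) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e
                : store.subMap(fromKey, true, toKey, false).entrySet()) {
            out.add(e.getValue());
        }
        return out;
    }

    // Small made-up data set used only for demonstration.
    static NavigableMap<String, String> sampleStore() {
        NavigableMap<String, String> store = new TreeMap<>();
        store.put("order:001", "o1");
        store.put("user:100", "alice");
        store.put("user:150", "bob");
        store.put("user:200", "carol");
        store.put("zone:1", "z");
        return store;
    }

    public static void main(String[] args) {
        // Scans only the keys from "user:100" up to, but excluding, "user:200".
        System.out.println(rangeScan(sampleStore(), "user:100", "user:200"));
    }
}
```

A store that exposes its data through `NavigableMap` or a comparable cursor API would support this pattern directly; a plain hash store would not.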

There are 10 answers

2
Edwin Buck On

--- Edited after seeing the size of the file ---

50 to 60 GiB files! You would need to be sure that your DB engine doesn't load all of that into memory at once, and that it is very efficient at handling and scavenging off-loaded backing blocks.

I don't know if Cloudscape is up to the task, and I wouldn't be surprised if it wasn't.

--- original post follows ---

Cloudscape often fits the bill. It's a bit more than Berkeley DB, but it gained enough traction to be distributed even with some JDK offerings.

3
JPelletier On

I think SQLite is exactly what you want: free (public domain), single-file database, zero configuration, small footprint, fast, cross-platform, etc. Here is a list of wrappers; there is a section for Java. Take a look at sqlite4java and read more on Java + SQLite here.

3
mdrg On

It won't be a single file, but if you want an embedded database, I suggest Java DB (a rebranded version of Apache Derby, which I used in a previous job with wonderful results).

Plus, both are completely free.

Edit: reading the other comments, another note: Java DB/Derby is 100% Java.

0
harschware On

Consider ehcache. I show here a class for wrapping it as a java.util.Map. You can easily store Lists or other data structures as your values, avoiding the O(m*n) issue you are concerned with. ehcache is under the Apache 2.0 license, with a commercial enterprise version available from Terracotta. The open source version will allow you to spill your cache to disk, and if you choose not to evict cache entries it is effectively a persistent key-value store.
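As a rough sketch of the "Lists as values" idea, the following uses a plain `java.util.HashMap` in place of the cache-backed Map. It is not the wrapper class mentioned above, and the class name, method names, and keys are hypothetical. It assumes the grouping key is known up front.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; HashMap stands in for a cache-backed Map.
class GroupedValuesSketch {

    // Appends an item to the list stored under groupKey,
    // creating the list the first time the key is seen.
    static void append(Map<String, List<String>> store, String groupKey, String item) {
        store.computeIfAbsent(groupKey, k -> new ArrayList<>()).add(item);
    }

    // Small made-up data set used only for demonstration.
    static Map<String, List<String>> sample() {
        Map<String, List<String>> store = new HashMap<>();
        append(store, "2011-03", "invoice-17");
        append(store, "2011-03", "invoice-18");
        append(store, "2011-04", "invoice-19");
        return store;
    }

    public static void main(String[] args) {
        // One hash lookup returns the whole group, in insertion order.
        System.out.println(sample().get("2011-03"));
    }
}
```

With a real disk-backed cache, a list modified in place would probably need to be put back under its key for the change to be persisted, and scans that cross group boundaries still need some ordering over the group keys.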

0
Simon Brandhof On

Persistit is the new challenger. It's a fast, persistent and transactional Java B+Tree library.

I'm afraid that there's no guarantee that it will still be maintained. Akiban, the company supporting Persistit, was recently acquired by FoundationDB. The latter did not provide any information on the future.

https://github.com/akiban/persistit

0
Tom Anderson On

JavaDB aka Derby aka Cloudscape would be a decent choice; it's a pure Java SQL database, and it's included in the JRE, so you don't have to ship it with your code or require users to install it separately.

(It's actually not included in the JRE provided by some Linux package managers, but there it will be a separate package that is trivial to install)

However, Derby has fairly poor performance. An alternative would be H2 - again, a pure Java SQL database that stores a database in a single file, with a ~1MB jar under a redistributable license, but one that is considerably faster and lighter than Derby.

I've happily used H2 for a number of small projects. JBoss liked it enough that they bundled it in AS7. It's trivial to set up, and definitely worth a try.

0
mbelow On

I would just like to point out that the storage backend of H2 can also be used as a key-value storage engine if you do not need SQL / JDBC:

http://www.h2database.com/html/mvstore.html

0
Maurice Perry On

H2 http://www.h2database.com/

It's a full-blown SQL/JDBC database, but it's lightweight and fast.

0
leventov On

Take a look at LMDBJava, Java bindings to LMDB, the fastest sorted ACID key-value store out there.

0
Andrejs On

You should definitely try JDBM2, it does what you want:

  • Disk-backed HashMaps/TreeMaps, so you can iterate through keys.
  • Apache 2 license

In addition:

  • Fast, very small footprint
  • Transactional
  • Standalone jar is only 145 KB.
  • Simple usage
  • Scales well up to 1e9 records
  • Uses Java serialization, no ORM mapping

UPDATE

The project has now evolved into MapDB http://www.mapdb.org