Embedded NoSQL open source Java database


I'm developing an open source product and need an embedded DBMS. Can you recommend an embedded open source database that:

  • Can handle objects over 10 GB each
  • Has a license friendly to embedding (LGPL, not GPL).
  • Is pure Java
  • Is (preferably) NoSQL. SQL might work, but I'd prefer NoSQL

I've looked over some of the document DBMSs, like MongoDB, but they seem to be limited to 4 or 16 MB documents.

Berkeley DB looked attractive but has a GPL-like license.

SQLite3 is attractive: good license, and you can compile it with whatever max blob size you like. But it's not Java. I know JDBC drivers exist, but we need a pure Java system.

Any suggestions?

Thanks

Steve

There is 1 answer

Martin Dow

Although it's an old question, I've been looking into this recently and have come across the following options (at least two of which were written after this question was asked). I'm not sure how any of them handle very large objects. At 10 GB you would probably have to do some serious testing, as I presume few database developers have objects of that size in mind for their products (just a guess). I would definitely consider storing the objects directly on disk and keeping just a reference to the file location in your database.
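To make the "file on disk plus a reference in the database" idea concrete, here is a minimal sketch using only the JDK. The class name `FileBackedBlobStore` and its methods are hypothetical, and the in-memory map is only a stand-in for whichever embedded database you pick; it is not the API of any product mentioned here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: large payloads live in plain files, the "database" holds only their paths. */
final class FileBackedBlobStore {
    private final Path blobDir;
    // Assumption: stand-in for the embedded database (key -> file location).
    private final Map<String, String> index = new ConcurrentHashMap<>();

    FileBackedBlobStore(Path blobDir) throws IOException {
        this.blobDir = Files.createDirectories(blobDir);
    }

    /** Streams the payload to a new file, so the object never has to fit in memory. */
    void put(String key, InputStream data) throws IOException {
        Path target = blobDir.resolve(UUID.randomUUID() + ".blob");
        Files.copy(data, target);
        String previous = index.put(key, target.toString());
        if (previous != null) {
            // Remove the file that the replaced reference pointed to.
            Files.deleteIfExists(Paths.get(previous));
        }
    }

    /** Opens a stream over the stored payload; the caller must close it. */
    InputStream get(String key) throws IOException {
        String location = index.get(key);
        if (location == null) {
            throw new NoSuchElementException("No blob stored under key: " + key);
        }
        return Files.newInputStream(Paths.get(location));
    }

    /** Returns the stored file location, i.e. the value you would keep in the database. */
    String locationOf(String key) {
        return index.get(key);
    }

    public static void main(String[] args) throws IOException {
        FileBackedBlobStore store = new FileBackedBlobStore(Files.createTempDirectory("blobs"));
        store.put("demo", new ByteArrayInputStream("payload".getBytes(StandardCharsets.UTF_8)));
        System.out.println("Stored at: " + store.locationOf("demo"));
    }
}
```

Because `put` and `get` work on streams, the size of a single object is bounded by the file system rather than by the database's document limit. The file write and the index update are not atomic here, which is something you would have to handle yourself.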

(Opinions below are all pretty superficial, by the way, as I haven't used them in earnest yet).


OrientDB looks like the most mature of the three I found. It appears to be a document and/or graph database and claims to be very fast and light (it uses an "RB+Tree" data structure, a combination of B+ and red-black trees), with no external dependencies. There seems to be an active community developing it, with lots of commits over the last few days, for example. It's also compliant with the TinkerPop graph database standard, which adds another layer of features (such as the Gremlin graph querying language). It's ACID compliant, has REST and other external APIs and even a web-based management app (which presumably could be deployed with your embedded DB, but I'm not sure).

The next two fall more into the simple key-value store camp of the N(ot)O(nly)SQL world.

JDBM3 is an extremely minimal data store: it has a hash map, tree map, tree set and linked list, which are written to disk through memory-mapped files. It claims to be very light and fast, is fully transactional and is being actively developed.

HawtDB looks similarly simple and fast - a BTree or hash-based index persisted to disk with memory-mapped files. It's (optionally) fully transactional. There has been no commit in the past seven months (to the end of March 2012) and there's not much activity on the mailing list. That's not to say it's not a good library, but it's worth mentioning.

JDBM3 and HawtDB are pretty minimal, so you're not going to get any fancy GUIs. But I think they both look very attractive for their speed and simplicity.
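For anyone unfamiliar with memory-mapped files, the following is a small sketch of the underlying JDK mechanism (`FileChannel.map`). It is not how JDBM3 or HawtDB are implemented, just an illustration of the technique; the class `MappedCounter` is a made-up name.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: a persistent counter kept in the first 8 bytes of a memory-mapped file. */
final class MappedCounter {

    /** Increments the long stored at offset 0 of the file and returns the new value. */
    static long increment(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            if (channel.size() < Long.BYTES) {
                // Make sure the mapped region exists and starts out as zeros.
                channel.write(ByteBuffer.allocate(Long.BYTES), 0);
            }
            MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, Long.BYTES);
            long next = buffer.getLong(0) + 1;
            buffer.putLong(0, next); // Writes go to the mapped memory...
            buffer.force();          // ...and force() flushes them to the file.
            return next;
        }
    }
}
```

Note that a single `MappedByteBuffer` cannot be larger than `Integer.MAX_VALUE` bytes (about 2 GB), so a 10 GB object would have to be spread over several mappings. That is one more reason to test very large objects carefully with any library built on this mechanism.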


Those are all I've found matching your requirements. In addition, Neo4J is great - a graph database which is now pretty mature and works very well in embedded mode. It's GPL/AGPL licensed, though, so it may require a paid license unless you can open source your code too: http://neotechnology.com/products/price-list/

Of course, you could also use the H2 SQL database with one big table and no indices!