Opinions on my data storage problem (database/homebrew solution)

153 views Asked by At

I have very simply structured data which is currently stored in a home-brew file format, but I am wondering whether we should migrate to something more modern. The data is simply a table of doubles, indexed by a double column. The things I need to perform are:

  • Iterating through the table.
  • Insertion and deletion of arbitrary records.
  • Selecting a given number of rows before and after a given key value (where the key might not be in the database).

The requirements are:

  • The storage must be file-based without a server.
  • It should not be necessary to read the whole file into memory.
  • The resulting file should be portable between different architectures (wrt endian-ness...)
  • Must be a very stable project (the data is highly critical).
  • Must run on Solaris/SPARC and preferably also on Linux/x64.
  • Access times should be as fast as possible.
  • Must be available as a C++ library. Bonus points for Fortran and Python bindings :)
  • Optional higher precision number representation than double precision would be a bonus.
  • Relatively compact storage size would also be a bonus.

From my limited experience, sqlite would be an interesting choice, or perhaps mysql in a non-server mode if sqlite is not fast enough. But perhaps a full-fledged SQL database is overkill?

What do you suggest?

1

There are 1 answers

1
Thomas On BEST ANSWER

SQLite meets nearly all of your requirements, and it's not that hard to use. Give it a try!

  • It's file-based, and the entire database is a single file.

  • It does not need to read the entire file into memory. Database size might be limited; you should check here if the limits will be a problem in your situation.

  • The format is cross-platform:

    SQLite databases are portable across 32-bit and 64-bit machines and between big-endian and little-endian architectures.

  • It's been around for a long time and is used in many places, and is generally considered mature and stable.

  • It's very portable and runs on Solaris/SPARC and Linux/x64.

  • It's faster than MySQL (grains of salt present behind that link, though) or other such database servers, because only one client needs to be taken into account.

  • There is a C++ API and a Python binding and a Fortran wrapper.

  • There is no arbitrary-precision column type, but NUMERIC will be silently converted to text if it cannot be exactly represented:

    For conversions between TEXT and REAL storage classes, SQLite considers the conversion to be lossless and reversible if the first 15 significant decimal digits of the number are preserved. If the lossless conversion of TEXT to INTEGER or REAL is not possible then the value is stored using the TEXT storage class.

  • Compact storage of the database, I'm not sure of. But I've never heard any claims that SQLite would be particularly wasteful.