Limit of pandas HDFStore


I am planning to use pandas HDFStore as a temporary file for out-of-core CSV operations.

(CSV --> HDFStore --> out-of-core operations in pandas).
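A minimal sketch of that pipeline, assuming PyTables is installed (pandas needs it for HDFStore). The function names `csv_to_hdf` and `grouped_sum`, the key `"data"`, and the `string_sizes` parameter are hypothetical names chosen for illustration. A chunked group-by sum stands in for the real out-of-core operation.

```python
import pandas as pd


def csv_to_hdf(csv_path, h5_path, key="data", chunksize=100_000, string_sizes=None):
    """Stream a CSV into an appendable HDF5 table, one chunk at a time.

    string_sizes (hypothetical name) maps string column names to a fixed
    width in bytes. It is passed to min_itemsize so that a later chunk with
    longer strings than the first chunk can still be appended.
    """
    with pd.HDFStore(h5_path, mode="w") as store:
        for chunk in pd.read_csv(csv_path, chunksize=chunksize):
            store.append(
                key,
                chunk,
                format="table",       # the appendable, queryable layout
                data_columns=True,    # store each column individually
                min_itemsize=string_sizes,
            )


def grouped_sum(h5_path, key, by, value, chunksize=100_000):
    """Sum `value` per `by` group without loading the whole table."""
    partials = []
    with pd.HDFStore(h5_path, mode="r") as store:
        # select(..., chunksize=...) yields DataFrames of at most chunksize rows
        for chunk in store.select(key, columns=[by, value], chunksize=chunksize):
            partials.append(chunk.groupby(by)[value].sum())
    # Combine the per-chunk partial sums into one result per group
    return pd.concat(partials).groupby(level=0).sum()
```

For example, `csv_to_hdf("input.csv", "tmp.h5", string_sizes={"city": 32})` followed by `grouped_sum("tmp.h5", "data", "city", "amount")`, where the file and column names are placeholders. This pattern only works directly for operations that can be merged from per-chunk partial results, such as sums and counts. A full pivot table needs the set of distinct column values to stay small enough to fit in memory.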

I am wondering about:

  • The practical size limit of an HDF5 file in real-life use on a single machine (not the theoretical limit).

  • The cost of pivot-table operations (100 columns: fixed-width VARCHAR and numerical).

  • Whether I would need to switch to Postgres (load the CSV into Postgres) and do the work in the database instead.

I searched Google for benchmarks of HDF5 file size versus computation time, but could not find any.

The total size of the CSV data is around 500 GB to 1 TB (uncompressed).
