I am struggling with the following points:
- When should bcolz be used instead of Keras' data generator? It looks like the Keras model has APIs that accept an array with a batch size, as well as APIs that take a data generator.
- Is there a performance improvement when using bcolz with the fit() API over using a data generator with fit_generator()?
Finally, there's a fastai post that mentions Dask:
- Is Dask better than bcolz?
Thanks!
Is Dask better than bcolz?
Dask isn't strictly an alternative to bcolz; Dask can work with bcolz arrays. For tasks with huge datasets, it can provide a speedup because it has strong support for parallelism. Bcolz is a nice compressed data container, and I'd suggest using Dask on top of bcolz if you need that speedup.
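
Here is a minimal sketch of what "Dask on top of bcolz" could look like. It assumes the data fits the usual array-like interface (shape, dtype, NumPy-style slicing) that dask.array.from_array expects. The helper name to_dask, the chunk size, and the sample data are made up for illustration, and the sketch falls back to a plain NumPy array if bcolz isn't installed.

```python
import numpy as np
import dask.array as da

try:
    import bcolz  # optional: compressed, chunked container
except ImportError:
    bcolz = None  # fall back to a plain NumPy array below


def to_dask(source, chunk_rows=1000):
    """Wrap an array-like (bcolz carray or NumPy array) as a lazy Dask array.

    `to_dask` and `chunk_rows` are illustrative names, not part of either library.
    """
    # Split only along the first axis; keep every other axis whole.
    chunks = (chunk_rows,) + tuple(source.shape[1:])
    # name=False skips hashing the source, which can be costly for large containers.
    return da.from_array(source, chunks=chunks, name=False)


# Hypothetical sample data standing in for a large dataset.
data = np.arange(20000, dtype="float64").reshape(5000, 4)
source = bcolz.carray(data) if bcolz is not None else data

x = to_dask(source)

# Nothing is computed until .compute(); the chunks can then be processed in parallel.
col_means = x.mean(axis=0).compute()
```

The idea is that bcolz only stores the data (compressed, in memory or on disk), while Dask schedules the work chunk by chunk, so the two complement each other rather than compete.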