I have a large list of arrays (data type float32, with a few instances of int) in Python that is to be serialized via UTF-8 encoding and saved on a server. However I'm having issues with the size of the saved file exceeding storage limits.
The available serialization formats that the server can handle is: string, bytes, JSON, and XML. Which of these data formats would be best for saving the data structure?
I think you should also explore using h5 files. I'd just honestly convert list of lists into a numpy array, and then save the numpy array to an h5 file.
You can save like :->
You can load like:
H5 files are optimized for storing matrix-like data.
If you need some helper scripts, try out some I made here(shameless plug): https://github.com/ss4328/h5_manager_scripts
If h5 files are not. your dig, JSON is the best implementation. Tons of free libraries to generate/use the data. Pretty sleek on the web, and not to mention they're human readable too. XML is a bit outdated.