I have Apache Arrow data on the server (Python) and need to use it in the browser. It appears that Arrow Flight isn't implemented in JS. What are the best options for sending the data to the browser and using it there?
I don't even need it necessarily in Arrow format in the browser. This question hasn't received any responses, so I'm adding some additional criteria for what I'm looking for:
- Self-describing: don't want to maintain separate schema definitions
- Minimal overhead: For example, an array of float32s should transfer as something compact like a data type indicator, length value and sequence of 4-byte float values
- Cross-platform: Able to be easily sent from Python and received and used in the browser in a straightforward way
Surely this is a solved problem? If it is I've been unable to find a solution. Please help!
Building off of the comments on your original post by David Li, you can implement a non-streaming version what you want without too much code using PyArrow on the server side and the Apache Arrow JS bindings on the client. The Arrow IPC format satisfies your requirements because it ships the schema with the data, is space-efficient and zero-copy, and is cross-platform.
Here's a toy example showing generating a record batch on server and receiving it on the client:
Server:
Client
Edit: I added a runnable example at https://github.com/amoeba/arrow-python-js-ipc-example.