I am trying to do something similar to this question here, but instead of the polars library I would like to use the DataFusion library.
The idea is to go from a Vec of structs like this:
#[derive(Serialize)]
struct Test {
    id: u32,
    amount: u32,
}
and save to Parquet files, just like in the question I referenced.
With polars this was possible, as the accepted answer shows, by serialising the structs to JSON and then building the DataFrame from that. I could not find a similar approach using DataFusion.
Any suggestions will be appreciated.
I think the parquet_derive crate is designed exactly for the use case of writing Rust structs to and reading them from Parquet files. DataFusion would be useful if you wanted to process the resulting data, for example by filtering or aggregating it with SQL.
Here is an example in the docs: https://docs.rs/parquet_derive/30.0.1/parquet_derive/derive.ParquetRecordWriter.html