How to read Arrow IPC data in Rust


I'm trying to translate working Python code into Rust. The code receives a JSON payload (the event object in the following snippet) with the following entries:

  • input_schema: a string containing the base64-encoded input schema in Apache Arrow format.
  • output_schema: a string containing the base64-encoded output schema in Apache Arrow format.
  • input_records: a string containing the base64-encoded data (record batches).

The working Python snippet is the following:

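import base64

import pyarrow as pa

def extract_record_batches(cls, event):
    # Decode both schemas from their base64-encoded serialized form.
    input_schema = pa.ipc.read_schema(
        pa.BufferReader(base64.b64decode(event["input_schema"]))
    )
    output_schema = pa.ipc.read_schema(
        pa.BufferReader(base64.b64decode(event["output_schema"]))
    )
    # Decode the record batch, interpreting it with the input schema.
    record_batch = pa.ipc.read_record_batch(
        pa.BufferReader(
            base64.b64decode(event["input_records"])
        ),
        input_schema,
    )
    # Convert the batch to a list of Python dicts, one per row.
    record_batch_list = record_batch.to_pylist()
    return record_batch_list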

How can I convert this snippet to Rust? I'm already stuck at the beginning of the process, on extracting the schema:

pub async fn extract_payload(&self) -> anyhow::Result<CustomerPayload> {
    let input_schema_bytes = base64::engine::general_purpose::STANDARD.decode(&self.input_schema)?;
    println!("input schema bytes: {:?}", input_schema_bytes);

    let (input_schema, input_ipc_schema) = arrow2::io::ipc::read::deserialize_schema(&input_schema_bytes[..])?;
    println!("input schema: {:?}", input_schema);
    println!("input ipc schema: {:?}", input_ipc_schema);

    ...

That results in the following error when deserialize_schema runs:

In <Message@84>::header(): Invalid vtable length (length = 8)
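
One possible cause, stated as an assumption and not a confirmed diagnosis: the Arrow IPC format wraps each message in a framing prefix (a 0xFFFFFFFF continuation marker followed by a little-endian 32-bit metadata length, or only the length in the legacy pre-0.15 format). If deserialize_schema expects only the flatbuffer metadata and the Python side produced a framed message, the prefix would have to be skipped first. The sketch below uses only the standard library; strip_ipc_framing is a hypothetical helper, not an arrow2 function, and it has not been checked against arrow2 itself.

```rust
/// Continuation marker that starts an encapsulated Arrow IPC message
/// (Arrow columnar format 0.15 and later).
const CONTINUATION_MARKER: [u8; 4] = [0xFF; 4];

/// Hypothetical helper, not part of arrow2 or any other crate.
/// Returns the flatbuffer metadata of one encapsulated IPC message,
/// dropping the framing prefix and anything after the metadata.
fn strip_ipc_framing(bytes: &[u8]) -> Result<&[u8], String> {
    // Skip the 0xFFFFFFFF marker if present; legacy messages omit it.
    let rest = if bytes.starts_with(&CONTINUATION_MARKER) {
        &bytes[4..]
    } else {
        bytes
    };
    if rest.len() < 4 {
        return Err("buffer too short for the metadata length prefix".to_string());
    }
    // The metadata length is a little-endian 32-bit signed integer.
    let len = i32::from_le_bytes([rest[0], rest[1], rest[2], rest[3]]);
    if len < 0 {
        return Err(format!("negative metadata length: {}", len));
    }
    let end = 4 + len as usize;
    rest.get(4..end)
        .ok_or_else(|| format!("metadata length {} exceeds the buffer", len))
}
```

If the assumption holds, the slice returned for the decoded input_schema bytes would be passed to deserialize_schema in place of the full buffer.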