How to display an UTF8 column content using Apache Arrow Rust Crate?

Question

How to display an UTF8 column content using Apache Arrow Rust Crate?

449 views Asked by Jeremy Cochoy At 07 January 2025 at 19:42

I managed to load a parquet file based on example & documentation of rust's apache::arrow implementation.

    use parquet::arrow::{ParquetFileArrowReader, ArrowReader};
    use std::rc::Rc;
    use arrow::record_batch::RecordBatchReader;

    let file = File::open(&Path::new("./path_to/file.parquet")).unwrap();
    let file_reader = SerializedFileReader::new(file).unwrap();
    let mut arrow_reader = ParquetFileArrowReader::new(Rc::new(file_reader));

    println!("Converted arrow schema is: {}", arrow_reader.get_schema().unwrap());

    let mut record_batch_reader = arrow_reader.get_record_reader(2048).unwrap();

I was able to display the name and type of columns of each batch:

    loop {
       let record_batch = record_batch_reader.next_batch().unwrap().unwrap();
       if record_batch.num_rows() > 0 {
           println!("Schema: {}.", record_batch.schema());
       }
    }

but I am quite confused on how to display the content of the columns. How can I retrieve the content of the first column and print it?

Original Q&A

There are 1 answers

**Jeremy Cochoy** · Answer 1 · 2020-10-12T10:14:43+00:00

The last version of apache arrow seams to have a prettifyer class. Unfortunately this is not in the last available package (1.0.1).

use arrow::util::pretty;
pretty::print_batches(&batch);

The manual way to do it is through downcasting.

// For an int:
let col = batch.column(0).as_any().downcast_ref::<arrow::array::Int32Array>();

// For a Utf8 string:
let col = batch.column(0).as_any().downcast_ref::<arrow::array::StringArray>();

Then you can simply print it:

println!("Columns: {:?}.", col);

TechQA.

How to display an UTF8 column content using Apache Arrow Rust Crate?

There are 1 answers

Related Questions in RUST

Related Questions in APACHE-ARROW

Popular Questions

Popular Tags

Trending Questions