How to convert a DataFrame with strings into an ndarray in Rust

I'm having trouble converting a Polars DataFrame with string values into an ndarray in Rust without using one-hot encoding.

The code I used is the following:

println!("{:?}", _df.to_ndarray::<Float64Type>(Default::default()).unwrap());

Is there a solution for this?

There is 1 answer

Freeman (best answer)

I think you can iterate over each column in the DataFrame and convert it to a numeric representation, so that the resulting DataFrame, df_numeric, holds numeric values instead of strings. Then use the to_ndarray method to convert that DataFrame to an ndarray. Note that to_ndarray returns a plain array of the requested numeric type rather than Option values: the columns must be numeric and free of nulls, except that for floating-point types nulls are converted to NaN.

use polars::prelude::*;
use ndarray::prelude::*;

fn main() {
    // Build a Polars DataFrame with string values
    let df = DataFrame::new(vec![
        Series::new("col1", &["a", "b", "c"]),
        Series::new("col2", &["x", "y", "z"]),
    ])
    .unwrap();

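    // Convert each string column to a numeric one: here every value is
    // encoded as the code point of its first character (an arbitrary,
    // illustrative encoding). Nulls and empty strings would become None.
    // Note: `utf8()` is named `str()` in newer Polars releases.
    let columns: Vec<Series> = df
        .get_columns()
        .iter()
        .map(|s| {
            let codes: Vec<Option<u32>> = s
                .utf8()
                .unwrap()
                .into_iter()
                .map(|v| v.and_then(|t| t.chars().next()).map(|c| c as u32))
                .collect();
            Series::new(s.name(), codes)
        })
        .collect();
    let df_numeric = DataFrame::new(columns).unwrap();

    // Convert the DataFrame to an ndarray (requires Polars' `ndarray` feature).
    // The result holds plain u32 values; integer columns must not contain nulls.
    let ndarray: Array2<u32> = df_numeric
        .to_ndarray::<UInt32Type>(Default::default())
        .unwrap();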

    println!("{:?}", ndarray);
}
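
Encoding a string by its first character makes values such as "apple" and "avocado" collide. A common alternative that still avoids one-hot encoding is label encoding, where every distinct string gets its own integer code. The following is only a minimal sketch using the standard library, assuming each column has already been pulled out of the DataFrame as a slice of &str; the helper names label_encode and to_row_major are hypothetical and are not part of Polars or ndarray.

```rust
use std::collections::HashMap;

/// Label-encode one string column: each distinct value receives the next
/// integer code, in order of first appearance.
/// Returns the codes and the lookup table (index = code, value = string).
/// NOTE: hypothetical helper, not a Polars or ndarray API.
fn label_encode(column: &[&str]) -> (Vec<u32>, Vec<String>) {
    let mut seen: HashMap<&str, u32> = HashMap::new();
    let mut labels: Vec<String> = Vec::new();
    let mut codes: Vec<u32> = Vec::with_capacity(column.len());

    for &value in column {
        // The code a new value would get is the number of labels so far.
        let next_code = labels.len() as u32;
        let code = *seen.entry(value).or_insert(next_code);
        if code == next_code {
            // First time this value appears: remember its label.
            labels.push(value.to_string());
        }
        codes.push(code);
    }
    (codes, labels)
}

/// Interleave equally long encoded columns into one row-major buffer.
/// Returns (n_rows, n_cols, data).
/// NOTE: hypothetical helper, not a Polars or ndarray API.
fn to_row_major(columns: &[Vec<u32>]) -> (usize, usize, Vec<u32>) {
    let n_cols = columns.len();
    let n_rows = columns.first().map_or(0, |c| c.len());
    assert!(
        columns.iter().all(|c| c.len() == n_rows),
        "all columns must have the same length"
    );

    let mut data = Vec::with_capacity(n_rows * n_cols);
    for row in 0..n_rows {
        for col in columns {
            data.push(col[row]);
        }
    }
    (n_rows, n_cols, data)
}
```

The (n_rows, n_cols, data) triple is laid out the way ndarray's Array2::from_shape_vec((n_rows, n_cols), data) expects for its default row-major order, so the encoded columns can be turned into an Array2<u32> without going back through Polars. Keeping the labels vector lets you map codes back to the original strings later.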