Combine Two Polars Dataframe in a complex way

Question

Combine Two Polars Dataframe in a complex way

102 views Asked by WannaBeAProgrammer At 03 November 2023 at 17:54

I have a dataframe with transactions and another dataframe with employees who are assigned to those transactions. Each transaction can have 0...N assignments. I would like to have one dataframe which consists of transaction ids and all of the employee assignments in separate columns. Please see example below:

I have a one dataframe as follows:

TransactionIds	Other_Columns..
T1	Cell 2...These don't matter
T2	Cell 4...These don't matter
T3	Cell 4...These don't matter

I have another dataframe as follows:

TransactionIds	Assignments
T1	Assignment1
T1	Assignment2
T1	Assignment3
T2	Assignment3
T2	Assignment4
T3	Assignment6
T4	NULL

I would like to have a dataframe which looks like as follows:

TransactionIds	Assignment1	Assignment2	Assignment3	AssignmentN
T1	Assignment1	Assignment2	Assignment3	NULL
T2	Assignment3	Assignment4	NULL	NULL
T3	NULL	NULL	NULL	NULL

I tried group by and then agg() function. However, it gives me a list which I don't know how to convert to columns. Another problem with this approach is I wouldn't know how many columns to convert this list into. I would like to dynamically determine/create the number of assignment columns from the SELECT.

Original Q&A

There are 1 answers

**WannaBeAProgrammer** · Accepted Answer · 2023-11-03T23:07:51+00:00

I was able to resolve this by doing the following:

    let stns: Vec<String> = (1..6).map(|i| format!("Station {i}")).collect();
    let weather = df!(
            "station"=> &stns,
            "temperatures"=> &[
                "20 5 5 E1 7 13 19 9 6 20",
                "18 8 16 11 23 E2 8 E2 E2 E2 90 70 40",
                "19 24 E9 16 6 12 10 22",
                "E2 E0 15 7 8 10 E1 24 17 13 6",
                "14 8 E0 16 22 24 E1",
            ],
    )
    .unwrap();
    println!("{}", &weather);

    let out = weather
        .clone()
        .lazy()
        .with_columns([col("temperatures").str().split(lit(" "))])
        .with_column(col("temperatures").list().lengths().alias("counts"))
        .with_column(col("counts").max())
        .collect()
        .unwrap();

    println!("{}", &out);

    let max_is = out.column("counts").unwrap().u32().unwrap().max();

    println!("max_is: {:?}", max_is);

    let mut selects = vec![];

    for i in 0..max_is.unwrap() {
        selects.push(col("temperatures").list().get(lit(i)).alias(format!("Assignment_{}", i + 1).as_str()));
    }

    println!("Selects: {:#?}", selects);

    let out12 = out.lazy().with_columns(&selects).drop_columns(["foo"]).collect().unwrap();
    println!("{}", &out12);

TechQA.

Combine Two Polars Dataframe in a complex way

There are 1 answers

Related Questions in RUST

Related Questions in RUST-POLARS

Popular Questions

Popular Tags

Trending Questions