Select column based on the value of another column Polars Python

Question

171 views Asked by Mustafa Elec At 17 October 2023 at 17:46

I have a DataFrame with ten columns, plus another column whose values are partial names of those ten columns. Here is a similar sample:

import polars as pl
df = pl.DataFrame({
    "ID"     :["A"  ,"B"  ,"C"  ] ,
    "A Left" :["W1" ,"W2" ,"W3" ] , 
    "A Right":["P1" ,"P2" ,"P3" ] , 
    "B Left" :["G1" ,"G2" ,"G3" ] , 
    "B Right":["Y1" ,"Y2" ,"Y3" ] , 
    "C Left" :["M1" ,"M2" ,"M3" ] , 
    "C Right":["K1" ,"K2" ,"K3" ] , 
    })
df
shape: (3, 7)
┌─────┬────────┬─────────┬────────┬─────────┬────────┬─────────┐
│ ID  ┆ A Left ┆ A Right ┆ B Left ┆ B Right ┆ C Left ┆ C Right │
│ --- ┆ ---    ┆ ---     ┆ ---    ┆ ---     ┆ ---    ┆ ---     │
│ str ┆ str    ┆ str     ┆ str    ┆ str     ┆ str    ┆ str     │
╞═════╪════════╪═════════╪════════╪═════════╪════════╪═════════╡
│ A   ┆ W1     ┆ P1      ┆ G1     ┆ Y1      ┆ M1     ┆ K1      │
│ B   ┆ W2     ┆ P2      ┆ G2     ┆ Y2      ┆ M2     ┆ K2      │
│ C   ┆ W3     ┆ P3      ┆ G3     ┆ Y3      ┆ M3     ┆ K3      │
└─────┴────────┴─────────┴────────┴─────────┴────────┴─────────┘

I want to add a column whose value is taken from the other columns, chosen by the ID column, like this:

shape: (3, 8)
┌─────┬────────┬─────────┬────────┬─────────┬────────┬─────────┬───────┐
│ ID  ┆ A Left ┆ A Right ┆ B Left ┆ B Right ┆ C Left ┆ C Right ┆ value │
│ --- ┆ ---    ┆ ---     ┆ ---    ┆ ---     ┆ ---    ┆ ---     ┆ ---   │
│ str ┆ str    ┆ str     ┆ str    ┆ str     ┆ str    ┆ str     ┆ str   │
╞═════╪════════╪═════════╪════════╪═════════╪════════╪═════════╪═══════╡
│ A   ┆ W1     ┆ P1      ┆ G1     ┆ Y1      ┆ M1     ┆ K1      ┆ W1-P1 │
│ B   ┆ W2     ┆ P2      ┆ G2     ┆ Y2      ┆ M2     ┆ K2      ┆ G2-Y2 │
│ C   ┆ W3     ┆ P3      ┆ G3     ┆ Y3      ┆ M3     ┆ K3      ┆ M3-K3 │
└─────┴────────┴─────────┴────────┴─────────┴────────┴─────────┴───────┘

I got this result using melt:

df.join( df.melt(id_vars='ID').with_columns(
            pl.when(pl.col("ID") == pl.col("variable").str.slice(0,1)).then(pl.col("value"))
        ).select(["ID" , "value"]).drop_nulls().group_by("ID").agg(pl.col('value').str.concat()) 
        ,on='ID').sort("ID")

However, I need to avoid melt because I have two groups of ten columns besides 50 other columns.

I have tried using pl.col() and polars.selectors, but I couldn't get the result.

import polars.selectors as cs
df.with_columns(
    cs.by_name(
        ( pl.concat_str([pl.col('ID') , " Left"] ) )
        ).alias("value")
)
TypeError: ColumnFactory.__new__() missing 1 required positional argument: 'name'

Is there a suggested solution?

Thanks in advance.

There is 1 answer

**jqurious** · Accepted Answer · 2023-10-17T18:12:10+00:00

It looks like you want to extract the "base" of the Left/Right columns.

There are various ways you could do that:

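# Column names ending in " Left" or " Right" (a "^...$" string is treated as a regex)
columns = pl.Series(df.select(r"^.+ (Left|Right)$").columns)
# Keep the leading non-space token of each name; raw string avoids the invalid "\S" escape
columns = columns.str.extract(r"(\S+)").unique()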

shape: (3,)
Series: '' [str]
[
    "A"
    "B"
    "C"
]

You could then use pl.coalesce() to create a single column of the chosen when/then values:

df.with_columns(value = 
   pl.coalesce(
      pl.when(pl.col("ID") == col)
        .then(pl.col(f"{col} Left") + "-" + pl.col(f"{col} Right"))
      for col in columns
   )
)

shape: (3, 8)
┌─────┬────────┬─────────┬────────┬─────────┬────────┬─────────┬───────┐
│ ID  ┆ A Left ┆ A Right ┆ B Left ┆ B Right ┆ C Left ┆ C Right ┆ value │
│ --- ┆ ---    ┆ ---     ┆ ---    ┆ ---     ┆ ---    ┆ ---     ┆ ---   │
│ str ┆ str    ┆ str     ┆ str    ┆ str     ┆ str    ┆ str     ┆ str   │
╞═════╪════════╪═════════╪════════╪═════════╪════════╪═════════╪═══════╡
│ A   ┆ W1     ┆ P1      ┆ G1     ┆ Y1      ┆ M1     ┆ K1      ┆ W1-P1 │
│ B   ┆ W2     ┆ P2      ┆ G2     ┆ Y2      ┆ M2     ┆ K2      ┆ G2-Y2 │
│ C   ┆ W3     ┆ P3      ┆ G3     ┆ Y3      ┆ M3     ┆ K3      ┆ M3-K3 │
└─────┴────────┴─────────┴────────┴─────────┴────────┴─────────┴───────┘
