How to update and concat at the same time in polars?

Question

Asked by Alena Volkova on 12 October 2023 at 14:24

I have two polars dataframes of the same length:

import polars as pl

df1 = pl.DataFrame({'a': [1, 2, 3], 'b': [4, None, None]})
df2 = pl.DataFrame({'b': [None, 5, None], 'c': [6, 7, 8]})

df1
┌─────┬──────┐
│ a   ┆ b    │
╞═════╪══════╡
│ 1   ┆ 4    │
│ 2   ┆ null │
│ 3   ┆ null │
└─────┴──────┘

df2
┌──────┬─────┐
│ b    ┆ c   │
╞══════╪═════╡
│ null ┆ 6   │
│ 5    ┆ 7   │
│ null ┆ 8   │
└──────┴─────┘

I want to add df2 to df1 in a way that the columns that already exist in df1 get updated with values from df2, and the columns that are only in df2 get added to df1:

┌─────┬──────┬─────┐
│ a   ┆ b    ┆ c   │
╞═════╪══════╪═════╡
│ 1   ┆ 4    ┆ 6   │
│ 2   ┆ 5    ┆ 7   │
│ 3   ┆ null ┆ 8   │
└─────┴──────┴─────┘

The best I got is:

df1.update(df2).hstack(df2.select([c for c in df2.columns if c not in df1.columns]))

Is there a better way?

There is 1 answer

**Dean MacGregor** · Answer 1 · 2023-10-12T15:25:15+00:00

If you need to customize how the update works, you can perform it manually, which lets you tweak individual steps:

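# Columns present in both frames; these get updated rather than appended.
overlaps = set(df1.columns).intersection(df2.columns)
# Keep df1's column order, then append the columns that exist only in df2.
# (Selecting from a set would leave the column order undefined.)
final_cols = df1.columns + [c for c in df2.columns if c not in df1.columns]
(
    df1
    # Attach df2 as a struct column, renaming overlapping columns to avoid clashes.
    .with_columns(
        df2.rename({x: f"{x}_update" for x in overlaps}).to_struct('df2')
    )
    .unnest('df2')
    # Prefer df2's value where it is non-null; otherwise keep df1's value.
    .with_columns(**{x: pl.coalesce(f"{x}_update", x) for x in overlaps})
    .select(final_cols)
)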

If you don't need to tweak the update, just use the built-in update, as you already do in your question.
