How do I combine the values of a Pandas column from multiple dataframes into one column in one dataframe?

56 views Asked by At

The dataframes can be created using the following code:

import pandas as pd

s1 = pd.DataFrame({'item':['apple','apple','apple']}, 
                  index=['1/2/2024','1/5/2024','1/6/2024'])
s2 = pd.DataFrame({'item':['orange','orange','orange']}, 
                  index=['1/3/2024','1/5/2024','1/7/2024'])
s3 = pd.DataFrame({'item':['banana','banana','banana','banana']}, 
                  index=['1/4/2024','1/5/2024','1/6/2024','1/7/2024'])
print(s1)
print(s2)
print(s3)

The print output looks like this:

           item
1/2/2024  apple
1/5/2024  apple
1/6/2024  apple
            item
1/3/2024  orange
1/5/2024  orange
1/7/2024  orange
            item
1/4/2024  banana
1/5/2024  banana
1/6/2024  banana
1/7/2024  banana

I want to combine the item values (comma separated) from each dataframe based on the date index.

I would like the resulting dataframe to look like this:

                         item
1/2/2024                apple
1/3/2024               orange
1/4/2024               banana
1/5/2024  apple,orange,banana
1/6/2024         apple,banana
1/7/2024        orange,banana

It seems like a simple problem but sorry I am a newbie doing python part-time and I just can't figure it out. Thank you so much for any help that you can provide.

3

There are 3 answers

0
e-motta On

You can concatenate all three dataframes, group by the index, then join the rows in each group:

df = pd.concat([s1, s2, s3]).groupby(level=0).agg(lambda x: ",".join(x))
                         item
1/2/2024                apple
1/3/2024               orange
1/4/2024               banana
1/5/2024  apple,orange,banana
1/6/2024         apple,banana
1/7/2024        orange,banana
0
mozway On

concat and groupby.agg grouping on the index:

out = pd.concat([s1, s2, s3]).groupby(level=0).agg(','.join)

Output:

                         item
1/2/2024                apple
1/3/2024               orange
1/4/2024               banana
1/5/2024  apple,orange,banana
1/6/2024         apple,banana
1/7/2024        orange,banana
0
Kanak On

Concatenate DataFrames along the index:

result = pd.concat([s1, s2, s3], axis=0)

Group by index and aggregate the items into a comma-separated list:

result = result.groupby(result.index)['item'].agg(','.join)

Create a new DataFrame with the aggregated items:

result_df = pd.DataFrame({'item': result})