Row_Number | ColumnA | ColumnB |
---|---|---|
1 | Data 1 | A |
2 | Data A | |
3 | Data B | |
4 | Data 2 | B |
5 | Data C | |
6 | Data D | |
7 | Data E | |
8 | Data 3 | C |
this is how my data updated on a dataframe from another table, In this "Data 1" having a continuation as "Data A" and "Data B" and having only one data as on column B as "A", but in data frame all the "Data 1", "Data A", "Data B" are created in separate rows. Is there any operations are available to concatenate all the "Row_number" 1,2,3 to be appended on a single line and having only value as "A" in columnB?
Expected output is below:
Row_Number | ColumnA | ColumnB |
---|---|---|
1 | Data1,Data A, Data B | A |
2 | Data2,Data C, Data D,Data E | B |
3 | Data3 | C |
Thanks in Advance
I'm new to python and have tried the below code
# variable that stores the values: table_values
import pandas as pd
df=pd.DataFrame(table_values,columns=['ColumnA','ColumnB'])
for index, row in df.iterrows():
if df.loc[index,'ColumnB'] == '' & df.loc[index,'ColumnA'] != '':
df.loc[index-1, 'ColumnA'] = df.loc[index-1, 'ColumnA'] + df.loc[index, 'ColumnA']
print(df)'''
You could do a
groupby
here. The simplest is to just useColumn B
, since that already corresponds to the rows you want to group. So given this data:Create something to group with, by filling the missing values:
Then aggregate to concat strings, following this post:
Final result: