Loop through rows in pandas dataframe and update check on the column condition

Question

Loop through rows in pandas dataframe and update check on the column condition

266 views Asked by Shinilrk At 15 April 2021 at 20:38

Row_Number	ColumnA	ColumnB
1	Data 1	A
2	Data A
3	Data B
4	Data 2	B
5	Data C
6	Data D
7	Data E
8	Data 3	C

this is how my data updated on a dataframe from another table, In this "Data 1" having a continuation as "Data A" and "Data B" and having only one data as on column B as "A", but in data frame all the "Data 1", "Data A", "Data B" are created in separate rows. Is there any operations are available to concatenate all the "Row_number" 1,2,3 to be appended on a single line and having only value as "A" in columnB?

Expected output is below:

Row_Number	ColumnA	ColumnB
1	Data1,Data A, Data B	A
2	Data2,Data C, Data D,Data E	B
3	Data3	C

Thanks in Advance

I'm new to python and have tried the below code

# variable that stores the values: table_values
import pandas as pd

df=pd.DataFrame(table_values,columns=['ColumnA','ColumnB'])

for index, row in df.iterrows():
  if df.loc[index,'ColumnB'] == '' & df.loc[index,'ColumnA'] != '':
    df.loc[index-1, 'ColumnA'] = df.loc[index-1, 'ColumnA'] + df.loc[index, 'ColumnA']  
print(df)'''

Original Q&A

There are 1 answers

**Tom** · Answer 1 · 2021-04-15T20:49:44+00:00

You could do a groupby here. The simplest is to just use Column B, since that already corresponds to the rows you want to group. So given this data:

import pandas as pd

df = pd.DataFrame({'ColumnA': ['Data1', 'DataA', 'DataB',
                               'Data2', 'DataC', 'DataD', 'DataE',
                               'Data3'],
                   'ColumnB': ['A', None, None,
                               'B', None, None, None,
                               'C']})

#  ColumnA ColumnB
# 0   Data1       A
# 1   DataA    None
# 2   DataB    None
# 3   Data2       B
# 4   DataC    None
# 5   DataD    None
# 6   DataE    None
# 7   Data3       C

Create something to group with, by filling the missing values:

grouper = df['ColumnB'].ffill()

# 0    A
# 1    A
# 2    A
# 3    B
# 4    B
# 5    B
# 6    B
# 7    C
# Name: ColumnB, dtype: object

Then aggregate to concat strings, following this post:

output = df['ColumnA'].groupby(grouper).apply(lambda x : ', '.join(x)).reset_index()

Final result:

  ColumnB                     ColumnA
0       A         Data1, DataA, DataB
1       B  Data2, DataC, DataD, DataE
2       C                       Data3

TechQA.

Loop through rows in pandas dataframe and update check on the column condition

There are 1 answers

Related Questions in PYTHON

Related Questions in PYTHON-3.X

Related Questions in PANDAS

Related Questions in DATAFRAME

Related Questions in AUTOMATIONANYWHERE

Popular Questions

Popular Tags

Trending Questions