Iterating over an entire DataFrame in pandas


I'm trying to do something similar to this...

My intention is to write a for loop in pandas that iterates over the whole DataFrame and picks out every value that is greater than four. Whenever the condition is satisfied, it should add an entry to a new column containing the column name and the ID. Something like this (the output column):

[Image: desired output column, not recoverable]

I tried the following code, but it doesn't work:

list = []
for col in df.columns:
    for row in df[col]:
        if row>4:
            list.append(df(row).index, col)

Could somebody help me? Thank you so much.


There is 1 answer

Answered by Timeless

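Here is a suggestion using pandas.DataFrame.loc and pandas.Series.ge. Note that ge(4) keeps values greater than or equal to 4, which is what the output below reflects; use gt(4) instead if you need values strictly greater than 4: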

import pandas as pd
from itertools import chain

# For each "X" column, take the IDs of the rows whose value is >= 4
# and prefix them with the column name, e.g. "X40, 1100".
collected_vals = []
for col in df.filter(like="X").columns:
    collected_vals.append(df.loc[df[col].ge(4), "ID"].astype(str).radd(f"{col}, "))

# if a list is needed
l = list(chain.from_iterable(ser.tolist() for ser in collected_vals))

# if a Series is needed
ser = pd.concat(collected_vals, ignore_index=True)

# if a DataFrame is needed
out_df = pd.concat(collected_vals, ignore_index=True).to_frame("OUTPUT")

# Output

print(out_df)

      OUTPUT
0  X40, 1100
1  X40, 1200
2   X50, 700
3   X50, 800
4   X50, 900
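
The same result can probably also be obtained without an explicit loop. The following is only a sketch of one possible alternative, not part of the original suggestion: it reshapes the frame with DataFrame.melt and filters the long table. The names long_df, hits, "column" and "value" are made up for illustration, and the data is copied from the input shown in this answer.

```python
import pandas as pd

# Same data as the input table shown in this answer.
df = pd.DataFrame(
    {
        "X40": [1, 2, 1, 3, 4, 6],
        "X50": [5, 6, 8, 2, 3, 1],
        "ID": [700, 800, 900, 1000, 1100, 1200],
    }
)

# Reshape to long format: one row per (ID, column name, value).
# "column" and "value" are illustrative names chosen for this sketch.
long_df = df.melt(id_vars="ID", var_name="column", value_name="value")

# Keep values >= 4, like ge(4) above (use .gt(4) for strictly greater than 4).
hits = long_df[long_df["value"].ge(4)]

# Build the "column, ID" strings and collect them in a one-column DataFrame.
out_df = (
    (hits["column"] + ", " + hits["ID"].astype(str))
    .reset_index(drop=True)
    .to_frame("OUTPUT")
)
print(out_df)
```

Because melt stacks the value columns one after another, the rows come out grouped by column name, in the same order as the loop-based version.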

Input used:

print(df)

   X40  X50    ID
0    1    5   700
1    2    6   800
2    1    8   900
3    3    2  1000
4    4    3  1100
5    6    1  1200
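
For completeness, here is a hedged sketch of how the nested loop from the question could be made to work, in case an explicit loop is preferred. The original attempt fails for a few reasons: df(row) calls the DataFrame as if it were a function, list.append accepts only one argument, and iterating over df[col] yields the values alone, so the row label needed to look up the ID is lost. The name results is illustrative, and the threshold follows the answer's ge(4).

```python
import pandas as pd

# Same data as the input table shown above.
df = pd.DataFrame(
    {
        "X40": [1, 2, 1, 3, 4, 6],
        "X50": [5, 6, 8, 2, 3, 1],
        "ID": [700, 800, 900, 1000, 1100, 1200],
    }
)

results = []  # not named `list`, which would shadow the built-in
for col in df.columns.drop("ID"):  # every column except ID
    # Series.items() yields (index label, value) pairs, so the row is known.
    for idx, value in df[col].items():
        if value >= 4:  # matches ge(4); use `value > 4` for strictly greater
            results.append(f"{col}, {df.at[idx, 'ID']}")

print(results)
```

On large frames the vectorized approach in the answer is likely to be faster; the loop is mainly useful for seeing what happens step by step.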