I have the following dataframe:

srch_id    price    
1          30       
1          20       
1          25   
3          15
3          102
3          39

Now I want to add a third column that gives the position of each price within its search id (1 = cheapest). This is the result I want:

srch_id    price    price_position
1          30       3
1          20       1
1          25       2
3          15       1
3          102      3
3          39       2

I think I need to use the transform function. However, I can't figure out how to handle the argument that .transform() passes to my function:

def k(r):
    return min(r)

tmp = train.groupby('srch_id')['price']
train['min'] = tmp.transform(k)

Is r a list or a single element here?

2 Answers

anky_91 On Best Solutions

You can use Series.rank() with df.groupby(). Note that rank() returns floats and, by default, gives tied prices their average rank:

df['price_position']=df.groupby('srch_id')['price'].rank()
print(df)

   srch_id  price  price_position
0        1     30             3.0
1        1     20             1.0
2        1     25             2.0
3        3     15             1.0
4        3    102             3.0
5        3     39             2.0
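
To address the question about what .transform() receives: the function is called once per group, and r is a pandas Series holding that group's prices, not a list or a single element. The sketch below is only an illustration under that reading; the helper name position and the column name min_price are made up for the example, and method="min" with a cast to int is one possible way to get whole-number positions.

```python
import pandas as pd

df = pd.DataFrame({
    "srch_id": [1, 1, 1, 3, 3, 3],
    "price": [30, 20, 25, 15, 102, 39],
})

def position(r):
    # r is a pandas Series with the prices of one srch_id group.
    # Returning a Series of the same length gives one value per row.
    return r.rank(method="min").astype(int)

grouped = df.groupby("srch_id")["price"]

# Rank of each price within its group (1 = cheapest).
df["price_position"] = grouped.transform(position)

# A function that returns a scalar (like min) is broadcast to every row of the group.
df["min_price"] = grouped.transform("min")

print(df)
```

If the function returns a single value, as min(r) does in the question, every row in the group gets that same value, which is why min alone cannot produce a position.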

andy On

Is this what you are after? Sort by price, then number the rows within each srch_id:

df['price_position'] = df.sort_values('price').groupby('srch_id').price.cumcount() + 1


Out[1907]:
   srch_id  price  price_position
0        1     30               3
1        1     20               1
2        1     25               2
3        3     15               1
4        3    102               3
5        3     39               2
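
The two answers agree on the sample data, but they can differ when a group contains equal prices. The following is a small sketch with made-up data (the tied frame is hypothetical, not from the question) comparing the two approaches; kind="stable" is added here only so that the order of tied rows is predictable.

```python
import pandas as pd

# Hypothetical data with two equal prices in the same search.
tied = pd.DataFrame({"srch_id": [1, 1, 1], "price": [20, 20, 30]})

# rank(): tied prices share a position (here the lowest one, via method="min").
by_rank = tied.groupby("srch_id")["price"].rank(method="min").astype(int)

# cumcount(): every row gets a distinct position, even when prices are equal.
by_count = (
    tied.sort_values("price", kind="stable")
        .groupby("srch_id")
        .cumcount()
    + 1
).sort_index()

print(by_rank.tolist())
print(by_count.tolist())
```

Which behaviour is preferable depends on whether equal prices should share a position or each row needs a unique one.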