I have the following dataframe:
import numpy as np
import pandas as pd

df2 = pd.read_csv('arangerr2.csv')
   hgroup  lowgroup  value  max_hg_value
0       A         B      0            39
1       A         B     18            39
2       A         B     38            39
3       A         C      0            39
4       A         C     19            39
5       A         C     39            39
I want to calculate the mean of value for every interval of 20, grouped by lowgroup, using np.arange and pd.cut:
start = 0
stop = df2['value'].max()
step = 20
bins = np.arange(start, stop + 10, step)
# shift by 1 so that a value of 0 falls inside the first interval (0, 20]
df2['bins'] = pd.cut(df2['value'] + 1, bins)
df2['mean'] = df2['value'].groupby(
    pd.cut(df2['value'], bins=bins, right=False)
).transform('mean')
df2
   hgroup  lowgroup  value  max_hg_value      bins   mean
0       A         B      0            39   (0, 20]   9.25
1       A         B     18            39   (0, 20]   9.25
2       A         B     38            39  (20, 40]  38.50
3       A         C      0            39   (0, 20]   9.25
4       A         C     19            39   (0, 20]   9.25
5       A         C     39            39  (20, 40]  38.50
This seems to do the job perfectly. However, it only works when there is a single, fixed stop value. How do we solve this when there are multiple hgroups, each with different lowgroups and a different max value?
What do we need to do to go from this:
    Hgroup  LowGroup  Value  Max_HG_value
0        A         B      0            39
1        A         B     18            39
2        A         B     38            39
3        A         C      0            39
4        A         C     19            39
5        A         C     39            39
6        B         D      0            50
7        B         D     17            50
8        B         D     34            50
9        B         D     55            50
10       B         E      0            50
11       B         E     14            50
12       B         E     22            50
13       B         E     50            50
14       C         F      0            69
15       C         F     10            69
16       C         F     25            69
17       C         F     50            69
18       C         F     65            69
19       C         G      0            69
20       C         G      9            69
21       C         G     30            69
22       C         G     48            69
23       C         G     69            69
to this:
    Hgroup  LowGroup  Value  Max_HG_value   Mean
0        A         B      0            39   9.25
1        A         B     18            39   9.25
2        A         B     38            39  38.50
3        A         C      0            39   9.25
4        A         C     19            39   9.25
5        A         C     39            39  38.50
6        B         D      0            50   7.75
7        B         D     17            50   7.75
8        B         D     34            50  28.00
9        B         D     55            50  52.50
10       B         E      0            50   7.75
11       B         E     14            50   7.75
12       B         E     22            50  28.00
13       B         E     50            50  52.50
14       C         F      0            69   4.75
15       C         F     10            69   4.75
16       C         F     25            69  27.50
17       C         F     50            69  49.00
18       C         F     65            69  67.00
19       C         G      0            69   4.75
20       C         G      9            69   4.75
21       C         G     30            69  27.50
22       C         G     48            69  49.00
23       C         G     69            69  67.00
It seems like we need to apply np.arange and pd.cut for every single lowgroup within each hgroup. I have tried multiple ways, but I can't seem to get it right. Can someone help me?
You don't need pd.cut, as your intervals are evenly spaced.
For your second example, it seems you need to group by Hgroup too.
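A minimal sketch of both cases, and not necessarily the only way to do it. It assumes the bins are the half-open intervals [0, 20), [20, 40), and so on, so that floor division of Value by the step can stand in for pd.cut. The frame is rebuilt from the second example in the question (the lowgroup and max columns are left out because they play no part in the calculation), and the names step, bin_id and df_a are introduced here only for illustration.

```python
import pandas as pd

# Hgroup and Value copied from the second example in the question.
df = pd.DataFrame({
    "Hgroup": list("AAAAAA") + list("BBBBBBBB") + list("CCCCCCCCCC"),
    "Value": [0, 18, 38, 0, 19, 39,
              0, 17, 34, 55, 0, 14, 22, 50,
              0, 10, 25, 50, 65, 0, 9, 30, 48, 69],
})

step = 20  # bin width used in the question

# First example: a single Hgroup, so the bin alone is the grouping key.
# Floor division labels [0, 20) as 0, [20, 40) as 1, and so on.
df_a = df[df["Hgroup"] == "A"].copy()
bin_a = (df_a["Value"] // step).rename("bin")
df_a["Mean"] = df_a.groupby(bin_a)["Value"].transform("mean")

# Second example: several Hgroups, so group by (Hgroup, bin) pairs.
bin_id = (df["Value"] // step).rename("bin")
df["Mean"] = df.groupby(["Hgroup", bin_id])["Value"].transform("mean")

print(df_a)
print(df)
```

Because the bin label is computed from each row's own Value, no per-group stop value is needed. On the data above, the Mean column should match the expected output in the question.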