I want to sort values by id_
, Code
, Timestamp
(since time order matters), then groupby d1
using id_
and Code
, then forward fill using ffill
for NaN
for each group, on columns V1
and V2
only, while keeping other columns constant, and return the full table.
d1
:
Type_x id_ Timestamp V1 Code Type_y V2
0 abcd 39-38-30-34 2012-09-20 23:46:05.870 35.5 2 NaN 0
1 abcd 39-38-30-34 2012-09-20 23:46:23.870 44.5 0 NaN 1
2 abcd 39-38-30-34 2012-09-20 23:48:07.870 43.5 0 NaN 1
3 abcd 39-38-30-34 2012-09-20 23:49:48.870 42.5 0 NaN NaN
4 abcd 39-38-30-34 2012-09-20 23:50:44.870 34.5 2 NaN NaN
Tried:
d2 = d1.sort_values(by = ['id_', 'Code', 'Timestamp']).groupby(['id_', 'Code'])['V1', 'V2'].ffill()
which only returned two columns:
V1 V2
69659 21.5 NaN
300886 21.5 1.0
300887 21.5 0.0
70086 23.0 0.0
300955 23.0 1.0
How should I do it properly?
You can use
transform
and do it column by column if you have other columns in your actual dataframe other than the columns you want togroupby
and the columns you want toffill
: