Assume I have a DataFrame like the following:
Month, Gender, State, Value
2010-01, M, S1, 10
2010-02, M, S1, 20
2010-05, M, S1, 26
2010-03, F, S2, 11
I want to add another column holding the Value for the same Gender and State from the previous month (or X
months back), if it exists, i.e.:
Month, Gender, State, Value, Last Value
2010-01, M, S1, 10, NaN
2010-02, M, S1, 20, 10
2010-05, M, S1, 26, NaN (there is no 2010-04 for M, S1)
2010-03, F, S2, 11, NaN
I know I have to groupby(['Gender', 'State']), but then shift() does not work: it just shifts the data by a number of rows and is not aware of the period itself (e.g. when a month is missing).
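To illustrate the problem, here is a minimal sketch. The DataFrame construction is my own reconstruction of the sample above, with Month kept as plain strings, so treat it as an assumption rather than the real data. It shows that a plain row-based shift() within each group fills the 2010-05 row with the 2010-02 value, even though 2010-04 is missing:

```python
import pandas as pd

# Sample data reconstructed from the table above (assumed layout).
df = pd.DataFrame({
    "Month": ["2010-01", "2010-02", "2010-05", "2010-03"],
    "Gender": ["M", "M", "M", "F"],
    "State": ["S1", "S1", "S1", "S2"],
    "Value": [10, 20, 26, 11],
})

# shift(1) moves values down by one ROW inside each (Gender, State) group.
# It knows nothing about the Month column, so the gap between
# 2010-02 and 2010-05 is ignored.
df["Last Value"] = df.groupby(["Gender", "State"])["Value"].shift(1)

print(df)
```

The 2010-05 row ends up with 20 in Last Value, whereas the desired result is NaN.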
I found a way of doing this, though I'm not too happy about it:
So basically, instead of dealing with missing rows in the data, let's just create the missing rows, and then shift() works as expected, i.e.: