Generating monthly-level data using ffill and bfill on multiple columns of a log file


I have a log file in the following format:

Item Month_end_date old_price new_price row
A 2022-03-31 25 30 1
A 2022-06-30 30 40 2
A 2022-08-31 40 45 3
B 2022-04-30 80 70 4

Here, it is assumed that the price of item A from the start of the year was 25, i.e. the old_price in the 1st row of the table above. I want to derive monthly prices from this table. The ideal output looks like the table below:

Item Month_end_date price
A 2022-01-31 25
A 2022-02-28 25
A 2022-03-31 30
A 2022-04-30 30
A 2022-05-31 30
A 2022-06-30 40
A 2022-07-31 40
A 2022-08-31 45
A 2022-09-30 45
A 2022-10-31 45
A 2022-11-30 45
A 2022-12-31 45
B 2022-01-31 80
B 2022-02-28 80
B 2022-03-31 80
B 2022-04-30 70
B 2022-05-31 70
B 2022-06-30 70
B 2022-07-31 70
B 2022-08-31 70
B 2022-09-30 70
B 2022-10-31 70
B 2022-11-30 70
B 2022-12-31 70

There is 1 answer

Answer by mozway:

IIUC, you can reshape, fill in the missing periods and ffill/bfill per group:

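import pandas as pd

(df
 # parse the dates so they match the month-end index built below
 .assign(**{'Month_end_date': pd.to_datetime(df['Month_end_date'])})
 .set_index(['Item', 'Month_end_date'])
 [['old_price', 'new_price']]
 # one row per item and month-end of 2022 (NaN where the log has no entry)
 .reindex(pd.MultiIndex
            .from_product([df['Item'].unique(),
                           pd.date_range('2022-01-01',
                                         '2022-12-31',
                                         freq='M')],  # pandas >= 2.2 spells this alias 'ME'
                          names=['Items', 'Month_end_date'])
         )
 # interleave old_price/new_price per date so the fills run across both columns
 .stack(dropna=False)
 # ffill carries the latest price forward; bfill fills the leading months
 # from the first old_price; group_keys=False keeps the index on pandas >= 2.0
 .groupby(level=0, group_keys=False).apply(lambda g: g.ffill().bfill())
 .unstack()['new_price']
 .reset_index(name='price')
)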

output:

   Items Month_end_date  price
0      A     2022-01-31   25.0
1      A     2022-02-28   25.0
2      A     2022-03-31   30.0
3      A     2022-04-30   30.0
4      A     2022-05-31   30.0
5      A     2022-06-30   40.0
6      A     2022-07-31   40.0
7      A     2022-08-31   45.0
8      A     2022-09-30   45.0
9      A     2022-10-31   45.0
10     A     2022-11-30   45.0
11     A     2022-12-31   45.0
12     B     2022-01-31   80.0
13     B     2022-02-28   80.0
14     B     2022-03-31   80.0
15     B     2022-04-30   70.0
16     B     2022-05-31   70.0
17     B     2022-06-30   70.0
18     B     2022-07-31   70.0
19     B     2022-08-31   70.0
20     B     2022-09-30   70.0
21     B     2022-10-31   70.0
22     B     2022-11-30   70.0
23     B     2022-12-31   70.0
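
As a cross-check, here is a minimal sketch of the same fill logic without `stack` and `apply`, whose behaviour differs between pandas versions. It is an alternative formulation, not the code above: it forward-fills `new_price` per item and, for the months before an item's first log entry, falls back to the back-filled `old_price`. The names `months`, `full_index`, `wide`, `carried` and `first_old` are made up for illustration, and the integer cast assumes whole-number prices as in the sample data.

```python
import pandas as pd

# Sample log from the question.
df = pd.DataFrame({
    "Item": ["A", "A", "A", "B"],
    "Month_end_date": pd.to_datetime(
        ["2022-03-31", "2022-06-30", "2022-08-31", "2022-04-30"]),
    "old_price": [25, 30, 40, 80],
    "new_price": [30, 40, 45, 70],
})

# Month-end dates of 2022; passing an offset object avoids the 'M'/'ME' alias change.
months = pd.date_range("2022-01-01", "2022-12-31", freq=pd.offsets.MonthEnd())

# Every (item, month-end) combination.
full_index = pd.MultiIndex.from_product(
    [df["Item"].unique(), months], names=["Item", "Month_end_date"])

# Log rows placed on the full grid; months without a log entry become NaN.
wide = (df.set_index(["Item", "Month_end_date"])[["old_price", "new_price"]]
          .reindex(full_index))

by_item = wide.groupby(level="Item")
carried = by_item["new_price"].ffill()    # latest known price, carried forward
first_old = by_item["old_price"].bfill()  # covers months before the first change

result = (carried.fillna(first_old)
                 .astype(int)             # assumption: prices are whole numbers
                 .rename("price")
                 .reset_index())
```

Splitting the fill into two grouped steps keeps the original index intact, so no `group_keys` handling is needed, and the result column is named `Item` as in the question.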