I have a dataframe and would like to resample it and aggregate over average_period=14 days. The difficult part (for me) is that I would like to have my aggregation bins as starting from today, so [today, today-14], [today-14, today-28], [today-28, today-42], etc. Today's date is always in df but the dates before are not necessarily always there.
If I do the below I get the maximum date 2024-01-23 but it should be 2024-01-13. How can I achieve this?
df = pd.DataFrame({'date_time':['2023-09-19', '2023-09-29', '2023-11-10', '2024-01-13'],
'col1':['0.100', '0.100', '0.100', '0.100'],
'col2':['r', 'r', 'r', 'r'],
'tot':[900, 800, 300, 400],
'hit':[24, 56, 26, 40],
'percent':[33, 23, 33, 31]})
df = df.assign(date_time=pd.to_datetime(df.date_time))
average_period = 14
(df
.set_index('date_time')
.groupby(['col1', 'col2']).resample(f'{average_period}D',
closed='right',
label='right').agg({'hit':'sum',
'tot':'sum',
'percent':'mean'})
.reset_index())