Pandas - Get average value by day and month

29 views Asked by At

I have a Pandas dataframe in this format:

    record_date val
0   2019-01-01  0.417022
1   2019-02-01  0.720324
2   2019-03-01  0.000114
3   2019-04-01  0.302333
4   2019-05-01  0.146756
5   2019-06-01  0.092339
6   2019-07-01  0.186260
7   2019-08-01  0.345561
8   2019-09-01  0.396767
9   2019-10-01  0.538817

How do I add a new column daily_average_val which contains the average value of the val column over all years for the month and day combination?

for 1-1-2024, I want the average of all January 1sts in the dataframe
for 1-1-2025, I want the average of all January 1sts in the dataframe (same as 1-1-2019,1-1-2020,1-1-2021)
for 2-1-2025, I want the average of all February 1sts in the dataframe

1

There are 1 answers

0
mozway On

Set up a custom grouper and groupby.transform:

df['daily_average_val'] = (df.groupby(pd.to_datetime(df['record_date'])
                                        .dt.strftime('%m-%d'))
                             ['val'].transform('mean')
                          )

Or, if you have a string in the format YYYY-MM-DD:

df['daily_average_val'] = (df.groupby(df['record_date'].str[4:])
                             ['val'].transform('mean')
                          )

Output:

  record_date       val  daily_average_val
0  2019-01-01  0.417022           0.417022
1  2019-02-01  0.720324           0.720324
2  2019-03-01  0.000114           0.000114
3  2019-04-01  0.302333           0.302333
4  2019-05-01  0.146756           0.146756
5  2019-06-01  0.092339           0.092339
6  2019-07-01  0.186260           0.186260
7  2019-08-01  0.345561           0.345561
8  2019-09-01  0.396767           0.396767
9  2019-10-01  0.538817           0.538817

NB. since you only have one year, the daily_average_val output is identical to val.