Pandas date ranges and averaging the counts

59 views Asked by At

I have the below pandas dataframe

stdate      enddate  count  
2004-01-04  2004-01-10  68  
2004-01-11  2004-01-17  100   
2004-01-18  2004-01-24  83  
2004-01-25  2004-01-31  56    
2004-02-01  2004-02-07  56  
2004-02-08  2004-02-14  68    
2004-02-15  2004-02-21  81  
2004-02-22  2004-02-28  68    
2004-02-29  2004-03-06  76

I want to take an average of the count based on the month:

that is I wanted it like:

date    count
2004-01 (306/25-4)
2004-02 (349/28-01)

for example the second month as the enddate 3, (I need help in aggregarting this counts using pandas)

1

There are 1 answers

0
Ami Tavory On BEST ANSWER

It's not that complicated, but there is a bit of work, and I think you should ditch pandas for most of the calculation, and build a dataframe right at the end.

Suppose you have two datetime objects, b and e. The difference between them in days is

(e - b).days

This gives you how the count of a row is divided by days.

Also, given a month, you can find the last day of the month using the calendar module.

So, you could do the following:

counts_per_month = {}
def process_row(b, e, count):
    ...
    # Find how count splits between the months, 
    #    update counts_per_month accordingly

Now call

df.apply(lambda r: process_row(r.stdate, r.enddate, r.count), axis=1)

at which point counts_per_month will contain your data. Finish off by calling pd.DataFrame.from_dict.