Iterate over dates in a Pandas Dataframe to get the count of a different column per week

706 views Asked by At

I am a java developer finding it a bit tricky to switch to python and Pandas. Im trying to iterate over dates of a Pandas Dataframe which looks like below,

    sender_user_id    created
0   1                 2016-12-19 07:36:07.816676
1   33                2016-12-19 07:56:07.816676
2   1                 2016-12-19 08:14:07.816676
3   15                2016-12-19 08:34:07.816676

what I am trying to get is a dataframe which gives me a count of the total number of transactions that have occurred per week. From the forums I have been able to get syntax for 'for loops' which iterate over indexes only. Basically I need a result dataframe which looks like this. The value field contains the count of sender_user_id and the date needs to be modified to show the starting date per week.

    date          value
0   2016-12-09    20
1   2016-12-16    36
2   2016-12-23    56
3   2016-12-30    32

Thanks in advance for the help.

1

There are 1 answers

4
jezrael On BEST ANSWER

I think you need resample by week and aggregate size:

#cast to datetime if necessary
df.created = pd.to_datetime(df.created)
print (df.resample('W', on='created').size().reset_index(name='value'))
     created  value
0 2016-12-25      4

If need another offsets:

df.created = pd.to_datetime(df.created)
print (df.resample('W-FRI', on='created').size().reset_index(name='value'))
     created  value
0 2016-12-23      4

If need number of unique values per week aggregate by nunique:

df.created = pd.to_datetime(df.created)
print (df.resample('W-FRI', on='created')['sender_user_id'].nunique()
         .reset_index(name='value'))
     created  value
0 2016-12-23      3