I would like to construct sequences of each user's purchasing history using dictionaries in Python. I would like these sequences to be ordered by date.
I have 3 columns in my dataframe:
users items date
1 1 date_1
1 2 date_2
2 1 date_3
2 3 date_1
4 5 date_2
4 1 date_5
4 3 date_3
And the result should be like this:
{1: [[1,date_1],[2,date_2]], 2:[[3,date_1],[1,date_3]], 4:[[5,date_2],[3,date_3],[1,date_5]]}
My code is:
df_sub = df[['uid', 'nid', 'date']]
dic3 = df_sub.set_index('uid').T.to_dict('list')
And my results are:
{36864: [258509L, '2014-12-03'], 548873: [502105L, '2015-09-08'], 42327: [492268L, '2015-01-29'], 548873: [370049L, '2015-02-18'], 36864: [258909L, '2016-01-13'] ... }
But I would like to group by users:
{36864: [[258509L, '2014-12-03'],[258909L, '2016-01-13']], 548873: [[502105L, '2015-09-08'],[370049L, '2015-02-18']], 42327: [492268L, '2015-01-29'] }
Some help, please!
First, set users as the index and perform a groupby with respect to it. Then pass a function (for example via apply) that sorts each group by its date column and extracts the underlying array using .values. Use .tolist() to convert that array to its list equivalent, which gives each user's purchases in the required format. Finally, use .to_dict() to get the final output as a dictionary.
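The steps described above could be sketched roughly as follows. This is a minimal example, not necessarily the exact code originally posted: the column names users, items and date follow the sample table in the question, and the ISO-formatted date strings are placeholder values assumed here so that sorting them as text matches chronological order.

```python
import pandas as pd

# Sample data shaped like the table in the question.
# The concrete date strings are placeholders (an assumption).
df = pd.DataFrame({
    "users": [1, 1, 2, 2, 4, 4, 4],
    "items": [1, 2, 1, 3, 5, 1, 3],
    "date": ["2016-01-01", "2016-01-02", "2016-01-03", "2016-01-01",
             "2016-01-02", "2016-01-05", "2016-01-03"],
})

result = (
    df.set_index("users")      # users become the index
      .groupby(level=0)        # one group per user
      # sort each group by date, then turn its rows into [item, date] lists
      .apply(lambda g: g.sort_values("date").values.tolist())
      .to_dict()               # Series of lists -> {user: [[item, date], ...]}
)

print(result)
```

If the real date column holds strings in a format that does not sort chronologically as text, converting it with pd.to_datetime before sorting is probably the safer route.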