I would like to construct sequences of users' purchasing history using dictionaries in Python. I would like these sequences to be ordered by date.
I have 3 columns in my dataframe:
users items date
1 1 date_1
1 2 date_2
2 1 date_3
2 3 date_1
4 5 date_2
4 1 date_5
4 3 date_3
And the result should look like this:
{1: [[1,date_1],[2,date_2]], 2: [[3,date_1],[1,date_3]], 4: [[5,date_2],[3,date_3],[1,date_5]]}
My code is:
df_sub = df[['uid', 'nid', 'date']]
dic3 = df_sub.set_index('uid').T.to_dict('list')
And my results are:
{36864: [258509L, '2014-12-03'], 548873: [502105L, '2015-09-08'], 42327: [492268L, '2015-01-29'], 548873: [370049L, '2015-02-18'], 36864: [258909L, '2016-01-13'] ... }
But I would like to group by users:
{36864: [[258509L, '2014-12-03'],[258909L, '2016-01-13']], 548873: [[502105L, '2015-09-08'],[370049L, '2015-02-18']], 42327: [492268L, '2015-01-29'] }
Some help, please!
First, set users as the index and perform a groupby with respect to it. Then pass a function that sorts each group by its date column and extracts the underlying array using .values. Use .tolist to get back its list equivalent, which gives you the data in the required format. Finally, use .to_dict, which produces your final output as a dictionary.
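
The steps above can be sketched roughly as follows. This is a minimal example, not necessarily the exact code from the original answer. It assumes the column names from the question's sample table (users, items, date; the real data apparently uses uid and nid instead) and uses made-up ISO-formatted date strings, so that sorting them as text matches chronological order.

```python
import pandas as pd

# Sample data mirroring the question's table. The concrete date strings are
# placeholders (an assumption); ISO format makes string order chronological.
df = pd.DataFrame({
    "users": [1, 1, 2, 2, 4, 4, 4],
    "items": [1, 2, 1, 3, 5, 1, 3],
    "date": ["2015-01-01", "2015-01-02", "2015-01-03", "2015-01-01",
             "2015-01-02", "2015-01-05", "2015-01-03"],
})

result = (
    df.set_index("users")          # users become the index
      .groupby(level=0)            # one group per user
      # sort each group by date, then turn its rows into [item, date] lists
      .apply(lambda g: g.sort_values("date").values.tolist())
      .to_dict()                   # {user: [[item, date], ...]}
)

print(result)
```

With this sample data, result[2] is [[3, '2015-01-01'], [1, '2015-01-03']]. If the date column holds strings in a format that does not sort chronologically, converting it with pd.to_datetime before sorting would be the safer choice.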