I just discovered that summarizing a data table (I am relatively new to r and data.table) doesn't result in a sorted data tabe, nor does a join:
starting data table:
> DailyDataICount
ID week WeekDay Qty
1: 33739 1746 2 9369
2: 33739 1746 3 9981
3: 33739 1746 4 10109
4: 33739 1746 5 9012
5: 33739 1746 6 10387
6: 33739 1746 7 2
7: 33739 1747 1 7264
8: 33739 1747 2 9195
9: 33739 1747 3 8865
10: 33739 1747 4 8613
11: 33739 1747 5 9154
12: 33739 1747 6 9754
13: 33739 1747 7 11565
14: 33739 1748 1 9210
15: 33739 1748 2 8579
16: 33739 1748 3 8406
17: 33739 1748 4 8161
18: 33739 1748 5 8778
19: 33739 1748 6 9884
20: 33739 1748 7 10365
summarization:
> MeanCount <- DailyDataICount[,mean(Qty),by="ID,WeekDay"]
> setnames(MeanCount,"V1","ObjQty")
> MeanCount
ID WeekDay ObjQty
1: 33739 2 9047.667
2: 33739 3 9084.000
3: 33739 4 8961.000
4: 33739 5 8981.333
5: 33739 6 10008.333
6: 33739 7 7310.667
7: 33739 1 8237.000
join:
> Count_SM_FM
ID WeekDay ItemType QtyType
1: 33739 2 FM 995
2: 33739 2 SM 15241
3: 33739 3 FM 1036
4: 33739 3 SM 15387
5: 33739 4 FM 1002
6: 33739 4 SM 15449
7: 33739 5 FM 990
8: 33739 5 SM 15388
9: 33739 6 FM 1344
10: 33739 6 SM 16016
11: 33739 7 SM 13840
12: 33739 1 FM 1015
13: 33739 1 SM 11456
14: 33739 7 FM 1710
> setkey(Count_SM_FM,ID,WeekDay)
> Count_SM_FM <- Count_SM_FM[MeanCount]
>
> Count_SM_FM
ID WeekDay ItemType QtyType ObjQty
1: 33739 2 FM 995 9047.667
2: 33739 2 SM 15241 9047.667
3: 33739 3 FM 1036 9084.000
4: 33739 3 SM 15387 9084.000
5: 33739 4 FM 1002 8961.000
6: 33739 4 SM 15449 8961.000
7: 33739 5 FM 990 8981.333
8: 33739 5 SM 15388 8981.333
9: 33739 6 FM 1344 10008.333
10: 33739 6 SM 16016 10008.333
11: 33739 7 SM 13840 7310.667
12: 33739 7 FM 1710 7310.667
13: 33739 1 FM 1015 8237.000
14: 33739 1 SM 11456 8237.000
is it because it's more efficient?
I found this out because I had to join Count_SM_FM
with a different data.table but using the same key (which I hadn't set again) and I got an error. I will solve the immediate problem by setting the key again. I know this is a bit of philosophical question, not keeping the order probably usually saves time but if in most cases people need to use the results as sorted...
Thanks!
The order of elements in a join of the form
x[i]
is determined byi
. In your case,i
isMeanCount
andx
isCount_SM_FM
andi
is not sorted.To get the result you seek, you should just do: