I am trying to build a movie recommender system from the MovieLens 100k dataset (https://grouplens.org/datasets/movielens/100k/), following the instructions in a tutorial.
library(dplyr)
library(recommenderlab)
library(magrittr)
data <- read.table("u.data", header = FALSE, stringsAsFactors = TRUE)
head(data)
V1 V2 V3 V4
1 196 242 3 881250949
2 186 302 3 891717742
3 22 377 1 878887116
4 244 51 2 880606923
5 166 346 1 886397596
6 298 474 4 884182806
Explanation: V1 is the user id, V2 is the item id, and V3 is the rating.
Now I need to convert this log format into a rating matrix (ratingMatrix), with one row per user and one column per item. The result should look like this:
1 2 3 4 5 6 7 8 9 10
1 5 3 4 3 3 5 4 1 5 3
2 4 NA NA NA NA NA NA NA NA 2
3 NA NA NA NA NA NA NA NA NA NA
4 NA NA NA NA NA NA NA NA NA NA
5 4 3 NA NA NA NA NA NA NA NA
6 4 NA NA NA NA NA 2 4 4 NA
7 NA NA NA 5 NA NA 5 5 5 4
8 NA NA NA NA NA NA 3 NA NA NA
9 NA NA NA NA NA 5 4 NA NA NA
10 4 NA NA 4 NA NA 4 NA 4 NA
code:
temp = data %>% select(1:3) %>% spread(V2,V3) %>% select(-1)
temp[1:10,1:10]
Error in spread(., V2, V3) : could not find function "spread"
Try replacing library(dplyr) with library(tidyverse). The spread function now lives in the tidyr package, which is part of the tidyverse along with dplyr.
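As a minimal sketch of that fix, the snippet below runs the question's pipeline with tidyr attached. The ratings data frame is a hypothetical toy stand-in for u.data (same V1 to V4 layout), so the values are made up for illustration. It also shows pivot_wider(), which the tidyr documentation recommends over the superseded spread(); both assume each user rates an item at most once.

```r
library(dplyr)
library(tidyr)  # provides spread() and pivot_wider()

# Hypothetical toy ratings in the same layout as u.data
# (V1 = user id, V2 = item id, V3 = rating, V4 = timestamp).
ratings <- data.frame(
  V1 = c(1, 1, 2, 3, 3),
  V2 = c(10, 20, 10, 20, 30),
  V3 = c(5, 3, 4, 2, 1),
  V4 = c(881250949, 891717742, 878887116, 880606923, 886397596)
)

# Same pipeline as in the question; spread() is found once tidyr is attached.
wide_spread <- ratings %>%
  select(1:3) %>%       # keep user id, item id, rating
  spread(V2, V3) %>%    # one column per item id, filled with ratings
  select(-1)            # drop the user id column

# Equivalent reshaping with pivot_wider().
wide_pivot <- ratings %>%
  select(V1, V2, V3) %>%
  pivot_wider(names_from = V2, values_from = V3) %>%
  select(-V1)

print(wide_spread)
```

Loading library(tidyverse) instead of dplyr and tidyr separately should work the same way, since it attaches both packages.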