I have followed the guide given in the link http://ampcamp.berkeley.edu/big-data-mini-course/movie-recommendation-with-mllib.html
But this is outdated as it uses spark Mlib RDD approach. The New Spark 2.0 has DataFrame approach. Now My problem is I have got the updated code
val ratings = spark.read.textFile("data/mllib/als/sample_movielens_ratings.txt")
.map(parseRating)
.toDF()
val Array(training, test) = ratings.randomSplit(Array(0.8, 0.2))
// Build the recommendation model using ALS on the training data
val als = new ALS()
.setMaxIter(5)
.setRegParam(0.01)
.setUserCol("userId")
.setItemCol("movieId")
.setRatingCol("rating")
val model = als.fit(training)
// Evaluate the model by computing the RMSE on the test data
val predictions = model.transform(test)
Now Here is the problem, In the old code the model that was obtained was a MatrixFactorizationModel, Now it has its own model(ALSModel)
In MatrixFactorizationModel you could directly do
val recommendations = bestModel.get
.predict(userID)
Which will give the list of products with highest probability of user liking them.
But Now there is no .predict method. Any Idea how to recommend a list of products given a user Id
Use
transform
method on model:There's a jira ticket to implement recommend(User|Product) method, but it's not yet on default branch
Now you have DataFrame with score for user
You can simply use orderBy and limit to show N recommended products:
DataFrame dataFrameToPredict can be some large user-product DataFrame, for example all users x all products