Understanding Spark MLlib ALS.trainImplicit input format

I'm trying to build a recommender system based on purchase history using trainImplicit. My input values are in the range [1, +inf) (the sum of views and purchases).

So an element of my input RDD looks like this: [(user_id, item_id), rating] --> [(123, 5564), 6], meaning the user (id = 123) interacted with the item (id = 5564) 6 times.

Should I also add elements such as [(user_id, item_id), rating] --> [(123, 2222), 0] to my RDD, meaning that the given user has never interacted with the given item, or does ALS.trainImplicit handle this implicitly?

There is 1 answer

user7337271 (best answer)

It is not necessary (for implicit) and shouldn't be done (for explicit), so in this case pass only the data you actually have.
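To make that concrete, here is a minimal PySpark sketch of passing only the observed interactions to ALS.trainImplicit. The helper names (to_triples, train_implicit), the sample data, and the hyperparameter values are assumptions for illustration, not something from the question, and train_implicit expects an existing SparkContext to be passed in as sc.

```python
def to_triples(records):
    """Flatten ((user_id, item_id), count) records into (user, item, count) triples.

    Hypothetical helper: only observed interactions (count > 0) are kept.
    Pairs the user never interacted with are simply absent; no zeros are added.
    """
    return [(int(u), int(i), float(c)) for (u, i), c in records if c > 0]


def train_implicit(sc, records, rank=10, iterations=10, lambda_=0.01, alpha=1.0):
    """Train an implicit-feedback ALS model on the observed interactions.

    The rank, iterations, lambda_ and alpha defaults here are placeholder
    values (an assumption), not tuned recommendations.
    """
    # Imported lazily so to_triples can be used without Spark installed.
    from pyspark.mllib.recommendation import ALS, Rating

    ratings = sc.parallelize(to_triples(records)).map(lambda t: Rating(*t))
    return ALS.trainImplicit(ratings, rank, iterations=iterations,
                             lambda_=lambda_, alpha=alpha)


# Observed interactions only, in the question's [(user_id, item_id), rating] shape.
observed = [((123, 5564), 6), ((123, 777), 1), ((456, 5564), 2)]
triples = to_triples(observed)
print(triples)
```

With a running context, something like model = train_implicit(sc, observed) followed by model.predict(123, 2222) would score a pair that never appeared in the input, which is the case the question was worried about.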