I'm having a little problem with KNeighborsClassifier from sklearn.neighbors.
I have a huge file of movie ratings, where each row represents a user and each column a movie.
I want to suggest a movie a user hasn't watched yet, based on the movies that user has rated and on the ratings of other users.
I tried that with:
model = KNeighborsClassifier(n_neighbors=3)
model.fit(user_rated, others_rated)
suggestList = model.predict_proba(others_unrated)
user_rated is a list of (float) ratings by the current user.
others_rated is a 2D list with other users' ratings of the same movies the current user has rated.
others_unrated is a 2D list with other users' ratings of the movies the current user hasn't watched yet.
I think the problem is that others_rated is a 2D list, but if I compare against only one other user (using others_rated[user_num]) I'll accomplish nothing.
With model.predict_proba(others_unrated) I get the same error whether I pass in one user or many: Incompatible dimension for X and Y matrices.
Any suggestions?
I am unsure of what you hope to accomplish, but let me infer a few things.
From these statements, and without access to your data files/arrays, I would guess this is the correct thing for what you are trying to do:
The two changes I have made are as follows.
First, I am nearly certain you must have X and y swapped around in your call to .fit(). If you don't, your problem is so badly posed (mathematically) that it is almost certain to fail: you are trying to train a model to predict a matrix from a vector (predict lots of information from not very much information).
Second, the way you have posed the problem, n_users should be the column dimension. This is the only thing that makes sense mathematically. The number of columns in X when calling KNeighborsClassifier.predict_proba(X) must be the same as the number of columns in X in the previous call to KNeighborsClassifier.fit(X, y).
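To make the two changes concrete, here is a minimal sketch on made-up data. The toy arrays, the integer 1-5 ratings, the transposes, and the expected-rating ranking at the end are all my assumptions, since I cannot see your real arrays. I used integer ratings because a classifier treats each distinct rating as a class label.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical toy data: 4 other users, 5 movies the current user has rated,
# and 2 movies the current user has not watched yet.
others_rated = np.array([
    [5, 4, 1, 2, 5],
    [4, 5, 2, 1, 4],
    [1, 2, 5, 4, 1],
    [2, 1, 4, 5, 2],
])                                        # shape (n_users, n_rated)
user_rated = np.array([5, 4, 1, 2, 5])    # shape (n_rated,)
others_unrated = np.array([
    [5, 1],
    [4, 2],
    [1, 5],
    [2, 4],
])                                        # shape (n_users, n_unrated)

# Change 2: make n_users the column dimension, so each movie is one sample
# described by the ratings the other users gave it.
X_train = others_rated.T                  # shape (n_rated, n_users)
X_new = others_unrated.T                  # shape (n_unrated, n_users)

# Change 1: X is the 2D matrix, y is the 1D vector of the user's own ratings.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, user_rated)

# One row per unwatched movie, one column per rating value in model.classes_.
proba = model.predict_proba(X_new)

# Assumed ranking rule: suggest the movie with the highest expected rating.
expected_rating = proba @ model.classes_
best_movie = int(np.argmax(expected_rating))
print(proba.shape, expected_rating, best_movie)
```

Both X_train and X_new have n_users columns, which is exactly the condition that the "Incompatible dimension" error is complaining about when it is violated.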