I have created a custom item similarity that simulates content-based similarity based on a product taxonomy. I have a user who likes only two items:
UserId ItemId Preference
7656361 1449133 1.00
7656361 18886199 8.00
My custom itemSimilarity returns values in [-1, 1], where 1 means strong similarity and -1 strong dissimilarity. The two items the user liked do not share a lowest common ancestor in the taxonomy tree, so they do not have a similarity of 1. They do, however, have similarities of 0, 0.20 and 0.25 with some other items.
I produce recommendations in the following way:
ItemSimilarity similarity = new CustomItemSimilarity(...);
Recommender recommender = new GenericItemBasedRecommender(model, similarity);
List<RecommendedItem> recommendations = recommender.recommend(7656361, 10);
for (RecommendedItem recommendation : recommendations) {
    System.out.println(recommendation);
}
I am getting the following result:
RecommendedItem[item:899604, value:4.5]
RecommendedItem[item:1449081, value:4.5]
RecommendedItem[item:1449274, value:4.5]
RecommendedItem[item:1449259, value:4.5]
RecommendedItem[item:715796, value:4.5]
RecommendedItem[item:3255539, value:4.5]
RecommendedItem[item:333440, value:4.5]
RecommendedItem[item:1450204, value:4.5]
RecommendedItem[item:1209464, value:4.5]
RecommendedItem[item:1448829, value:4.5]
At first glance one might say: fine, it produces recommendations. But I printed the values returned by itemSimilarity for each pair of items it compares, and I got this surprising result:
ItemID1 ItemID2 Similarity
899604 1449133 -1.0
899604 18886199 -1.0
1449081 1449133 -1.0
1449081 18886199 -1.0
1449274 1449133 -1.0
1449274 18886199 -1.0
1449259 1449133 -1.0
1449259 18886199 -1.0
715796 1449133 -1.0
715796 18886199 -1.0
3255539 1449133 -1.0
3255539 18886199 -1.0
333440 1449133 -1.0
333440 18886199 -1.0
1450204 1449133 -1.0
1450204 18886199 -1.0
1209464 1449133 -1.0
1209464 18886199 -1.0
1448829 1449133 -1.0
1448829 18886199 -1.0
228964 1449133 -1.0
228964 18886199 0.25
57648 1449133 -1.0
57648 18886199 0.0
899573 1449133 -1.0
899573 18886199 0.2
950062 1449133 -1.0
950062 18886199 0.25
5554642 1449133 -1.0
5554642 18886199 0.0
...
and there are a few more. They are not listed in the order they were produced; I just want to make a point. All the items that have a strong dissimilarity of -1 are recommended, and those that have some similarity of 0.0, 0.2 or 0.25 are not recommended at all. How is this possible?
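One way to sanity-check these numbers is with plain arithmetic. The sketch below assumes (this is not confirmed anywhere in this post) that the recommender estimates a preference as a similarity-weighted average of the user's known preferences, i.e. sum(sim * pref) / sum(sim). The class and method names are hypothetical and nothing here calls Mahout itself.

```java
public class WeightedAverageSketch {

    // ASSUMPTION: estimate = sum(sim_i * pref_i) / sum(sim_i).
    // This is a guess at the scoring rule, not Mahout's confirmed code.
    public static double estimate(double[] similarities, double[] preferences) {
        double weighted = 0.0;
        double totalSimilarity = 0.0;
        for (int i = 0; i < similarities.length; i++) {
            weighted += similarities[i] * preferences[i];
            totalSimilarity += similarities[i];
        }
        return weighted / totalSimilarity;
    }

    public static void main(String[] args) {
        // Preferences of user 7656361 for items 1449133 and 18886199.
        double[] prefs = {1.0, 8.0};

        // Candidate with similarity -1 to both liked items (e.g. 899604):
        // (-1*1 + -1*8) / (-1 + -1) = 4.5, the signs cancel out.
        System.out.println(estimate(new double[]{-1.0, -1.0}, prefs));

        // Candidate with similarities -1 and 0.25 (e.g. 228964):
        // (-1*1 + 0.25*8) / (-1 + 0.25) is negative, about -1.33.
        System.out.println(estimate(new double[]{-1.0, 0.25}, prefs));
    }
}
```

Under that assumption, two negative similarities cancel in the division and yield the plain average of 1.0 and 8.0, which matches the 4.5 in the output above, while a mix of negative and positive similarities gives a negative estimate that would rank below it.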
The itemSimilarity method of the ItemSimilarity interface has the following explanation:
Implementations of this interface define a notion of similarity between two items. Implementations should return values in the range -1.0 to 1.0, with 1.0 representing perfect similarity.
If I use similarities in [0, 1], I get the following recommendations:
RecommendedItem[item:228964, value:8.0]
RecommendedItem[item:899573, value:8.0]
RecommendedItem[item:950062, value:8.0]
And the pairwise similarities are as follows (only for those three; for the others they are 0):
228964 1449133 0.0
228964 18886199 0.25
950062 1449133 0.0
950062 18886199 0.25
228964 1449133 0.0
228964 18886199 0.25
EDIT: I also printed the items most similar to 1449133 and 18886199 with ((GenericItemBasedRecommender) delegate).mostSimilarItems(new long[]{1449133, 18886199}, 10)
and I got: [RecommendedItem[item:228964, value:0.125], RecommendedItem[item:950062, value:0.125], RecommendedItem[item:899573, value:0.1]]
For item 18886199 alone, ((GenericItemBasedRecommender) delegate).mostSimilarItems(new long[]{18886199}, 10) returned [RecommendedItem[item:228964, value:0.25]]. For 1449133 alone there are no similar items.
I don't understand why it does not work with strong dissimilarity?
Another question is why all the predicted preference values are 8.0 or 4.5. I can see that only item 18886199 is similar to the recommended items, but is there a way to multiply the value of 8.0 by the similarity, in this case 0.25, and get 2.0 instead of 8.0? I can't do this while computing the similarity because I don't know the user yet, but I think it should be done during the recommendation phase. Isn't this how the recommender should work, or should I create a custom recommender and do the job in a custom way?
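To make the difference between the two scoring rules concrete, here is a minimal sketch in plain Java. It assumes the current behaviour is a normalized, similarity-weighted average (an assumption, not something confirmed in this post) and contrasts it with the unnormalized sum of sim * pref that I am asking about. Both method names are hypothetical and no Mahout classes are used.

```java
public class ScaledScoreSketch {

    // ASSUMED current behaviour: sum(sim * pref) / sum(sim).
    public static double weightedAverage(double[] sims, double[] prefs) {
        double weighted = 0.0;
        double total = 0.0;
        for (int i = 0; i < sims.length; i++) {
            weighted += sims[i] * prefs[i];
            total += sims[i];
        }
        return weighted / total;
    }

    // Desired behaviour: keep the similarity in the score, i.e. sum(sim * pref)
    // without dividing by the total similarity.
    public static double scaledScore(double[] sims, double[] prefs) {
        double score = 0.0;
        for (int i = 0; i < sims.length; i++) {
            score += sims[i] * prefs[i];
        }
        return score;
    }

    public static void main(String[] args) {
        double[] prefs = {1.0, 8.0};   // items 1449133 and 18886199
        double[] sims = {0.0, 0.25};   // similarities of item 228964 in the [0,1] case

        System.out.println(weightedAverage(sims, prefs)); // 8.0
        System.out.println(scaledScore(sims, prefs));     // 2.0
    }
}
```

With the normalization, the single non-zero similarity divides itself out and the estimate is just the preference of 8.0; without it, the score is 0.25 * 8.0 = 2.0, which is the value I would expect.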
I would really appreciate it if someone from the Mahout community could give me directions.