Item Based Similarity Metric

I am using Apache Mahout to write an item-based recommender (based on similar item ratings by users), and I was wondering which of the following similarity metrics would be the best to use:

Pearson, Spearman, Euclidean, Tanimoto and Loglikelihood

Dragan Milcevski On

If you have preference values, you should use the Pearson correlation or Euclidean distance similarity metrics. If you don't have preference values, you should use the Tanimoto coefficient or log-likelihood. To choose among the narrowed-down candidates, you should run an evaluation on your own dataset; that is what Mahout's evaluation framework is for. You can evaluate with many measures, such as Mean Squared Error (MSE), Mean Absolute Error (MAE), precision, recall, MAP...
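To make the distinction concrete, here is a minimal plain-Java sketch of one metric from each group. It does not use Mahout; the class and method names (`SimilaritySketch`, `tanimoto`, `pearson`) are made up for illustration, and the inputs are assumed to be already aligned (for Pearson, only users who rated both items).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class; not part of Mahout.
final class SimilaritySketch {

    // Tanimoto coefficient for data WITHOUT preference values:
    // (users who touched both items) / (users who touched either item).
    static double tanimoto(Set<Long> usersOfA, Set<Long> usersOfB) {
        Set<Long> common = new HashSet<>(usersOfA);
        common.retainAll(usersOfB);
        int union = usersOfA.size() + usersOfB.size() - common.size();
        return union == 0 ? Double.NaN : (double) common.size() / union;
    }

    // Pearson correlation for data WITH preference values.
    // ratingsA[k] and ratingsB[k] must come from the same user k.
    static double pearson(double[] ratingsA, double[] ratingsB) {
        if (ratingsA.length != ratingsB.length) {
            throw new IllegalArgumentException("rating arrays must be aligned");
        }
        int n = ratingsA.length;
        if (n == 0) {
            return Double.NaN;
        }
        double meanA = 0, meanB = 0;
        for (int k = 0; k < n; k++) {
            meanA += ratingsA[k];
            meanB += ratingsB[k];
        }
        meanA /= n;
        meanB /= n;
        double cov = 0, varA = 0, varB = 0;
        for (int k = 0; k < n; k++) {
            double da = ratingsA[k] - meanA;
            double db = ratingsB[k] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        double denom = Math.sqrt(varA * varB);
        // Undefined when one item has constant ratings.
        return denom == 0 ? Double.NaN : cov / denom;
    }

    public static void main(String[] args) {
        Set<Long> a = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        Set<Long> b = new HashSet<>(Arrays.asList(2L, 3L, 4L));
        System.out.println("tanimoto = " + tanimoto(a, b));
        System.out.println("pearson  = "
                + pearson(new double[] {1, 2, 3}, new double[] {2, 4, 6}));
    }
}
```

Note that Tanimoto only looks at who interacted with an item, so it ignores rating values entirely, while Pearson needs the values and is undefined when an item's ratings do not vary.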

I've also coded Adjusted Cosine Similarity, a variant of Pearson correlation that gives better results, but it's slower.
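For reference, a minimal sketch of the usual adjusted cosine definition follows: each rating is centered on its user's mean before taking the cosine between two item columns. This is not the code mentioned above and not a Mahout class; the name `AdjustedCosineSketch`, the dense `ratings[user][item]` layout, and the use of `NaN` for missing ratings are all assumptions made for the example.

```java
// Hypothetical illustration class; not part of Mahout.
final class AdjustedCosineSketch {

    // Mean of the ratings a user actually gave (NaN marks "not rated").
    static double userMean(double[] userRatings) {
        double sum = 0;
        int count = 0;
        for (double r : userRatings) {
            if (!Double.isNaN(r)) {
                sum += r;
                count++;
            }
        }
        return count == 0 ? Double.NaN : sum / count;
    }

    // Adjusted cosine between two items, over users who rated both.
    // ratings[u][i] is user u's rating of item i, or NaN if missing.
    static double adjustedCosine(double[][] ratings, int itemI, int itemJ) {
        double dot = 0, normI = 0, normJ = 0;
        for (double[] user : ratings) {
            double ri = user[itemI];
            double rj = user[itemJ];
            if (Double.isNaN(ri) || Double.isNaN(rj)) {
                continue; // skip users who did not rate both items
            }
            double mean = userMean(user);
            double di = ri - mean; // deviation from this user's own mean
            double dj = rj - mean;
            dot += di * dj;
            normI += di * di;
            normJ += dj * dj;
        }
        double denom = Math.sqrt(normI) * Math.sqrt(normJ);
        return denom == 0 ? Double.NaN : dot / denom;
    }

    public static void main(String[] args) {
        double[][] ratings = {
            {1, 2, 3},
            {3, 2, 1},
            {4, Double.NaN, 4},
        };
        System.out.println("adjusted cosine(0, 2) = " + adjustedCosine(ratings, 0, 2));
    }
}
```

Recomputing the user mean inside the loop keeps the sketch short; caching the means once per user would be the obvious first step if speed matters.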