I am trying to find the similarity between each pair of items. The items are in a Python dictionary, and I compute the similarity one pair at a time. The code is -
def allSimilarity(itemsDict, similarityMetric):
    itemList = itemsDict.keys()
    itemSimilarityDict = {}
    for item1 in itemList:
        itemSimilarityDict[item1] = {}
        for item2 in itemList:
            if item1 == item2:
                continue
            itemSimilarityDict[item1][item2] = similarityMetric(itemsDict, item1, item2)
    return itemSimilarityDict
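To show how this is called, here is a toy example (the small dictionary and the jaccard stand-in below are purely illustrative placeholders, not my real data or metric):

# Purely illustrative data and metric, just to show the call shape.
items = {"a": {1, 2, 3}, "b": {2, 3, 4}, "c": {9}}

def jaccard(d, x, y):
    # Hypothetical stand-in metric: intersection over union of two sets.
    return len(d[x] & d[y]) / float(len(d[x] | d[y]))

sims = allSimilarity(items, jaccard)
# sims["a"]["b"] == 0.5, sims["a"]["c"] == 0.0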
The problem is that the outer loop takes about 5 seconds per item. I have ~300,000 items, so the whole computation would take ~18 days. Is there any way to speed this up? Can I use packages like Theano or TensorFlow and run it on a GPU? Or could I spin up machines in the cloud and parallelize the process?
I don't think a machine learning library would be particularly helpful here, since there are no operations or building blocks readily available in them for this type of all-to-all similarity comparison.
I think you'd have better luck looking at more generic parallelization solutions: OpenMP, TBB, MapReduce, AVX, CUDA, MPI, etc.
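Even without leaving Python, the same idea applies: the outer loop is embarrassingly parallel, so it can be split across CPU cores with the standard multiprocessing module. A rough sketch (the names allSimilarityParallel, _row, ITEMS, and metric, the process count, and the chunksize are my own choices; it assumes a fork-based start method so workers inherit the globals, and a module-level worker function):

from multiprocessing import Pool

def _row(item1):
    # Similarities of item1 against every other item.
    # ITEMS and metric are module globals inherited by forked workers;
    # with a spawn start method (e.g. Windows) you would need to pass them explicitly.
    return item1, {item2: metric(ITEMS, item1, item2)
                   for item2 in ITEMS if item2 != item1}

def allSimilarityParallel(itemsDict, similarityMetric, processes=8):
    global ITEMS, metric
    ITEMS, metric = itemsDict, similarityMetric  # set before the pool forks below
    with Pool(processes) as pool:
        # chunksize keeps inter-process overhead manageable for ~300,000 small tasks
        return dict(pool.map(_row, ITEMS, chunksize=1000))

Even with perfect scaling this only divides the ~18 days by the number of cores, so it is best combined with a cheaper metric or a compiled inner loop.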
Also, rewriting the same code in C++ will surely speed things up.