How to use WmdSimilarity function provided in gensim along with word embeddings which are in numpy.ndarray data type

Question

568 views. Asked by s.bhardwaj on 12 July 2018 at 17:16

Using a Word2vec (skip-gram) model in TensorFlow, I wrote code to obtain word embeddings from a document set. The final embeddings are in numpy.ndarray format.

Now, to find similar documents, I need to use the WMD (Word Mover's Distance) algorithm.

(I don't have much knowledge of gensim.) gensim.similarities.WmdSimilarity() seems to require the embeddings to be in the KeyedVectors data type. What can I do to use WMD with my embeddings? I have a tight deadline and can't spend much time writing WMD from scratch.

There is 1 answer

**aneesh joshi** · Answer 1 · 2018-07-15T18:03:37+00:00

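If you're looking for the words most similar to a given word, use

my_gensim_word2vec_model.wv.most_similar('king')

Here my_gensim_word2vec_model is a gensim Word2Vec model, of course, not your own TensorFlow model. (Older gensim versions also accepted most_similar directly on the model; recent versions expose it only on the model's .wv attribute.)

If you want the words most similar to a bunch of words:

my_gensim_word2vec_model.wv.most_similar(positive=['king', 'queen', 'rabbit'])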

Check the gensim docs

If you're looking for similarity between sentences or documents, you're better off using doc2vec, which gives a vector for every vocabulary word and every document.

Alternatively, average the vectors of all the words in the sentence or document to get a single vector for it. Then compute the cosine similarity between the averaged vectors of the two texts being compared.

For example:

Similarity("Hello World", "Hi there") = CosineSimilarity(vec1, vec2)
"Hello World" -> (Vec("Hello") + Vec("World"))/2 -> vec1
"Hi there" -> (Vec("Hi") + Vec("there"))/2 -> vec2
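The averaging scheme above can be sketched directly on a numpy.ndarray, without gensim. This is only a minimal illustration: the `vocab` mapping, the toy 3-dimensional `embeddings` array, and the helper names `document_vector` and `cosine_similarity` are all assumptions made for the example, standing in for whatever the TensorFlow model actually produced.

```python
import numpy as np

# Hypothetical toy data: row i of `embeddings` is the vector of the
# word whose index in `vocab` is i. Replace with the real model output.
vocab = {"hello": 0, "world": 1, "hi": 2, "there": 3}
embeddings = np.array([
    [1.0, 0.0, 0.0],   # hello
    [0.0, 1.0, 0.0],   # world
    [0.8, 0.2, 0.0],   # hi
    [0.0, 0.6, 0.8],   # there
])

def document_vector(text, vocab, embeddings):
    """Average the vectors of the in-vocabulary tokens of `text`."""
    rows = [vocab[t] for t in text.lower().split() if t in vocab]
    if not rows:
        raise ValueError("no token of the text is in the vocabulary")
    return embeddings[rows].mean(axis=0)

def cosine_similarity(a, b):
    """Cosine of the angle between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

vec1 = document_vector("Hello World", vocab, embeddings)
vec2 = document_vector("Hi there", vocab, embeddings)
similarity = cosine_similarity(vec1, vec2)
print(similarity)
```

Out-of-vocabulary tokens are simply skipped here; whether that is acceptable depends on the task.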

(Your question is unclear. What is the document set? What is your task?) Hope this helps.
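On the WMD part of the question itself: one possible route is to write the ndarray out in the plain-text word2vec format and load that file with gensim's `KeyedVectors.load_word2vec_format`, which gives an object that `WmdSimilarity` accepts. The sketch below assumes you have a list `words` whose order matches the rows of the embedding array; the helper names `save_word2vec_text` and `build_wmd_index`, the toy vectors, and the file name are hypothetical. WMD in gensim also needs an additional optimal-transport package (pyemd or POT, depending on the gensim version), so that part is left inside a function and is not executed here.

```python
import os
import tempfile

import numpy as np

def save_word2vec_text(words, vectors, path):
    """Write words and vectors in the plain-text word2vec format.

    First line: "<vocab_size> <vector_size>", then one line per word:
    "<word> <v1> <v2> ...". Words must not contain whitespace.
    """
    vectors = np.asarray(vectors, dtype=np.float32)
    if vectors.ndim != 2 or len(words) != vectors.shape[0]:
        raise ValueError("need exactly one embedding row per word")
    with open(path, "w", encoding="utf-8") as f:
        f.write(f"{vectors.shape[0]} {vectors.shape[1]}\n")
        for word, vec in zip(words, vectors):
            f.write(word + " " + " ".join(f"{x:.6f}" for x in vec) + "\n")

def build_wmd_index(path, tokenized_docs, num_best=5):
    """Load the saved vectors with gensim and index the documents for WMD."""
    # Imported here so the writer above works even without gensim installed.
    from gensim.models import KeyedVectors
    from gensim.similarities import WmdSimilarity

    kv = KeyedVectors.load_word2vec_format(path, binary=False)
    return WmdSimilarity(tokenized_docs, kv, num_best=num_best)

# Hypothetical stand-ins for the TensorFlow vocabulary and embeddings.
words = ["king", "queen", "rabbit"]
vectors = np.array([[0.1, 0.2], [0.1, 0.3], [0.9, 0.0]])
path = os.path.join(tempfile.gettempdir(), "embeddings.txt")
save_word2vec_text(words, vectors, path)
```

With gensim and its WMD dependency installed, something like `index = build_wmd_index(path, [["king", "queen"], ["rabbit"]])` followed by `index[["queen"]]` should return the indexed documents closest to the query, where both the corpus and the query are lists of tokens.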
