I am trying to get the most probable sequence of words using a gensim word2vec model. I found a pretrained model that provides these files:
word2vec.bin
word2vec.bin.syn0.npy
word2vec.bin.syn1neg.npy
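This is the code I use to try to get the probability of a sentence with this model:
from gensim.models import Word2Vec

model = Word2Vec.load(word_embedding_model_path)
model.hs = 1
model.negative = 0
# score() expects a list of tokenized sentences, not a single token list
print(model.score([sentence.split(" ")]))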
While running this code I am getting this error:
AttributeError: 'Word2Vec' object has no attribute 'syn1'
Can anyone help me figure out how to solve this problem? In general, I want to use a pretrained model to get the probability of a sequence of words appearing together.
You can't toggle a model from using negative sampling (e.g. negative=5, hs=0) to using hierarchical softmax (e.g. hs=1, negative=0) after initial setup and training. The two modes use different internal properties, which are only created by setup and training. (For example, the property syn1 only exists in a model that was created and trained in hierarchical-softmax mode.)
Since the score() method is currently only functional for HS models, you can only use it with models that were trained in that mode.
(Note also that a value from score() of a single text, against a single model, isn't interpretable as an absolute probability. It's only in comparison against the scores of other texts against the same model, or the same text against alternate models, that the relative value of the score becomes meaningful.)
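As a rough illustration of the points above, here is a minimal sketch that ranks candidate sentences by their relative scores. The helper names (rank_candidates, make_gensim_hs_scorer, make_toy_unigram_scorer) and the tiny corpus are made up for this example. The gensim wrapper assumes that Word2Vec accepts hs=1, negative=0 at construction time and that score() takes a list of tokenized sentences plus a total_sentences argument, which is how I understand the API; check it against your installed version. The demo itself runs on a toy unigram scorer, which is only a stand-in and not how gensim computes scores.

```python
import math
from collections import Counter


def rank_candidates(candidates, score_fn):
    """Sort tokenized candidates from highest to lowest score."""
    return sorted(candidates, key=score_fn, reverse=True)


def make_gensim_hs_scorer(corpus):
    """Train a hierarchical-softmax model and wrap its score() method.

    Assumes gensim is installed; it is imported lazily so the rest of
    this sketch runs without it.
    """
    from gensim.models import Word2Vec

    # hs=1, negative=0 has to be chosen here, before training,
    # not toggled on an already trained negative-sampling model.
    model = Word2Vec(corpus, hs=1, negative=0, min_count=1)

    def score_fn(tokens):
        # score() takes a list of tokenized sentences, one score per sentence.
        return float(model.score([tokens], total_sentences=1)[0])

    return score_fn


def make_toy_unigram_scorer(corpus):
    """Hypothetical stand-in: add-one-smoothed unigram log-likelihood."""
    counts = Counter(tok for sent in corpus for tok in sent)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot for unseen tokens

    def score_fn(tokens):
        return sum(
            math.log((counts[t] + 1) / (total + vocab_size)) for t in tokens
        )

    return score_fn


corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
]
candidates = [["the", "cat", "sat"], ["zebra", "quantum", "sat"]]

score_fn = make_toy_unigram_scorer(corpus)  # or make_gensim_hs_scorer(corpus)
ranked = rank_candidates(candidates, score_fn)
print(ranked[0])
```

Either scorer returns a number that only means something next to the scores of other candidates under the same model, which is why the sketch ranks candidates instead of reporting a probability.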