Why does a KenLM model keep returning the same score for different words?

Question

Asked by sourabh gupta on 08 September 2021 at 19:42

Why is the KenLM model returning the same values for different words? I have also tried it with a 4-gram ARPA file and get the same issue.

import kenlm

model = kenlm.Model('lm/test.arpa')  # unigram model

# full_scores yields one (log10 probability, n-gram length, is_oov) tuple per word
print([f'{x[0]:.2f}, {x[1]}, {x[2]}' for x in model.full_scores('this is a sentence', bos=False, eos=False)])
print([f'{x[0]:.2f}, {x[1]}, {x[2]}' for x in model.full_scores('this is a sentence1', bos=False, eos=False)])
print([f'{x[0]:.2f}, {x[1]}, {x[2]}' for x in model.full_scores('this is a devil', bos=False, eos=False)])

Result:

['-2.00, 1, True', '-21.69, 1, False', '-1.59, 1, False', '-2.69, 1, True']

Original Q&A

There is 1 answer

**sourabh gupta** · Answer 1 · 2021-09-09T23:12:24+00:00

Figured it out by myself.

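The True/False value in each output entry tells you whether the word is OOV (out of vocabulary). KenLM maps every OOV word to its unknown-word token (<unk>) and scores it with that token's fixed probability, so in the same context different unknown words receive the same score. In the examples in the question, the last word of each sentence is OOV.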
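To see which words trigger this, it can help to pair each token with its OOV flag. Below is a minimal sketch, not part of the original answer. The helper names `flag_oov` and `oov_words` are hypothetical, and the code assumes the scores are the (log10 probability, n-gram length, is_oov) tuples that `full_scores` yields, one per whitespace-separated word when `eos=False`.

```python
from typing import Iterable, List, Tuple

# One entry as yielded by model.full_scores:
# (log10 probability, matched n-gram length, is_oov)
Score = Tuple[float, int, bool]


def flag_oov(sentence: str, scores: Iterable[Score]) -> List[Tuple[str, float, bool]]:
    """Pair each whitespace-separated word with its log10 score and OOV flag.

    Hypothetical helper. zip() stops at the shorter input, so a trailing
    end-of-sentence score (present when eos=True) is simply ignored.
    """
    words = sentence.split()
    return [(word, logp, oov) for word, (logp, _ngram_len, oov) in zip(words, scores)]


def oov_words(sentence: str, scores: Iterable[Score]) -> List[str]:
    """Return only the words that the model flagged as out of vocabulary."""
    return [word for word, _logp, oov in flag_oov(sentence, scores) if oov]


# Usage with a real model (requires the kenlm package and an ARPA/binary file):
#
#   import kenlm
#   model = kenlm.Model('lm/test.arpa')
#   sentence = 'this is a devil'
#   print(oov_words(sentence, model.full_scores(sentence, bos=False, eos=False)))
```

If several different words show up in the OOV list, their identical scores are expected. A single word can also be checked directly with `'devil' in model`, which tests vocabulary membership.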
