Bug report: nltk.translate.bleu_score returns a near-zero score for sentences of 3 or fewer tokens

Platform: Windows 11, Anaconda

import nltk
nltk.__version__
'3.8.1'

sentence_bleu ought to return 1 for an identical translation, but for a three-token sentence it returns a value that is effectively zero:

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
sentence_bleu([["hi", "hello", "world"]], ["hi", "hello", "world"])
1.2213386697554703e-77

Even with a smoothing function, the score is still far from 1:

smoother = SmoothingFunction()
sentence_bleu([["hi", "hello", "world"]], ["hi", "hello", "world"], smoothing_function=smoother.method4)
0.5757197301274735

However, with four tokens the score is 1.0 as expected:

sentence_bleu([["hi", "hello", "world", "how"]], ["hi", "hello", "world", "how"])
1.0

This appears to be a bug in edge-case handling or in the summation index.
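
One way to test that hypothesis is to recompute the score by hand. The snippet below is a minimal pure-Python sketch of BLEU as a uniformly weighted geometric mean of 1-gram to max_n-gram precisions. It is not NLTK's code: the names simple_bleu, modified_precision and ngrams are made up for illustration, the brevity penalty is left out (it is 1 when both sentences have the same length), and replacing a zero precision with the smallest positive float is an assumption chosen because it reproduces the magnitude of the number shown above. Under those assumptions, the tiny score comes from the 4-gram term: a three-token sentence contains no 4-grams at all, so that precision is zero regardless of how well the sentences match.

```python
import math
import sys
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list (empty if len(tokens) < n)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(reference, hypothesis, n):
    """Clipped n-gram precision of the hypothesis against a single reference."""
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    # Avoid division by zero when the hypothesis has no n-grams of this order.
    total = max(1, sum(hyp_counts.values()))
    return clipped / total


def simple_bleu(reference, hypothesis, max_n=4):
    """Uniformly weighted geometric mean of 1..max_n precisions (no brevity penalty)."""
    weights = [1.0 / max_n] * max_n
    precisions = [modified_precision(reference, hypothesis, n) for n in range(1, max_n + 1)]
    # Assumption: a zero precision is replaced by the smallest positive float
    # so that log() is defined; this drags the whole score towards zero.
    precisions = [p if p > 0 else sys.float_info.min for p in precisions]
    return math.exp(math.fsum(w * math.log(p) for w, p in zip(weights, precisions)))


three = ["hi", "hello", "world"]
four = ["hi", "hello", "world", "how"]

print(len(ngrams(three, 4)))              # 0: a 3-token sentence has no 4-grams
print(simple_bleu(three, three, max_n=4)) # on the order of 1e-77
print(simple_bleu(three, three, max_n=3)) # 1.0
print(simple_bleu(four, four, max_n=4))   # 1.0
```

If this reading is right, the behaviour is a consequence of the default 4-gram weights rather than a summation bug, and passing shorter weights to sentence_bleu, for example weights=(1/3, 1/3, 1/3), should give 1.0 for the three-token case.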
