How to assign more weight to bigram and trigram?


I have to match the titles of two research papers using n-grams (unigrams, bigrams, and trigrams only). My supervisor has asked me to give matched bigrams more weight than matched unigrams, and matched trigrams more weight than matched bigrams. For example, if two bigrams match between the titles, the score is 2, and if two trigrams match, the score is also 2. I need to find some values to multiply these scores by, so that the trigram score is increased and the bigram score is decreased relative to it. I looked for research papers on this problem but couldn't find any help there. :(

Can anyone give me some ideas, or a link to a document that may solve the issue?


There is 1 answer

Alikbar On

In interpolation, we always mix the probability estimates from all the n-gram estimators, weighting and combining the trigram, bigram, and unigram counts. In simple linear interpolation, we combine n-grams of different orders by linearly interpolating all the models. Thus, we estimate the trigram probability P(w_n | w_{n-2} w_{n-1}) by mixing together the unigram, bigram, and trigram probabilities, each weighted by a λ:

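$$\hat{P}(w_n \mid w_{n-2} w_{n-1}) = \lambda_1 P(w_n \mid w_{n-2} w_{n-1}) + \lambda_2 P(w_n \mid w_{n-1}) + \lambda_3 P(w_n)$$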

such that the λs sum to 1:

$$\sum_i \lambda_i = 1$$
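The same weighting idea can be carried over to title matching: count the matched n-grams of each order, then combine the counts with weights that sum to 1 and grow with the order. The following is only a minimal sketch under assumptions, not a method taken from a paper. The function names `ngrams` and `weighted_match_score` are hypothetical, the weights (0.2, 0.3, 0.5) are illustrative values chosen by hand, and tokenization is a plain lowercase whitespace split. Note that it weights raw match counts rather than probabilities, which is a simplification of the interpolation described above.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def weighted_match_score(title_a, title_b, weights=(0.2, 0.3, 0.5)):
    """Weighted sum of matched unigram, bigram and trigram counts.

    `weights` holds the multipliers for orders 1, 2 and 3.
    The defaults are illustrative assumptions, not tuned values.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")

    # Assumption: lowercase whitespace tokenization is good enough for titles.
    tokens_a = title_a.lower().split()
    tokens_b = title_b.lower().split()

    score = 0.0
    for n, weight in zip((1, 2, 3), weights):
        counts_a = Counter(ngrams(tokens_a, n))
        counts_b = Counter(ngrams(tokens_b, n))
        # Multiset intersection: each n-gram counts as often as it
        # appears in both titles.
        matched = sum((counts_a & counts_b).values())
        score += weight * matched
    return score


if __name__ == "__main__":
    a = "Deep learning for text classification"
    b = "Deep learning for image classification"
    # 4 unigrams, 2 bigrams and 1 trigram match between these titles.
    print(weighted_match_score(a, b))
```

With these weights, two matched trigrams contribute 2 × 0.5 = 1.0 while two matched bigrams contribute only 2 × 0.3 = 0.6, which is the ordering asked for in the question. In practice the weights would normally be tuned on held-out data (for example, pairs of titles already labelled as matching or not) rather than fixed by hand.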