Convert ngrams count files into ARPA format

I want to convert all my n-gram count files into a single ARPA file. It will be used as a language model for speech recognition.

I have separate n-gram count files: 2-grams, 3-grams and 4-grams. Taking the 2-gram file as an example:

  two grams -- frequency
  similar degree     32
  Writing writes     1
  towars their       3
  country feature    1
  like gold          446
  like golf          64

How can I achieve this?

Answer by Nikolay Shmyrev:

In the SRILM package, the command to convert counts to ARPA format is:

  ngram-count -read file.counts -lm file.lm

For this you only need the count file of the highest order (here, the 4-grams). The 2-gram and 3-gram files are not needed, because lower-order counts are recomputed from the higher-order counts.
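To make the idea of recomputing lower-order counts concrete, here is a minimal sketch in shell and awk. It sums the counts of all N-grams that share the same prefix (every word except the last) to get (N-1)-gram counts. The helper name sum_prefix_counts is hypothetical and not part of SRILM, and the sketch assumes one N-gram per line with the count as the last whitespace-separated field. It only illustrates the principle and is not SRILM's actual implementation.

```shell
# Illustration only: derive (N-1)-gram counts from N-gram counts by summing
# the counts of all N-grams that share the same prefix.
# sum_prefix_counts is a hypothetical helper, not an SRILM tool.
# Input (stdin): one N-gram per line, words first, count as the last field.
sum_prefix_counts() {
  awk '
    NF >= 3 {
      # Build the prefix from every word except the last one.
      prefix = $1
      for (i = 2; i < NF - 1; i++) prefix = prefix " " $i
      total[prefix] += $NF
    }
    END { for (p in total) print p "\t" total[p] }
  ' | LC_ALL=C sort
}

# Example with three of the 2-gram lines from the question.
printf '%s\n' 'similar degree 32' 'like gold 446' 'like golf 64' | sum_prefix_counts
```

With these three lines, "like" gets 446 + 64 = 510 and "similar" gets 32. Separately, note that ngram-count defaults to order 3, so building a model from 4-gram counts would also need `-order 4` added to the command shown above.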

Detailed documentation for ngram-count is available in the SRILM manual pages.