I want to convert all my n-gram files into a single ARPA file, which will be used as a language model for speech recognition.
I have separate n-gram files for 2-grams, 3-grams and 4-grams. Taking the 2-grams file as an example:
    two grams -- frequency
    similar degree 32
    Writing writes 1
    towars their 3
    country feature 1
    like gold 446
    like golf 64
How can I achieve this?
In the SRILM package, the command that converts counts to an ARPA file is ngram-count.
When doing that you only need the count file of the maximum order (here the 4-grams). The 2-gram and 3-gram files are not needed, because lower-order counts are recomputed from the higher-order counts.
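As a rough sketch of how this could look in practice: the script below writes out the 2-gram sample from the question, strips its header line so that every remaining line is "words followed by an integer count", and then calls ngram-count with its -order, -read and -lm options. The file names 2grams.txt, counts.txt and model.arpa are placeholders I made up, and the 2-gram sample is used only because it is the one file shown; for the real model you would point RAW at the 4-gram file and set ORDER=4. The model is built only if SRILM is actually installed.

```shell
#!/bin/sh
# Sample 2-gram counts copied from the question, including its header line.
cat > 2grams.txt <<'EOF'
two grams -- frequency
similar degree 32
Writing writes 1
towars their 3
country feature 1
like gold 446
like golf 64
EOF

ORDER=2             # set to 4 when using the real 4-gram file
RAW=2grams.txt      # hypothetical input file name
COUNTS=counts.txt   # hypothetical name for the cleaned counts file
LM=model.arpa       # hypothetical name for the ARPA output

# Keep only lines made of ORDER words plus an integer count.
# This drops the "two grams -- frequency" header.
awk -v n="$ORDER" 'NF == n + 1 && $NF ~ /^[0-9]+$/' "$RAW" > "$COUNTS"

# Build the ARPA model only if SRILM's ngram-count is on the PATH.
if command -v ngram-count >/dev/null 2>&1; then
    ngram-count -order "$ORDER" -read "$COUNTS" -lm "$LM"
else
    echo "ngram-count not found; skipping model build" >&2
fi
```

With very small count files like this sample, the default discounting may print warnings; that is expected for toy data and says little about how the full 4-gram file will behave.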
Detailed documentation is available in the ngram-count manual page that ships with SRILM.