How to implement a supervised class-based language model in SRILM?

Question

Asked by Ranjeet Singh on 08 May 2018 at 12:25 · 692 views

I found tutorials where a class-based LM is built with Brown clustering, where you only pass the number of classes you want. I want to build a class-based model where I supply the class assignments myself. I tried the approach described at http://projects.csail.mit.edu/cgi-bin/wiki/view/SLS/SriLM, but it gives -99 for all n-grams in the LM. There is very little documentation on this. Can anyone help me out?

1 answer

**Aaron** · Accepted Answer · 2018-05-08T21:05:38+00:00

I've done this before but it was several years ago. Let me see if I can retrace the steps for you.

The first step is to create the file that defines the classes. It should have three columns: the class ID, then the probability of the word given the class, and lastly the word itself.
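As a rough illustration of that layout, here is a toy class file written from the shell. The class names, words, and probabilities are made up for the example; only the column order (class, probability, word) comes from the description above. The awk line is just a sanity check that the probabilities within each class add up to 1.

```shell
# Toy class definition file: <class> <P(word | class)> <word>
# (all names and numbers here are hypothetical)
cat > class_definition_file.txt <<'EOF'
CITY 0.5 boston
CITY 0.3 denver
CITY 0.2 seattle
DAY 0.6 monday
DAY 0.4 friday
EOF

# Sanity check: sum the probabilities per class and print one line per class.
awk '{ sum[$1] += $2 }
     END { for (c in sum) printf "%s %.2f\n", c, sum[c] }' \
    class_definition_file.txt | sort
```

If a class does not sum to 1, the word probabilities were probably estimated or typed in incorrectly.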

The next step is to replace all the words in the training data with their class IDs. You can use SRILM's replace-words-with-classes script, or you can write your own script to do it.
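If you go the write-your-own route, a minimal sketch might look like this. It assumes every class member is a single token and every word belongs to at most one class; the file names and sentences are placeholders. It is not meant to reproduce what SRILM's own script does internally.

```shell
# Toy inputs (hypothetical names and contents).
cat > class_definition_file.txt <<'EOF'
CITY 0.5 boston
CITY 0.3 denver
CITY 0.2 seattle
DAY 0.6 monday
DAY 0.4 friday
EOF

cat > train.txt <<'EOF'
i fly to boston on monday
she drives to denver
EOF

# Pass 1 (class file): remember word -> class from columns 3 and 1.
# Pass 2 (training text): swap every known word for its class ID.
awk 'NR == FNR { class_of[$3] = $1; next }
     { for (i = 1; i <= NF; i++) if ($i in class_of) $i = class_of[$i]; print }' \
    class_definition_file.txt train.txt > train.classes.txt

cat train.classes.txt
```

Words that are not in the class file are left as they are, so the resulting model can mix class tokens and ordinary words.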

Now train a language model on the class-replaced data using ngram-count, just like you would for a regular non-class n-gram model.
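A sketch of the training call, assuming SRILM's bin directory is on your PATH. The file names are placeholders, and the two-line corpus is only there to keep the snippet self-contained; point -text at your real class-replaced training data. Smoothing is left at the ngram-count default here, which is an assumption on my part, so pick whatever discounting you would use for a word-based model.

```shell
# Placeholder class-replaced training text (use your real data instead).
cat > train.classes.txt <<'EOF'
i fly to CITY on DAY
she drives to CITY
EOF

# Train a trigram model over the class-replaced text.
# -order, -text and -lm are standard ngram-count options.
if command -v ngram-count >/dev/null 2>&1; then
    ngram-count -order 3 -text train.classes.txt -lm class.lm
else
    echo "ngram-count not found; add SRILM's bin directory to PATH" >&2
fi
```

The resulting class.lm is an ordinary ARPA-format model whose vocabulary contains the class IDs in place of the words they cover.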

For evaluation, you specify both the language model and the class file:

    ngram -ppl test_data.txt -lm class.lm -classes class_definition_file.txt
