Kenlm lmplz on Google Colab


I used KenLM to train a language model on Google Colab. This is what I have in the bin folder:

%cd /content/drive/My Drive/kenlm/build/bin
!ls

/content/drive/My Drive/kenlm/build/bin
 build_binary     'lm (1).en.arpa'   phrase_table_vocab         tst2012.en
 count_ngrams      lm_data       probing_hash_table_benchmark   tst2012.vi
 filter        lm_data.zip       query              tst2013.en
 fragment      lm.en.arpa        train.en               tst2013.vi
 kenlm_benchmark   lmplz         train.vi

I am in the bin folder, and I also put my "train.*" files there, but when I run

!lmplz -o 3 <train.en >lm.en.arpa

Colab replied:

/bin/bash: lmplz: command not found

How can I run it?

There is 1 answer

JonnyJack

I know it is too late to answer, but this may help anyone who comes here later.

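According to KenLM's documentation, the author only shows the commands being run from outside the bin directory, with the path to the binary spelled out (for example bin/lmplz rather than plain lmplz). Bash does not search the current directory for commands unless it is on PATH, which is why a bare lmplz gives "command not found" even when you are inside bin. You can follow my script here.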
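For anyone who wants something concrete, here is a minimal sketch of calling the binary by an explicit path. The helper name run_lmplz and its argument order are my own invention for illustration, not part of KenLM; the only KenLM-specific part is the `lmplz -o N <text >arpa` invocation already used in the question. It assumes bash (for example a `%%bash` cell in Colab).

```shell
# Hypothetical helper (not part of KenLM): run lmplz via an explicit path.
#   $1 = directory that contains the lmplz binary
#   $2 = n-gram order
#   $3 = input text file
#   $4 = output ARPA file
run_lmplz() {
    lmplz_path="$1/lmplz"
    # Fail early with a readable message instead of "command not found".
    if [ ! -x "$lmplz_path" ]; then
        echo "lmplz not found or not executable at: $lmplz_path" >&2
        return 1
    fi
    # Quote every path: the Drive folder in the question contains a space.
    "$lmplz_path" -o "$2" < "$3" > "$4"
}
```

With the paths from the question (an assumption, adjust to your own build), the call would be `run_lmplz "/content/drive/My Drive/kenlm/build/bin" 3 train.en lm.en.arpa`. If you are already inside bin, `!./lmplz -o 3 <train.en >lm.en.arpa` should do the same thing. If the helper reports that the file is not executable, `chmod +x` on the binary may be needed.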

Note: If anyone struggles to compile KenLM locally, remember to install all dependencies (as listed here) before running cmake.