How can I speed up the mkcls step in MGIZA++ or GIZA++? Word clustering is taking a very long time.

Asked by BranY on 22 December 2016 at 01:46 (249 views)

I am using MGIZA++ to align words in the bitexts from the United Nations Parallel Corpus.

Before training the alignment model with MGIZA++, I need to run mkcls to build the word classes that the Hidden Markov Model algorithm requires, like this:

mkcls -c50 -n10 -ptest.en -Vtest.en.vcb.classes
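For reference, here is a rough sketch of how that command breaks down. It assumes the commonly documented mkcls option meanings: -c is the number of classes, -n the number of optimization runs, -p the input corpus, and -V the output classes file. The helper name build_mkcls_cmd is hypothetical and only prints the command line; it does not run mkcls. Since the work done should grow with the number of runs, a lower -n (2 here, purely as an illustrative value) is one thing that could be tried, at some possible cost in cluster quality.

```shell
#!/bin/sh
# Hypothetical helper: assemble an mkcls command line from its parts.
# Assumed flag meanings (not verified against a specific mkcls build):
#   -c  number of word classes
#   -n  number of optimization runs (fewer runs should mean less time)
#   -p  input corpus, one sentence per line
#   -V  output file for the word classes
build_mkcls_cmd() {
    classes=$1
    runs=$2
    corpus=$3
    printf 'mkcls -c%s -n%s -p%s -V%s.vcb.classes\n' \
        "$classes" "$runs" "$corpus" "$corpus"
}

# The command from the question (10 runs) next to a cheaper variant (2 runs).
build_mkcls_cmd 50 10 test.en
build_mkcls_cmd 50 2 test.en
```

Timing the printed commands on a small sample of the corpus first would show whether the saving is worth it before committing to the full 1,000,000 lines.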

I'm trying it on a corpus with 1,000,000 lines, but it takes a very long time and I still haven't gotten a result (on a small dataset it works).

Is there a multi-threaded or parallel toolkit that can do the mkcls step?

There are 0 answers
