GIZA++ - How is alignment score calculated?

This might be more of a math problem, but I couldn't find any relevant document elsewhere.

I just want to figure out which equation is used to calculate alignment score in GIZA++.

Might anyone have an idea?

Thank you for your help in advance.

There are 2 answers

Roger Rowland (best answer)

If it helps, I found this document, which includes the following description:

Implements full IBM-4 alignment model with a dependency of word classes as described in (Brown et al. 1993)

Following up that reference leads to a paper entitled "The Mathematics of Statistical Machine Translation: Parameter Estimation", which you can find in PDF format here.

The paper gives details of the math underlying the 5 alignment models and is too verbose to paste here. Perhaps you can see if this is sufficiently detailed in its description of Model 4, which is what I assume is used by GIZA++.
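For orientation, here is the sentence-level probability for Model 1, the simplest of the five models in that paper. It is only a starting point and not necessarily the exact quantity GIZA++ reports; Models 3 and 4 add fertility and distortion terms on top of the lexical translation probabilities. The notation follows Brown et al.: f = f_1 ... f_m is the French sentence, e = e_1 ... e_l the English sentence, a_j is the English position that f_j is aligned to (0 meaning the empty word), and epsilon is a small constant for the length probability.

```latex
\Pr(\mathbf{f}, \mathbf{a} \mid \mathbf{e})
  = \frac{\epsilon}{(l+1)^{m}} \prod_{j=1}^{m} t\!\left(f_j \mid e_{a_j}\right),
\qquad a_j \in \{0, 1, \dots, l\}
```

Under this model the best alignment picks, for each f_j independently, the position a_j that maximises t(f_j | e_{a_j}), and its score is the product above.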

There is also this PDF, which summarises the models and training process.

Jokester

In short, word alignments and translation probabilities are learned over multiple iterations of the Expectation-Maximization (EM) algorithm.
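To make that concrete, here is a minimal sketch of EM training for IBM Model 1 only, followed by a best-alignment score under that model. It is not GIZA++'s code, and GIZA++ goes on to train more complex models; the function names `train_ibm1` and `viterbi_alignment`, the `<NULL>` token, the `epsilon` parameter, and the toy corpus are all assumptions made for illustration.

```python
from collections import defaultdict

NULL = "<NULL>"  # hypothetical token standing in for the empty word e_0


def train_ibm1(pairs, iterations=20):
    """EM for IBM Model 1. pairs: list of (f_tokens, e_tokens).
    Returns a dict-like table with t[(f, e)] = t(f | e)."""
    f_vocab = {f for fs, _ in pairs for f in fs}
    # Start from a uniform distribution over the f vocabulary.
    t = defaultdict(lambda: 1.0 / len(f_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) links
        total = defaultdict(float)  # expected count of links from e
        # E-step: distribute each f word over the e words of its sentence.
        for fs, es in pairs:
            es_null = [NULL] + es
            for f in fs:
                z = sum(t[(f, e)] for e in es_null)
                for e in es_null:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalise the expected counts into probabilities.
        t = defaultdict(float, {(f, e): c / total[e]
                                for (f, e), c in count.items()})
    return t


def viterbi_alignment(fs, es, t, epsilon=1.0):
    """Best Model 1 alignment and its probability P(f, a | e).
    Position 0 refers to the NULL word."""
    es_null = [NULL] + es
    score = epsilon / (len(es_null) ** len(fs))
    alignment = []
    for f in fs:
        best = max(range(len(es_null)), key=lambda i: t[(f, es_null[i])])
        alignment.append(best)
        score *= t[(f, es_null[best])]
    return alignment, score


if __name__ == "__main__":
    corpus = [
        ("das Haus".split(), "the house".split()),
        ("das Buch".split(), "the book".split()),
        ("ein Buch".split(), "a book".split()),
    ]
    table = train_ibm1(corpus)
    print(viterbi_alignment("das Haus".split(), "the house".split(), table))
```

The per-sentence score here is simply the Model 1 probability of the best alignment. Under the later models the same idea applies, but the product includes additional terms, so the numbers will differ.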

Philipp Koehn's book "Statistical Machine Translation" has a chapter on word alignment. Check statmt.org for more information.