OpenNMT translate command yields garbage results


I am running the following command

onmt_translate  -model demo-model_step_100000.pt -src data/src-test.txt -output pred.txt -replace_unk -verbose

The output in the file 'pred.txt' has nothing to do with the source sentences given for translation.

The corpus size was 3000 parallel sentences. The preprocessing command was:

onmt_preprocess -train_src EMMT/01engParallel_onmt.txt -train_tgt EMMT/01maiParallel_onmt.txt -valid_src EMMT/01engValidation_onmt.txt -valid_tgt EMMT/01maiValidation_onmt.txt -save_data EMMT/demo

Training was done with the demo model settings:

onmt_train -data EMMT/demo -save_model demo-model

1 Answer

Answered by Wiktor Stribiżew:

You cannot get decent translations even on "seen" data (see the sanity-check sketch after this list) because:

  • Your model was trained on too few sentence pairs (3000 is far too few to train a good model). You only start getting more or less meaningful translations with corpora of 4M+ sentence pairs (and the more the better).
  • onmt_train -data EMMT/demo -save_model demo-model trains a small (2 layers × 500 units) unidirectional RNN model (see the documentation). The transformer model type is recommended to obtain state-of-the-art results.
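
As a quick sanity check, a sketch that reuses only the flags already shown in the question (pred-train.txt is just a placeholder output name): translate the training source itself. If even these "seen" sentences come out as garbage, the problem is the model and training setup, not the test data:

onmt_translate -model demo-model_step_100000.pt -src EMMT/01engParallel_onmt.txt -output pred-train.txt -replace_unk -verbose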

The FAQ says this about how to train a transformer model:

The transformer model is very sensitive to hyperparameters. To run it effectively you need to set a bunch of different options that mimic the Google setup. We have confirmed the following command can replicate their WMT results.

python  train.py -data /tmp/de2/data -save_model /tmp/extra \
        -layers 6 -rnn_size 512 -word_vec_size 512 -transformer_ff 2048 -heads 8  \
        -encoder_type transformer -decoder_type transformer -position_encoding \
        -train_steps 200000  -max_generator_batches 2 -dropout 0.1 \
        -batch_size 4096 -batch_type tokens -normalization tokens  -accum_count 2 \
        -optim adam -adam_beta2 0.998 -decay_method noam -warmup_steps 8000 -learning_rate 2 \
        -max_grad_norm 0 -param_init 0  -param_init_glorot \
        -label_smoothing 0.1 -valid_steps 10000 -save_checkpoint_steps 10000 \
        -world_size 4 -gpu_ranks 0 1 2 3

Here is what each of the parameters means:

param_init_glorot -param_init 0: correct initialization of parameters

position_encoding: add sinusoidal position encoding to each embedding

optim adam, decay_method noam, warmup_steps 8000: use a special learning rate schedule.

batch_type tokens, normalization tokens, accum_count 2: batch and normalize based on the number of tokens rather than sentences. Accumulate gradients over two batches.

label_smoothing 0.1: use label smoothing loss.
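
For the EMMT data from the question, a minimal adaptation of that command might look like the sketch below. It assumes a single GPU (hence -world_size 1 -gpu_ranks 0), the -save_model name demo-transformer is just a placeholder, and all other flags are copied from the FAQ command above; note that even a transformer cannot compensate for a 3000-sentence corpus:

onmt_train -data EMMT/demo -save_model demo-transformer \
        -layers 6 -rnn_size 512 -word_vec_size 512 -transformer_ff 2048 -heads 8 \
        -encoder_type transformer -decoder_type transformer -position_encoding \
        -train_steps 200000 -max_generator_batches 2 -dropout 0.1 \
        -batch_size 4096 -batch_type tokens -normalization tokens -accum_count 2 \
        -optim adam -adam_beta2 0.998 -decay_method noam -warmup_steps 8000 -learning_rate 2 \
        -max_grad_norm 0 -param_init 0 -param_init_glorot \
        -label_smoothing 0.1 -valid_steps 10000 -save_checkpoint_steps 10000 \
        -world_size 1 -gpu_ranks 0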