While learning fairseq, I followed the tutorial on the website and implemented the model it describes:
https://fairseq.readthedocs.io/en/latest/tutorial_simple_lstm.html#training-the-model
However, after following all the steps, when I try to train the model using the following:
    ! fairseq-train data-bin/iwslt14.tokenized.de-en \
        --arch tutorial_simple_lstm \
        --encoder-dropout 0.2 --decoder-dropout 0.2 \
        --optimizer adam --lr 0.005 --lr-shrink 0.5 \
        --max-tokens 12000
I receive an error:
`fairseq-train: error: argument --arch/-a: invalid choice: 'tutorial_simple_lstm' (choose from 'fconv', 'fconv_iwslt_de_en', 'fconv_wmt_en_ro', 'fconv_wmt_en_de', 'fconv_wmt_en_fr', 'fconv_lm', 'fconv_lm_dauphin_wikitext103', 'fconv_lm_dauphin_gbw', 'transformer', 'transformer_iwslt_de_en', 'transformer_wmt_en_de', 'transformer_vaswani_wmt_en_de_big', 'transformer_vaswani_wmt_en_fr_big', 'transformer_wmt_en_de_big', 'transformer_wmt_en_de_big_t2t', 'bart_large', 'bart_base', 'mbart_large', 'mbart_base', 'mbart_base_wmt20', 'nonautoregressive_transformer', 'nonautoregressive_transformer_wmt_en_de', 'nacrf_transformer', 'iterative_nonautoregressive_transformer', 'iterative_nonautoregressive_transformer_wmt_en_de', 'cmlm_transformer', 'cmlm_transformer_wmt_en_de', 'levenshtein_transformer', 'levenshtein_transformer_wmt_en_de', 'levenshtein_transformer_vaswani_wmt_en_de_big',....`
Some additional info: I am using Google Colab. I wrote all the code up to the training step into a .py file and uploaded it to the fairseq/models/... path, as per my interpretation of the instructions. I am following the tutorial in the link exactly. Before running it on Colab, I install fairseq using:
    !git clone https://github.com/pytorch/fairseq
    %cd fairseq
    !pip install --editable ./
I think this error happens because the architecture name that the tutorial adds as a command-line choice has not been set up properly.
Can anyone please explain whether I need to do something different at any step?
I would be grateful for your input; for a beginner, such help from the community goes a long way.
It seems you didn't register the SimpleLSTMModel architecture. In the tutorial this is done with the `@register_model` and `@register_model_architecture` decorators. Once the model is registered, you can use it with the existing command-line tools.
Please note that copying a .py file does not by itself register the model. The registration only happens when the .py file containing those decorators is actually executed (for example, when it is imported). After that, `tutorial_simple_lstm` becomes a valid choice for `--arch`, and you can run training with the existing command-line tools.
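To illustrate why the file has to be executed, here is a minimal, self-contained sketch of a decorator-based registry. It is a toy model of the idea, not fairseq's actual implementation: `ARCH_REGISTRY`, `register_architecture`, and `parse_arch` are hypothetical names made up for this example. It only shows that a name becomes a valid choice once the decorated definition has run.

```python
# Toy registry: a simplified illustration, NOT fairseq's real code.
ARCH_REGISTRY = {}  # hypothetical: maps architecture name -> config function


def register_architecture(name):
    """Decorator that stores the function under `name` when the def is executed."""
    def wrapper(fn):
        ARCH_REGISTRY[name] = fn
        return fn
    return wrapper


def parse_arch(name):
    """Mimic a command-line `choices` check against the registry."""
    if name not in ARCH_REGISTRY:
        raise ValueError(
            f"invalid choice: {name!r} (choose from {sorted(ARCH_REGISTRY)})"
        )
    return ARCH_REGISTRY[name]


# Before the definition below has been executed, the name is unknown.
try:
    parse_arch("tutorial_simple_lstm")
    known_before = True
except ValueError:
    known_before = False


# Executing this definition (e.g. by importing the file that contains it)
# runs the decorator and adds the name to the registry.
@register_architecture("tutorial_simple_lstm")
def tutorial_simple_lstm(args):
    args.setdefault("encoder_hidden_dim", 256)  # illustrative default only
    return args


known_after = parse_arch("tutorial_simple_lstm") is tutorial_simple_lstm
print(known_before, known_after)
```

The same reasoning applies to the question: a file that merely sits in a directory never runs its decorators, so the command-line parser never sees the new architecture name.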