I have the following questions:
- Dataset format (that is, how to split the data into train, test and validation sets)
- Where to place the dataset
- How to change the path for dataset reader
- How to save the model in my own directory
- How to use the trained model
Edit
my_config['dataset_reader']['data_path'] = '/home/ec2-user/SageMaker/squad/data/'
my_config['metadata']['variables']['MODELS_PATH'] = '/home/ec2-user/SageMaker/squad/model/'
I used these two lines to change the dataset path and the model path in the configuration. The model is saved to my location, but training does not use my dataset: instead, it downloads its own dataset into that folder and uses it.
2-3. The dataset should be placed in the folder set in the config at https://github.com/deepmipt/DeepPavlov/blob/f5117cd9ad1e64f6c2d970ecaa42fc09ccb23144/deeppavlov/configs/squad/squad_torch_bert.json#L4 (you can change the folder name).
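On the dataset format and split: below is a minimal sketch for turning one SQuAD-format JSON file into a train part and a dev part. It relies on an assumption that is not stated in this thread, so please check it against the reader source for your version: as far as I recall, the SQuAD dataset reader looks for files named train-v1.1.json and dev-v1.1.json inside data_path and downloads the default dataset when they are missing, which would explain the behaviour described above. The function name split_squad and its parameters are made up for illustration.

```python
import json
import random
from pathlib import Path


def split_squad(source_file, out_dir, dev_fraction=0.1, seed=0):
    """Split a SQuAD-format JSON file into train and dev files, article by article."""
    with open(source_file, encoding="utf-8") as f:
        squad = json.load(f)

    # SQuAD v1.1 layout: {"version": ..., "data": [article, ...]},
    # each article holding "title" and "paragraphs" (context + qas).
    articles = list(squad["data"])
    random.Random(seed).shuffle(articles)

    # Keep at least one article in the dev part.
    n_dev = max(1, int(len(articles) * dev_fraction))
    parts = {"dev": articles[:n_dev], "train": articles[n_dev:]}

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, data in parts.items():
        # ASSUMPTION: these are the file names the reader expects in data_path.
        target = out_dir / f"{name}-v1.1.json"
        with open(target, "w", encoding="utf-8") as f:
            json.dump({"version": squad.get("version", "1.1"), "data": data},
                      f, ensure_ascii=False)

    # Number of articles written to each part.
    return {name: len(data) for name, data in parts.items()}
```

Splitting by article rather than by question keeps all questions about one text in the same part, so the dev score is not inflated by contexts already seen during training.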
The model is saved in the directory set in the config at https://github.com/deepmipt/DeepPavlov/blob/f5117cd9ad1e64f6c2d970ecaa42fc09ccb23144/deeppavlov/configs/squad/squad_torch_bert.json#L166 (you can write your own directory there).
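Both paths can also be overridden from Python instead of editing the JSON by hand. The sketch below works on a small stand-in dictionary that contains only the two keys mentioned in this thread; the placeholder values and the helper name override_paths are assumptions for illustration, not a copy of the real config.

```python
import json


def override_paths(config, data_path, models_path):
    """Return a copy of the config with the dataset and model paths replaced."""
    # Round-trip through JSON to get a deep copy and leave the input untouched.
    updated = json.loads(json.dumps(config))
    updated["dataset_reader"]["data_path"] = data_path
    updated["metadata"]["variables"]["MODELS_PATH"] = models_path
    return updated


# Stand-in for the loaded config; only the keys used here are shown,
# and the values are placeholders.
base_config = {
    "dataset_reader": {"data_path": "{DOWNLOADS_PATH}/squad/"},
    "metadata": {"variables": {"MODELS_PATH": "{ROOT_PATH}/models"}},
}

my_config = override_paths(
    base_config,
    data_path="/home/ec2-user/SageMaker/squad/data/",
    models_path="/home/ec2-user/SageMaker/squad/model/",
)
```

With the library installed, the real config would be loaded first (for example with read_json from deeppavlov.core.common.file) and the resulting dictionary passed to train_model; whether the dataset in data_path is then picked up depends on the files the reader finds there.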
The trained model can be used with the command: python3 -m deeppavlov interact <your_config_name>. A more detailed tutorial on how to launch models is here: https://github.com/deepmipt/DeepPavlov
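The trained model can also be called from Python rather than through the interactive command. This is a rough sketch assuming the build_model helper exported by the deeppavlov package and a SQuAD model that takes a batch of contexts and a batch of questions; the wrapper name answer_question is hypothetical, and the exact structure of the returned value should be checked against the documentation for your version.

```python
def answer_question(config, context, question):
    """Load the model described by `config` and answer one question about `context`."""
    # Imported here so that this module can be loaded without the library installed.
    from deeppavlov import build_model

    # `config` may be a path to the JSON config or an already loaded dictionary.
    # download=False: use the files already saved under MODELS_PATH.
    model = build_model(config, download=False)

    # ASSUMPTION: the SQuAD pipeline takes a batch of contexts and a batch of questions.
    return model([context], [question])
```

Usage would look like answer_question(my_config, "DeepPavlov is a library for NLP and dialog systems.", "What is DeepPavlov?"), where my_config is the same config (with your MODELS_PATH) that was used for training.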