FitBERT is a useful package, but I have a question about using BERT for masked word prediction. I trained a BERT model on a custom corpus using Google's scripts (create_pretraining_data.py, run_pretraining.py, extract_features.py, etc.). As a result I got a vocab file, a .tfrecord file, a .json file, and checkpoint files.
Now, how do I use those files with your package to predict a masked word in a given sentence?
From the TensorFlow documentation:
This document, along with the TensorFlow documentation, explains quite well how to use those file types.
If you would rather use FitBERT directly through the library, you can follow the examples in the project's GitHub repository.
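As a rough illustration, here is a minimal sketch of how the pretraining outputs might be wired into FitBERT. It assumes the vocab file, the config JSON, and the checkpoint files sit in one directory under Google's conventional names (vocab.txt, bert_config.json, model.ckpt-<step>.*), that the Hugging Face transformers package in use still supports loading TensorFlow checkpoints via from_tf=True (this needs TensorFlow installed and may vary by version), and that FitBert accepts model and tokenizer keyword arguments. The helper names find_bert_files and load_for_fitbert are hypothetical, not part of either library.

```python
from pathlib import Path


def find_bert_files(model_dir):
    """Locate the vocab file, config JSON, and newest checkpoint index.

    Assumes Google's conventional file names; adjust them if yours differ.
    """
    model_dir = Path(model_dir)
    prefix, suffix = "model.ckpt-", ".index"
    # Sort by training step numerically so step 1000 ranks above step 900.
    index_files = sorted(
        model_dir.glob(prefix + "*" + suffix),
        key=lambda p: int(p.name[len(prefix):-len(suffix)]),
    )
    if not index_files:
        raise FileNotFoundError(f"no {prefix}*{suffix} file in {model_dir}")
    return {
        "vocab": model_dir / "vocab.txt",
        "config": model_dir / "bert_config.json",
        "checkpoint_index": index_files[-1],
    }


def load_for_fitbert(model_dir, do_lower_case=True):
    """Build a FitBert instance from a custom TensorFlow BERT checkpoint."""
    # Imported lazily so find_bert_files works without these packages.
    from transformers import BertConfig, BertForMaskedLM, BertTokenizer
    from fitbert import FitBert

    files = find_bert_files(model_dir)
    config = BertConfig.from_json_file(str(files["config"]))
    tokenizer = BertTokenizer(
        vocab_file=str(files["vocab"]), do_lower_case=do_lower_case
    )
    # from_tf=True asks transformers to convert the TensorFlow weights.
    model = BertForMaskedLM.from_pretrained(
        str(files["checkpoint_index"]), from_tf=True, config=config
    )
    # Assumption: FitBert takes a ready-made model and tokenizer.
    return FitBert(model=model, tokenizer=tokenizer)


# Example usage (the directory path and sentence are placeholders):
# fb = load_for_fitbert("path/to/pretraining_output")
# ranked = fb.rank("The weather is ***mask*** today.", options=["nice", "table"])
```

The .tfrecord file is only the training input produced by create_pretraining_data.py, so it is not needed at prediction time. FitBERT marks the blank with ***mask*** by default and ranks the candidate words you pass in options.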