I want to add a BERT NER module on top of openai-whisper. To train it, I feed the BERT model token IDs (not text) taken from the Whisper decoder, and the target output is the entity labels for those tokens in a one-hot encoding. The issue is that the BERT and openai-whisper tokenizers are different, so when I feed the token IDs into BERT NER they mean something different from what they meant originally. Can this be done, or is it simply not possible because BERT is a text-based LM that expects its own tokenization?
train input: [50364, 286, 362, 257, 3440, 4153, 2446, 412, 1266, 335, 13, 2555, 4160, 385, 13, 50614]
train label: [0. 0. 0. 0. 0. 1. 1. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
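To make the mismatch concrete, here is a rough sketch of what I mean, decoding the same IDs with each tokenizer (this assumes the multilingual Whisper tokenizer and bert-base-uncased; substitute whichever checkpoints are actually used):

```python
# Sketch of the tokenizer mismatch (assumes the multilingual Whisper tokenizer
# and bert-base-uncased; these are placeholders, not the exact setup).
from whisper.tokenizer import get_tokenizer
from transformers import BertTokenizerFast

whisper_tok = get_tokenizer(multilingual=True)
bert_tok = BertTokenizerFast.from_pretrained("bert-base-uncased")

ids = [50364, 286, 362, 257, 3440, 4153, 2446, 412, 1266, 335, 13,
       2555, 4160, 385, 13, 50614]

# The Whisper tokenizer maps these IDs back to the original transcript.
print("whisper:", whisper_tok.decode(ids))

# The same IDs looked up in BERT's vocabulary give unrelated word pieces
# (IDs above BERT's vocab size are dropped here to avoid lookup errors),
# so the entity labels no longer line up with the words they were meant for.
bert_ids = [i for i in ids if i < bert_tok.vocab_size]
print("bert:   ", bert_tok.decode(bert_ids))
```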