I am trying to fine-tune a BERT model for an NER tagging task using the TensorFlow official NLP toolkit. I found there is already a BERT token classifier class (`BertTokenClassifier`) that I wanted to use. Looking at the code inside, I don't see any masking to prevent tag prediction and loss calculation on padding and [SEP] tokens. I think this prevention is possible; I just don't know how to do it. I want to prevent it both for faster training and because a blog post I read mentioned strange behaviour when these positions are not masked.
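For context, this is roughly the kind of masked loss I have in mind; a minimal, untested sketch assuming that tag ids for padding/[SEP] positions are set to a sentinel value (-100 here). The function and variable names (`masked_sparse_ce`, `labels`, `logits`, `ignore_id`) are my own, not part of the toolkit:

```python
import tensorflow as tf

def masked_sparse_ce(labels, logits, ignore_id=-100):
    """Sparse cross-entropy that skips pad/[SEP] positions labeled `ignore_id`.

    labels: int tensor [batch, seq_len]; logits: float tensor [batch, seq_len, num_tags].
    """
    mask = tf.cast(tf.not_equal(labels, ignore_id), logits.dtype)
    # Swap ignored labels for a valid class id (0); those positions get
    # zero weight from the mask, so the dummy value never affects the loss.
    safe_labels = tf.where(tf.equal(labels, ignore_id),
                           tf.zeros_like(labels), labels)
    per_token = tf.keras.losses.sparse_categorical_crossentropy(
        safe_labels, logits, from_logits=True)
    # Average only over real tokens, so padded positions contribute nothing.
    return tf.reduce_sum(per_token * mask) / tf.maximum(tf.reduce_sum(mask), 1.0)
```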
Does anybody have any idea how to wire something like this into the toolkit's token classifier?
Have you found a solution? I'm working on the same task, and I found that the padding token was dominating the predictions. Passing in an attention mask didn't change anything, so I manually truncated the sequences to 100 tokens, and that improved things.
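Roughly what I did, as a sketch (the helper and its names are my own, and I'm assuming pre-tokenized id lists with a trailing [SEP]):

```python
MAX_LEN = 100  # hard cap I settled on; tune for your data

def truncate_example(input_ids, tag_ids, sep_id, pad_tag_id=0):
    """Cut token/tag id lists to MAX_LEN, re-appending [SEP] at the end."""
    if len(input_ids) > MAX_LEN:
        input_ids = input_ids[:MAX_LEN - 1] + [sep_id]
        tag_ids = tag_ids[:MAX_LEN - 1] + [pad_tag_id]  # dummy tag for [SEP]
    return input_ids, tag_ids
```

That said, truncation only shrinks the padded region; a masked loss like the one sketched in the question above seems like the cleaner fix.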