How does BertModel know to skip the attention_mask argument when applied to a single sentence?


I am creating a class that can generate sentence embeddings for both a single sentence and a list of sentences, using a pretrained BertModel. In some sample code, I see the statement

outputs = self.model(tokens_tensor, segments_tensors)

which omits the attention_mask argument. Yet it produces the same result as when I do pass in the attention mask tensor:

outputs = self.model(tokens_tensor, attention_tensors, segments_tensors)

When running the code over an entire dataset, the attention_tensors argument is definitely needed.

I understand why the attention mask is not needed for a single sentence, but how does the Python code know the second argument is actually segments_tensors, given that the documentation expects attention_mask to be the second argument?

https://huggingface.co/transformers/model_doc/bert.html
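For reference, here is a minimal sketch of the call with explicit keyword arguments; the parameter names follow the linked transformers documentation, but whether the second positional slot maps to attention_mask or token_type_ids depends on the forward() signature of the library version actually in use, so check that signature before relying on positional order:

# Hedged sketch: passing tensors by keyword removes any ambiguity about
# which parameter each positional slot binds to.
outputs = self.model(
    tokens_tensor,
    attention_mask=attention_tensors,
    token_type_ids=segments_tensors,
)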

1 Answer

Answer by Jindřich:

If attention_mask is not set (and is thus None), it is explicitly set to ones everywhere.

See line 803 in modeling_bert.py.
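For illustration, a minimal sketch of that default, assuming the usual pattern in modeling_bert.py (the exact line number and tensor construction vary across transformers versions):

import torch

if attention_mask is None:
    # No mask supplied: attend to every token, i.e. a mask of all ones
    # with the same (batch_size, seq_length) shape as the input ids.
    attention_mask = torch.ones(input_shape, device=device)

This is why, for a single unpadded sentence, passing an explicit mask of all ones and passing no mask at all produce the same output; with padded batches the explicit mask matters, because the padding positions must be zeroed out.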