GPT-2 taking into account output logits in forward call?


I'm using the Huggingface GPT-2 model, specifically GPT2LMHeadModel. I have two versions of this model: one loaded normally, and one that is identical except that I modify some of the output embeddings (model.lm_head). I'm feeding in a batch of sentences in eval mode:

outputs = model(input_ids=test_input_ids, attention_mask=test_attention_mask)

Now, from my understanding, for each input id the model only attends to the tokens to its left (causal attention). Since I've given it multiple input ids and this is a single forward call (not generate), the output logits should never be fed back in as input, correct?
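That understanding is right: in a single forward pass, the causal mask guarantees that the hidden state at position i depends only on tokens 0..i, and logits are never recycled as inputs. A minimal numpy sketch of causal self-attention (a toy single head with no learned weights, just to illustrate the masking) shows that perturbing a later token leaves earlier positions' outputs untouched:

```python
import numpy as np

def causal_attention(x):
    """Toy single-head scaled dot-product attention with a causal mask.
    x: (seq_len, d) token embeddings; projection weights omitted for brevity."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                 # (seq, seq) raw scores
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores[mask] = -np.inf                        # position i cannot see j > i
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
out1 = causal_attention(x)

x2 = x.copy()
x2[-1] += 10.0                                    # perturb ONLY the last token
out2 = causal_attention(x2)

# Earlier positions never attend to the perturbed last token,
# so their outputs are bit-for-bit identical.
print(np.allclose(out1[:-1], out2[:-1]))          # True
```

So if the two models differ only in lm_head, a difference in hidden states has to come from somewhere other than logits leaking back in.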

However, when I input the same input_ids into both models, they give me different hidden states, even though they should be the same. What am I missing here?

1 Answer

Answered by Raj:

I found the issue. GPT-2 uses weight tying: the input embedding layer (model.transformer.wte) and model.lm_head share the same underlying weight tensor. Changing lm_head therefore also changes the input embeddings, which is what was changing the hidden states.
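The mechanism can be sketched in plain numpy: two names bound to one array behave exactly like tied weights, and copying the array before editing breaks the tie. (This is a conceptual sketch, not Hugging Face code; in transformers you would similarly replace the head's weight with a cloned tensor before modifying it.)

```python
import numpy as np

# Toy model: the input embedding table and the LM head share ONE array,
# mirroring GPT-2's weight tying between transformer.wte and lm_head.
vocab, d = 10, 4
rng = np.random.default_rng(0)
wte = rng.normal(size=(vocab, d))   # input embedding table
lm_head = wte                       # tied: same underlying array, not a copy

token_id = 3
before = wte[token_id].copy()
lm_head[token_id] += 1.0            # "edit the output embeddings"

# The input embedding changed too, because both names point to one array.
print(np.array_equal(wte[token_id], before))   # False

# Fix: break the tie by giving lm_head its own copy before editing.
lm_head = wte.copy()
lm_head[token_id] += 1.0
print(np.array_equal(wte[token_id], wte[token_id]))  # wte untouched this time
```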