This is my attempt at implementing self-attention using PyTorch. Have I done anything wrong, or could it be improved somehow?
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    def __init__(self, embedding_dim):
        super(SelfAttention, self).__init__()
        self.keys = nn.Linear(embedding_dim, embedding_dim)
        self.queries = nn.Linear(embedding_dim, embedding_dim)
        self.values = nn.Linear(embedding_dim, embedding_dim)

    def forward(self, x):
        keys = self.keys(x)
        queries = self.queries(x)
        values = self.values(x)
        scores_prime = torch.matmul(queries.T, keys)
        scores = nn.functional.softmax(scores_prime)
        context_vectors = torch.matmul(values, scores)
        return context_vectors
My test vector ran through without error, but I can't be sure I didn't make a mistake.
To test your implementation more thoroughly, I suggest giving the query and key projections a different output dimension from the one used for the values. With everything square (embedding_dim × embedding_dim), a misplaced transpose still produces shapes that happen to fit together, so the mistake runs silently instead of raising an error. I think you have swapped the roles of the queries and keys when computing the attention scores.
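For comparison, here is a minimal sketch of the textbook scaled dot-product self-attention, written the way I would structure it. The class name ScaledSelfAttention and the qk_dim argument are illustrative choices of mine, not anything from your code; qk_dim simply lets the query/key projections use a different size than the embedding, which is the test I suggested above. The scores are computed as queries times keys transposed over the sequence dimension, scaled by the square root of the key dimension, and the softmax is taken along the key positions before weighting the values:

import torch
import torch.nn as nn

class ScaledSelfAttention(nn.Module):
    # Illustrative reference implementation, not your original code.
    # qk_dim defaults to embedding_dim but can be set differently so that
    # shape mistakes surface as errors instead of passing silently.
    def __init__(self, embedding_dim, qk_dim=None):
        super().__init__()
        self.qk_dim = qk_dim if qk_dim is not None else embedding_dim
        self.queries = nn.Linear(embedding_dim, self.qk_dim)
        self.keys = nn.Linear(embedding_dim, self.qk_dim)
        self.values = nn.Linear(embedding_dim, embedding_dim)

    def forward(self, x):
        # x: (seq_len, embedding_dim) or (batch, seq_len, embedding_dim)
        q = self.queries(x)
        k = self.keys(x)
        v = self.values(x)
        # scores: (..., seq_len, seq_len) = Q K^T / sqrt(d_k)
        scores = torch.matmul(q, k.transpose(-2, -1)) / (self.qk_dim ** 0.5)
        # normalize over the key positions (last dimension)
        weights = torch.softmax(scores, dim=-1)
        # context vectors: (..., seq_len, embedding_dim) = weights V
        return torch.matmul(weights, v)

With something like attn = ScaledSelfAttention(embedding_dim=8, qk_dim=4) and x = torch.randn(5, 8), this returns a (5, 8) tensor. If you instead change your own keys and queries projections to nn.Linear(embedding_dim, 4) while leaving the rest of your forward pass as it is, the final matmul fails with a shape mismatch, which is exactly the kind of failure that makes the misplaced transpose visible. Note also two smaller differences from your version: the standard formulation scales the scores by sqrt(d_k), and it passes an explicit dim to the softmax rather than relying on the default.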