How do I use the multiple-heads option in the SelfAttention class?


I am playing around with the SelfAttention layer from the trax library.

When I set n_heads=1, everything works fine, but when I set n_heads=2, my code breaks.

I use only input activations and a single SelfAttention layer.

Here is a minimal reproduction:

import numpy as np
import trax

# A single self-attention layer with two heads.
attention = trax.layers.SelfAttention(n_heads=2)

# One sequence of length 100 with a feature dimension of 1.
activations = np.random.randint(0, 10, (1, 100, 1)).astype(np.float32)
inputs = (activations,)  # renamed from 'input', which shadows the builtin

attention.init(inputs)
output = attention(inputs)

But I get this error:

  File [...]/site-packages/jax/linear_util.py, line 166, in call_wrapped
    ans = self.f(*args, **dict(self.params, **kwargs))

  File [...]/layers/research/efficient_attention.py, line 1637, in forward_unbatched_h
    return forward_unbatched(*i_h, weights=w_h, state=s_h)

  File [...]/layers/research/efficient_attention.py, line 1175, in forward_unbatched
    q_info = kv_info = np.arange(q.shape[-2], dtype=np.int32)

IndexError: tuple index out of range
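
If I read the traceback correctly, q ends up with fewer than two dimensions inside forward_unbatched, so I suspect my feature dimension of 1 is being consumed by the per-head split. Below is the variant I would try next, with a wider feature axis; I am assuming here that the last axis of the activations plays the role of the model dimension and needs room for both heads:

import numpy as np
import trax

attention = trax.layers.SelfAttention(n_heads=2)

# Same setup, but with 8 features per position instead of 1.
activations = np.random.randint(0, 10, (1, 100, 8)).astype(np.float32)
inputs = (activations,)

attention.init(inputs)
output = attention(inputs)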

What am I doing wrong?
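
One more detail, in case my init call is the culprit: trax documents Layer.init as taking a ShapeDtype signature rather than concrete arrays, and trax.shapes.signature builds one from an array. A sketch of the same reproduction in that style:

import numpy as np
import trax
from trax import shapes

attention = trax.layers.SelfAttention(n_heads=2)

activations = np.random.randint(0, 10, (1, 100, 1)).astype(np.float32)

# Initialize from a ShapeDtype signature instead of the concrete array.
attention.init(shapes.signature(activations))
output = attention(activations)

I would expect this to hit the same IndexError, but I mention it in case the raw-array init is masking a setup problem.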
