masked softmax in theano

I am wondering if it is possible to apply a mask when computing theano.tensor.nnet.softmax.

This is the behavior I am looking for:

>>> a = np.array([[1, 2, 3, 4]])
>>> m = np.array([[1, 0, 1, 0]])  # ignore indices 1 and 3
>>> theano.tensor.nnet.softmax(a, m)
array([[ 0.11920292,  0.        ,  0.88079708,  0.        ]])

Note that a and m are matrices, so I would like the softmax to operate on an entire matrix and compute a row-wise masked softmax.

Also, the output should have the same shape as a, so the solution cannot use advanced indexing, e.g. theano.tensor.nnet.softmax(a[0, [0, 2]]).
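
For reference, here is a plain NumPy version of the behaviour I am after (masked_softmax_np is just an illustrative name, not an existing function):

import numpy as np

def masked_softmax_np(a, m):
    e = np.exp(a) * m                        # zero out the masked entries
    return e / e.sum(axis=1, keepdims=True)  # normalise each row over the unmasked entries

print(masked_softmax_np(np.array([[1., 2., 3., 4.]]),
                        np.array([[1., 0., 1., 0.]])))
# [[ 0.11920292  0.          0.88079708  0.        ]]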

There are 2 answers

H.W (BEST ANSWER)

import theano.tensor as T

def masked_softmax(a, m, axis):
    e_a = T.exp(a)                      # exponentiate every entry
    masked_e = e_a * m                  # zero out the masked positions
    sum_masked_e = T.sum(masked_e, axis, keepdims=True)  # normaliser over the unmasked entries
    return masked_e / sum_masked_e
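
For example, this can be compiled into a callable like so (the variable names here are only illustrative):

import numpy as np
import theano
import theano.tensor as T

a = T.matrix('a')
m = T.matrix('m')
f = theano.function([a, m], masked_softmax(a, m, axis=1))  # masked_softmax as defined above

xa = np.array([[1., 2., 3., 4.]]).astype(theano.config.floatX)
xm = np.array([[1., 0., 1., 0.]]).astype(theano.config.floatX)
print(f(xa, xm))
# [[ 0.11920292  0.          0.88079708  0.        ]]

Note that this exponentiates a directly; if a can contain large values, subtracting the row maximum before T.exp would avoid overflow.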

A.D

theano.tensor.switch is one way to do this.

In the computational graph you can do the following:

a_mask = theano.tensor.switch(m, a, np.NINF)  # masked positions become -inf
sm = theano.tensor.nnet.softmax(a_mask)       # exp(-inf) = 0, so they get zero probability
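
A minimal end-to-end sketch of this approach (the imports and variable names are assumptions, not part of the original answer):

import numpy as np
import theano
import theano.tensor as T

a = T.matrix('a')
m = T.matrix('m')

a_mask = T.switch(m, a, np.NINF)   # -inf where the mask is 0
sm = T.nnet.softmax(a_mask)        # the -inf entries contribute exp(-inf) = 0

f = theano.function([a, m], sm)
print(f(np.array([[1., 2., 3., 4.]]).astype(theano.config.floatX),
        np.array([[1., 0., 1., 0.]]).astype(theano.config.floatX)))
# [[ 0.11920292  0.          0.88079708  0.        ]]

As long as each row has at least one unmasked element, the masked entries come out as exactly zero probability; a fully masked row would produce NaNs.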

Hope it helps others.