Suppose I have two arrays:
import numpy as np
a=np.array([[1,2],
[3,4]])
b=np.array([[1,2],
[3,4]])
and I want to multiply the arrays element-wise and then sum the elements, i.e. 1*1 + 2*2 + 3*3 + 4*4 = 30. I can use:
np.tensordot(a, b, axes=((-2,-1),(-2,-1)))
>>> array(30)
Now, suppose arrays a and b are 2-by-2-by-2 arrays:
a=np.array([[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]]])
b=np.array([[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]]])
and I want to do the same operation for each group, i.e. multiply [[1,2],[3,4]] element-wise with [[1,2],[3,4]] and sum the elements, then do the same with [[5,6],[7,8]]. The result should be array([ 30, 174]), where 30 = 1*1 + 2*2 + 3*3 + 4*4 and 174 = 5*5 + 6*6 + 7*7 + 8*8. Is there a way to do that using tensordot?
P.S.
I understand in this case you can simply use sum or einsum:
np.sum(a*b,axis=(-2,-1))
>>> array([ 30, 174])
np.einsum('ijk,ijk->i',a,b)
>>> array([ 30, 174])
but this is merely a simplified example; I need to use tensordot because it's faster.
Thanks for any help!!
You can use np.diag(np.tensordot(a, b, axes=((1, 2), (1, 2)))) to get the result you want. However, using np.tensordot or a matrix multiplication is not a good idea in your case, because they do much more work than needed: tensordot computes the sum for every pair of groups, and only the diagonal of that result is useful here. Their efficient implementation does not make up for the extra computation.

np.einsum('ijk,ijk->i', a, b) does not compute more than needed in your case. You can try optimize=True or even optimize='optimal', since the optimize parameter is set to False by default.

If this is not fast enough, you can try NumExpr to compute np.sum(a*b, axis=(1, 2)) more efficiently (probably in parallel). Alternatively, you can use Numba or Cython; both support fast parallel loops.
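To make the extra work concrete, here is a minimal sketch using the 2-by-2-by-2 arrays from the question. The variable names (full, via_tensordot, via_einsum, via_sum) are only illustrative. It shows that tensordot builds a full n-by-n matrix of pairwise sums, of which only the diagonal is kept:

```python
import numpy as np

# Same values as the 2-by-2-by-2 arrays in the question.
a = np.arange(1, 9).reshape(2, 2, 2)
b = a.copy()

# Contracting axes 1 and 2 of both arrays gives an (n, n) matrix:
# full[i, j] = sum over k, l of a[i, k, l] * b[j, k, l]
full = np.tensordot(a, b, axes=((1, 2), (1, 2)))

# Only the diagonal (i == j) is the per-group result; the
# off-diagonal entries are computed and then thrown away.
via_tensordot = np.diag(full)

# These two compute just the n per-group sums.
via_einsum = np.einsum('ijk,ijk->i', a, b)
via_sum = np.sum(a * b, axis=(1, 2))

print(full)           # 2-by-2 matrix of all pairwise sums
print(via_tensordot)  # per-group sums only
print(via_einsum)
print(via_sum)
```

With n groups, the tensordot route computes n*n sums to keep n of them, so the wasted work grows with the number of groups. Which approach is actually fastest depends on the array shapes, so it is worth timing them on your real data.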