I want to define some function on a matrix X, for example mean(pow(X - X0, 2)), where X0 is another matrix (X0 is fixed/constant). To make it more specific, let's assume that both X and X0 are 10 x 10 matrices. The result of the operation is a real number.
Now I have a big matrix (let's say 500 x 500). I want to apply the operation defined above to all 10 x 10 sub-matrices of the "big" matrix. In other words, I want to slide the 10 x 10 window over the "big" matrix. For each location of the window, I should get a real number. So, as a final result, I need to get a real-valued matrix (or 2D tensor) of shape 491 x 491.
What I want to have is close to a convolutional layer but not exactly the same because I want to use a mean squared deviation instead of a linear function represented by a neuron.
This is only a NumPy solution; I hope it suffices. I assume that your function is made up of an operation on the matrix elements and of a mean, i.e. a scaled sum. Hence, it is sufficient to look at the windowed sum: the mean is just that sum divided by the (constant) window size. So we only need to deal with determining a windowed mean. Note that for the 1D case, the matrix product with an appropriate vector of ones can be used to calculate the mean, e.g.:
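A minimal sketch of this idea (variable names are my own):

```
import numpy as np

m = 10
x = np.random.rand(m)

ones = np.ones(m)
mean_1d = ones @ x / m  # identical to x.mean()
```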
The 2D case is analogous, but with two matrix multiplications, one for the rows and one for the columns. E.g.
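A minimal sketch, analogous to the 1D case:

```
import numpy as np

m = 10
Z = np.random.rand(m, m)

ones = np.ones(m)
mean_2d = ones @ Z @ ones / m**2  # identical to Z.mean()
```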
To construct a sliding window, we need to build a matrix of shifted windows, e.g.
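One possible construction of such a matrix, here called H as in the text below (the details are a sketch):

```
import numpy as np

n, m = 500, 10

# row i of H contains m ones starting at column i,
# so row i selects the i-th length-m window
H = np.zeros((n - m + 1, n))
for i in range(n - m + 1):
    H[i, i:i + m] = 1.0
```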
which can then be used to calculate all the window sums at once:
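A sketch, reusing H, n, and m from the previous snippet:

```
Z = np.random.rand(n, n)

sums = H @ Z @ H.T   # (n - m + 1, n - m + 1) matrix of all window sums
means = sums / m**2  # windowed means
```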
A full example for an (n, n) matrix with an (m, m) window would be:
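A possible sketch of such an example, combining the snippets above; the spot check at the end is just for illustration:

```
import numpy as np

n, m = 500, 10

H = np.zeros((n - m + 1, n))
for i in range(n - m + 1):
    H[i, i:i + m] = 1.0

Z = np.random.rand(n, n)
means = H @ Z @ H.T / m**2  # (491, 491) for n = 500, m = 10

# spot check one window position against a direct computation
assert np.isclose(means[3, 7], Z[3:3 + m, 7:7 + m].mean())
```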
An alternative, but probably slightly less efficient, way of building H is:
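For instance, one NumPy-only possibility (a sketch, not necessarily the variant meant above) is a vectorized construction from index comparisons:

```
import numpy as np

n, m = 500, 10

i = np.arange(n - m + 1)[:, None]
j = np.arange(n)[None, :]
H = ((j >= i) & (j < i + m)).astype(float)  # H[i, i:i+m] = 1
```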
Update:

Calculating the mean squared deviation is also possible with that approach. For that, we generalize the vector identity
|x - x0|^2 = (x - x0).T (x - x0) = x.T x - 2 x0.T x + x0.T x0

(a space denotes a scalar or matrix multiplication and .T a transposed vector) to the matrix case.
We assume W is an (m, n) matrix containing a block (m, m) identity matrix, which is able to extract the (k0, k1)-th (m, m) sub-matrix by Y = W Z W.T, where Z is the (n, n) matrix containing the data (strictly speaking, the identity block sits at column offset k0 in the left factor and at offset k1 in the right one).
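For illustration, a small sketch of this extraction (the helper name shift_matrix is just for demonstration):

```
import numpy as np

n, m, k0, k1 = 8, 3, 2, 4
Z = np.random.rand(n, n)

def shift_matrix(k):
    # (m, n) matrix with an (m, m) identity block at column offset k
    W = np.zeros((m, n))
    W[:, k:k + m] = np.eye(m)
    return W

# the left factor selects rows k0..k0+m, the right one columns k1..k1+m
Y = shift_matrix(k0) @ Z @ shift_matrix(k1).T
assert np.allclose(Y, Z[k0:k0 + m, k1:k1 + m])
```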
Calculating the difference D = Y - X0 is then straightforward; X0 and D are (m, m) matrices. The square root of the sum of the squared elements of a matrix is called the Frobenius norm. Based on those identities, we can write the squared sum as

|D|^2 = tr(D.T D) = tr(Y.T Y) - 2 tr(X0.T Y) + tr(X0.T X0) = Y0 - 2 Y1 + Y2
The term Y0 can be interpreted as H Z H.T from the method above, applied to the element-wise square of Z. The term Y1 can be interpreted as a weighted mean on Z, and Y2 is a constant which only needs to be determined once. Thus, a possible implementation would be:
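For example (a sketch; Y1 is computed with an explicit loop over the window offsets for clarity rather than speed):

```
import numpy as np

n, m = 500, 10

Z = np.random.rand(n, n)    # the data
X0 = np.random.rand(m, m)   # the fixed reference window

# H as above: H[i, i:i+m] = 1
i = np.arange(n - m + 1)[:, None]
j = np.arange(n)[None, :]
H = ((j >= i) & (j < i + m)).astype(float)

# Y0: windowed sums of Z**2 (the H Z H.T method applied to Z**2)
Y0 = H @ (Z * Z) @ H.T

# Y1: windowed sums of Z weighted by X0 (a cross-correlation of Z with X0)
Y1 = np.zeros((n - m + 1, n - m + 1))
for a in range(m):
    for b in range(m):
        Y1 += X0[a, b] * Z[a:a + n - m + 1, b:b + n - m + 1]

# Y2: constant term, determined once
Y2 = np.sum(X0 * X0)

mse = (Y0 - 2 * Y1 + Y2) / m**2  # (491, 491) result

# spot check one window position against a direct computation
k0, k1 = 3, 7
assert np.isclose(mse[k0, k1], np.mean((Z[k0:k0 + m, k1:k1 + m] - X0) ** 2))
```

If SciPy is available, Y1 could also be obtained with scipy.signal.correlate2d(Z, X0, mode='valid') instead of the double loop.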