I have a large image as a 2D array (let's assume that it is a 500 by 1000 pixel grayscale image) and one small image (let's say it is 15 by 15 pixels). I would like to slide the small image over the large one and, for a given position of the small image, calculate a measure of similarity between the small image and the underlying part of the big image.
I would like to be flexible in choosing a measure of similarity. For example, I might want to calculate the mean squared deviation or the mean absolute deviation, or something else (just some operation that takes two matrices of the same size and returns a real number).
The result should be a 2D array. I want to do this operation efficiently (which means that I want to avoid loops).
As a measure of similarity I plan to use the squared deviations between the colors of the two images. However, as I have mentioned, it would be nice if I could change the measure (for example, to use correlation between the colors).
Is there a function in numpy that can do it?
Import modules
To start off, let's import all the relevant modules/functions that would be used across the various approaches listed in this post -
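The original import block isn't reproduced here; a plausible set covering the approaches below would be (the `cv2` import is optional and only needed for the OpenCV section) -

```python
import numpy as np
from scipy.signal import convolve2d
from scipy.ndimage import uniform_filter
from skimage.util import view_as_windows
from skimage.feature import match_template

try:
    import cv2  # optional, only needed for the OpenCV based approaches
except ImportError:
    cv2 = None
```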
A. Views based approach for MAD, MSD

Skimage based approaches for computing `mean absolute deviation` :

- Using scikit-image for getting a sliding 4D array of views and then `np.mean` for the average calculations.
- Using scikit-image for getting a sliding 4D array of views and then `np.einsum` for the squared-average calculations.

Skimage based approaches for computing `mean squared deviation` : Using the same techniques, we would have two approaches for `mean squared deviation` -
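A sketch of one variant of each, assuming `img` is the large image and `tmpl` the template (the function names here are stand-ins, not from the original post) -

```python
import numpy as np
from skimage.util import view_as_windows

def mad_views(img, tmpl):
    # 4D sliding-window view: shape (H-h+1, W-w+1, h, w)
    img_4D = view_as_windows(img, tmpl.shape)
    # mean absolute deviation, reduced over the last two (window) axes
    return np.abs(img_4D - tmpl).mean(axis=(-2, -1))

def msd_views(img, tmpl):
    img_4D = view_as_windows(img, tmpl.shape)
    img_sub = img_4D - tmpl
    # einsum sums the squares without materializing a squared 4D array
    return np.einsum('ijkl,ijkl->ij', img_sub, img_sub) / tmpl.size

img = np.random.default_rng(0).random((20, 25))
tmpl = img[3:8, 4:9].copy()   # plant an exact copy of the template at (3, 4)
mad = mad_views(img, tmpl)
msd = msd_views(img, tmpl)
```

Each output is a 2D array holding one similarity value per placement of the template, and the planted copy shows up as a zero at its location.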
B. Convolution based approach for MSD

Convolution based approaches for computing `mean squared deviations` : Convolution could be used to compute mean squared deviations in a slightly tweaked way. For the case of sum of squared deviations, within each window we are performing elementwise subtractions, then squaring each subtraction and then summing all of those up.

Let's consider a 1D example for a closer look. With image `a` and a length-3 template `b`, for the first window operation we would have `a1, a2, a3` against `b1, b2, b3`.

Let's use the `(a-b)**2` expansion, i.e. `a**2 - 2*a*b + b**2`. Thus, we would have for the first window :

    (a1-b1)**2 + (a2-b2)**2 + (a3-b3)**2
    = (a1**2 + a2**2 + a3**2) + (b1**2 + b2**2 + b3**2) - (2*a1*b1 + 2*a2*b2 + 2*a3*b3)

Similarly, for the second window :

    (a2-b1)**2 + (a3-b2)**2 + (a4-b3)**2
    = (a2**2 + a3**2 + a4**2) + (b1**2 + b2**2 + b3**2) - (2*a2*b1 + 2*a3*b2 + 2*a4*b3)

and so on.

So, we have three parts to these computations :

1. Squaring of a's and summing of those in sliding windows.
2. Squaring of b's and summing of those. This stays the same among all windows.
3. The final piece of the puzzle : `(2*a1*b1 + 2*a2*b2 + 2*a3*b3)`, `(2*a2*b1 + 2*a3*b2 + 2*a4*b3)` and so on for the first, second and subsequent windows. This could be computed by a convolution of `a` with a flipped version of `b`, because by definition convolution runs the kernel from the other direction in a sliding fashion, computing elementwise multiplications and summing elements within each window; hence the flip is needed here.

Extending these ideas to the 2D case and using Scipy's `convolve2d` and `uniform_filter`, we would have two more approaches to compute `mean squared deviations`, like so -
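A sketch of the convolution route, combining the three parts into a single function (`uniform_filter` for the sliding sums of squares, `convolve2d` for the cross term; the function name is a stand-in) -

```python
import numpy as np
from scipy.signal import convolve2d
from scipy.ndimage import uniform_filter

def msd_conv(img, tmpl):
    h, w = tmpl.shape
    H, W = img.shape
    # Part 1: sliding-window sums of a**2. uniform_filter returns windowed
    # means centered on each pixel; slice out the 'valid' placements and
    # scale by the window size to turn means into sums.
    sum_a2 = uniform_filter(img * img, size=(h, w))[
        h // 2:h // 2 + H - h + 1, w // 2:w // 2 + W - w + 1] * (h * w)
    # Part 2: sum of b**2, identical for every window
    sum_b2 = (tmpl * tmpl).sum()
    # Part 3: cross terms, via convolution with the flipped template
    cross = convolve2d(img, tmpl[::-1, ::-1], mode='valid')
    return (sum_a2 + sum_b2 - 2 * cross) / (h * w)

img = np.random.default_rng(1).random((30, 40))
tmpl = np.random.default_rng(2).random((7, 5))
msd = msd_conv(img, tmpl)
```

The result matches the direct windowed computation, while avoiding the large 4D intermediate of the views approach.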
C. Skimage based approach for NCC
Skimage based approach for computing `normalized cross-correlation` : Please note that, since these are cross-correlation values, closeness between image and template is characterized by a high output value.
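A minimal sketch with `skimage.feature.match_template`, planting an exact copy of the template so the expected peak location is known -

```python
import numpy as np
from skimage.feature import match_template

rng = np.random.default_rng(0)
img = rng.random((120, 180))
tmpl = img[40:55, 90:105].copy()   # exact copy planted at (40, 90)

# 'valid'-style 2D output of normalized cross-correlation values in [-1, 1]
ncc = match_template(img, tmpl)

# high values mean close matches, so the best match is at the argmax
peak = np.unravel_index(ncc.argmax(), ncc.shape)
```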
D. OpenCV based approach for various similarity measures

OpenCV offers various `template-matching` methods for classifying templates -
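A minimal sketch with `cv2.matchTemplate`. The six supported methods are `TM_SQDIFF`, `TM_SQDIFF_NORMED`, `TM_CCORR`, `TM_CCORR_NORMED`, `TM_CCOEFF` and `TM_CCOEFF_NORMED`; note that for the `TM_SQDIFF` family the best match is the minimum, not the maximum -

```python
import numpy as np

try:
    import cv2
except ImportError:   # OpenCV not installed
    cv2 = None

if cv2 is not None:
    rng = np.random.default_rng(0)
    img = rng.random((120, 180)).astype(np.float32)
    tmpl = img[40:55, 90:105].copy()   # exact copy planted at row 40, col 90

    res = cv2.matchTemplate(img, tmpl, cv2.TM_SQDIFF)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    best_col, best_row = min_loc   # minMaxLoc reports (x, y), i.e. (col, row)
```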
E. Views based approach : Custom function
We could use 4D views as shown earlier in this post and perform the custom similarity measures as NumPy ufuncs along the last two axes. Thus, we get the sliding windows as a 4D array as used previously, like so -
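For instance, with the 500 by 1000 image and 15 by 15 template from the question -

```python
import numpy as np
from skimage.util import view_as_windows

img = np.random.rand(500, 1000)   # large grayscale image
tmpl = np.random.rand(15, 15)     # small image / template

# 4D view of all sliding windows: axes are
# (window_row, window_col, within_window_row, within_window_col)
img_4D = view_as_windows(img, tmpl.shape)
```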
Please note that, being a view into the input image, it won't cost any more memory. But later operations would make copies, depending on those operations themselves. A comparison operation would result in a much smaller memory footprint (8 times smaller to be exact, as a boolean array takes 1 byte per element against 8 bytes for 64-bit floats).
Then, we perform the intended operation between `img_4D` and `tmpl`, which for a linearly mapped (elementwise) operation would result in another 4D array following `broadcasting`. Let's call it `img_sub`. Next up, most probably, we would have some reduction operation to give us a 2D output. Again, in most cases, one of the NumPy `ufuncs` could be used here. We need to use this ufunc along the last two axes on `img_sub`. Many ufuncs allow us to work on more than one axis at a time; for example, earlier we used the `mean` computation along the last two axes in one go. Otherwise, we would need to operate along those two axes one after another.

Example
As an example of how to use this, let's consider a custom function :
Here, we have `img_W` as the sliding window from `img` and `tmpl`, as usual, is the template that is slid across the height and width of `img`. Implemented with two nested loops, we would have :
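The original custom function isn't shown here, so as a stand-in assume a measure that adds the smallest and largest absolute deviations within the window; the loopy reference implementation then looks like -

```python
import numpy as np

def custom_measure(img_W, tmpl):
    # stand-in similarity measure: min + max absolute deviation
    diffs = np.abs(img_W - tmpl)
    return diffs.min() + diffs.max()

def custom_sim_loopy(img, tmpl):
    h, w = tmpl.shape
    out = np.empty((img.shape[0] - h + 1, img.shape[1] - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = custom_measure(img[i:i + h, j:j + w], tmpl)
    return out

img = np.random.default_rng(3).random((20, 24))
tmpl = np.random.default_rng(4).random((5, 6))
out_loopy = custom_sim_loopy(img, tmpl)
```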
Now, using `view_as_windows`, we would have a vectorized solution :
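Sticking with the same stand-in measure (min + max absolute deviation), both reductions are ufunc-based and go along the last two axes of the 4D view in one shot -

```python
import numpy as np
from skimage.util import view_as_windows

def custom_sim_vectorized(img, tmpl):
    img_4D = view_as_windows(img, tmpl.shape)
    diffs = np.abs(img_4D - tmpl)
    # reduce over the two window axes at once
    return diffs.min(axis=(-2, -1)) + diffs.max(axis=(-2, -1))

img = np.random.default_rng(5).random((18, 22))
tmpl = np.random.default_rng(6).random((4, 5))
out_vec = custom_sim_vectorized(img, tmpl)
```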
Runtime test -
Benchmarking : Mean squared deviation
Decent sized datasets :
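The original timing numbers aren't reproduced here; a sketch of the kind of harness one could use, with scaled-down sizes and the views based MSD as a representative candidate (names are stand-ins) -

```python
import timeit
import numpy as np
from skimage.util import view_as_windows

img = np.random.rand(200, 300)    # scaled-down "decent sized" data
tmpl = np.random.rand(15, 15)

def msd_views(img, tmpl):
    img_4D = view_as_windows(img, tmpl.shape)
    img_sub = img_4D - tmpl
    return np.einsum('ijkl,ijkl->ij', img_sub, img_sub) / tmpl.size

# average over a few runs; swap in any of the other approaches to compare
elapsed = timeit.timeit(lambda: msd_views(img, tmpl), number=5) / 5
print(f"views based MSD: {elapsed * 1e3:.1f} ms per call")
```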
For bigger datasizes, depending on the system RAM available, we might have to fall back to methods other than the `views` based method, which leaves a noticeable memory footprint. Let's test out the bigger datasizes with the remaining approaches -
Summary
When working with known similarity measures, i.e. one of the six methods supported by the OpenCV based template matching method, and if we have access to OpenCV, that would be the best one.
Without OpenCV, for a special case like mean squared deviation, we could make use of convolution to have decently efficient approaches.
For generic/custom functions and with decent sized datasizes, we could look into 4D views to have efficient vectorized solutions. For large datasizes, we might want to use one loop and 3D views instead of 4D ones, with the intention of reducing the memory footprint. For really large ones, we might have to fall back to two nested loops.
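The one-loop, 3D-views fallback could be sketched like so (mean squared deviation shown; only one row of windows is materialized at a time) -

```python
import numpy as np
from skimage.util import view_as_windows

img = np.random.rand(100, 150)
tmpl = np.random.rand(15, 15)

h, w = tmpl.shape
out = np.empty((img.shape[0] - h + 1, img.shape[1] - w + 1))

# loop over rows only; each iteration builds a 3D view of the windows
# starting in that row, keeping the per-iteration footprint to one row
for i in range(out.shape[0]):
    row_3D = view_as_windows(img[i:i + h], (h, w))[0]   # shape (W-w+1, h, w)
    img_sub = row_3D - tmpl
    # MSD for every window in this row, in one shot
    out[i] = np.einsum('jkl,jkl->j', img_sub, img_sub) / tmpl.size
```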