I'd like to use nolearn's occlusion_heatmap
function. However, since it takes 2-3 seconds to create a single heat map this approach is not practical for my data set with several thousand images.
I assume that one could speedup the function by avoiding the nested for
-loops using numpy. Any hints would be much appreciated.
def occlusion_heatmap(net, x, target, square_length=7):
"""An occlusion test that checks an image for its critical parts.
In this function, a square part of the image is occluded (i.e. set
to 0) and then the net is tested for its propensity to predict the
correct label. One should expect that this propensity shrinks of
critical parts of the image are occluded. If not, this indicates
overfitting.
Depending on the depth of the net and the size of the image, this
function may take awhile to finish, since one prediction for each
pixel of the image is made.
Currently, all color channels are occluded at the same time. Also,
this does not really work if images are randomly distorted by the
batch iterator.
See paper: Zeiler, Fergus 2013
Parameters
----------
net : NeuralNet instance
The neural net to test.
x : np.array
The input data, should be of shape (1, c, x, y). Only makes
sense with image data.
target : int
The true value of the image. If the net makes several
predictions, say 10 classes, this indicates which one to look
at.
square_length : int (default=7)
The length of the side of the square that occludes the image.
Must be an odd number.
Results
-------
heat_array : np.array (with same size as image)
An 2D np.array that at each point (i, j) contains the predicted
probability of the correct class if the image is occluded by a
square with center (i, j).
"""
if (x.ndim != 4) or x.shape[0] != 1:
raise ValueError("This function requires the input data to be of "
"shape (1, c, x, y), instead got {}".format(x.shape))
if square_length % 2 == 0:
raise ValueError("Square length has to be an odd number, instead "
"got {}.".format(square_length))
num_classes = get_output_shape(net.layers_[-1])[1]
img = x[0].copy()
bs, col, s0, s1 = x.shape
heat_array = np.zeros((s0, s1))
pad = square_length // 2 + 1
x_occluded = np.zeros((s1, col, s0, s1), dtype=img.dtype)
probs = np.zeros((s0, s1, num_classes))
# generate occluded images
for i in range(s0):
# batch s1 occluded images for faster prediction
for j in range(s1):
x_pad = np.pad(img, ((0, 0), (pad, pad), (pad, pad)), 'constant')
x_pad[:, i:i + square_length, j:j + square_length] = 0.
x_occluded[j] = x_pad[:, pad:-pad, pad:-pad]
y_proba = net.predict_proba(x_occluded)
probs[i] = y_proba.reshape(s1, num_classes)
# from predicted probabilities, pick only those of target class
for i in range(s0):
for j in range(s1):
heat_array[i, j] = probs[i, j, target]
return heat_array