I have a 2D NumPy array of non-negative ints, where the values represent region labels. For example,
array([[9, 9, 9, 0, 0, 0, 0, 1, 1, 1],
[9, 9, 9, 9, 0, 7, 1, 1, 1, 1],
[9, 9, 9, 9, 0, 2, 2, 1, 1, 1],
[9, 9, 9, 8, 0, 2, 2, 1, 1, 1],
[9, 9, 9, 8, 0, 2, 2, 2, 1, 1],
[4, 4, 4, 4, 0, 2, 2, 2, 1, 1],
[4, 6, 6, 4, 0, 0, 0, 0, 0, 0],
[4, 6, 6, 4, 0, 0, 0, 0, 0, 0],
[4, 4, 4, 4, 5, 5, 5, 5, 5, 5],
[4, 4, 4, 4, 5, 5, 5, 5, 5, 5]])
I would like the elements equal to 0 (i.e. the zero-regions) to take on the most common value in their neighborhood. The operation would essentially close the zero-regions. I've tried multiple variations of dilation, erosion, grey-closing, and other morphology operations, but I cannot completely eliminate the zero-regions without awkwardly blending the other regions. A decent approach could be to define a kernel that is applied only at the zeros and sets each one to the most common label in the filter area. I'm unsure how to implement this, though.
One vectorized approach is proposed here. The steps are:
1. Get kernel-sized 2D sliding windows, leading to a 4D array. We can use skimage's view_as_windows to get those as a view and thus avoid allocating any extra memory for this.
2. Select the windows that are centered at zeros by indexing into the 4D array. This forces a copy, but assuming the number of zeros is relatively small compared to the total number of elements in the input array, this should be okay.
3. For each of the selected windows, add a per-window offset so that labels from different windows fall into separate bins, with the idea of using np.bincount to do the counting for all windows in one call. Thus, use bincount and get the max count excluding the zeros. The argmax for the max count is the label we want.
Here's the implementation covering those steps -
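The original code block did not survive here, so what follows is only a minimal sketch of the three steps, not the answer's exact code. It assumes an odd kernel size and a signed-integer label array. The function name fill_zero_regions is made up for illustration, and NumPy's sliding_window_view (NumPy 1.20+) is swapped in for skimage's view_as_windows so the snippet needs NumPy only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # stand-in for skimage's view_as_windows


def fill_zero_regions(a, k=3):
    """Set each interior zero to the most common non-zero label in its k x k window.

    Hypothetical helper; assumes k is odd and `a` holds non-negative ints.
    """
    hk = k // 2
    h, w = a.shape
    # Step 1: every k x k window as a 4D view, shape (h-k+1, w-k+1, k, k).
    windows = sliding_window_view(a, (k, k))
    # Step 2: keep only the windows centered on a zero (boolean indexing copies).
    mask = a[hk:h - hk, hk:w - hk] == 0
    sel = windows[mask]
    n = sel.shape[0]
    out = a.copy()
    if n == 0:
        return out
    sel = sel.reshape(n, k * k).astype(np.intp)
    # Step 3: shift window i by i * nlabels so one bincount call counts all windows.
    nlabels = int(a.max()) + 1
    ids = sel + nlabels * np.arange(n)[:, None]
    counts = np.bincount(ids.ravel(), minlength=n * nlabels).reshape(n, nlabels)
    counts[:, 0] = 0  # zeros must not win the vote
    # argmax picks the most frequent label (lowest label on ties, 0 if the window is all zeros).
    out[hk:h - hk, hk:w - hk][mask] = counts.argmax(axis=1)
    return out


a = np.array([[9, 9, 9, 0, 0, 0, 0, 1, 1, 1],
              [9, 9, 9, 9, 0, 7, 1, 1, 1, 1],
              [9, 9, 9, 9, 0, 2, 2, 1, 1, 1],
              [9, 9, 9, 8, 0, 2, 2, 1, 1, 1],
              [9, 9, 9, 8, 0, 2, 2, 2, 1, 1],
              [4, 4, 4, 4, 0, 2, 2, 2, 1, 1],
              [4, 6, 6, 4, 0, 0, 0, 0, 0, 0],
              [4, 6, 6, 4, 0, 0, 0, 0, 0, 0],
              [4, 4, 4, 4, 5, 5, 5, 5, 5, 5],
              [4, 4, 4, 4, 5, 5, 5, 5, 5, 5]])
filled = fill_zero_regions(a)
print(filled)
```

All counts are read from the original array, so the zeros are updated simultaneously rather than one after another. Zeros on the outer border are not touched by this sketch, and a zero whose whole window is zero stays zero after a single pass.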
Sample run -
As seen, we are not solving for the boundary cases. If that is needed, use a zero-padded array as the input array, something like this: np.pad(a, (k//2,k//2), 'constant'), with k as the kernel size (=3 for the sample).
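To make the padding idea concrete, here is a hedged sketch of a boundary-aware variant. It assumes an odd k, uses the made-up name fill_zero_regions_padded, and again relies on NumPy's sliding_window_view in place of skimage's view_as_windows. Because the padding value is 0 and zeros are excluded from the vote, the padded border never influences the result.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # stand-in for skimage's view_as_windows


def fill_zero_regions_padded(a, k=3):
    """Like the unpadded idea, but also fills zeros on the border (hypothetical helper, odd k)."""
    # Zero-pad by k//2 on every side so each element of `a` gets a full window.
    padded = np.pad(a, k // 2, mode="constant")
    windows = sliding_window_view(padded, (k, k))  # shape (H, W, k, k) for odd k
    mask = a == 0
    sel = windows[mask]  # copy of the windows centered on zeros
    n = sel.shape[0]
    out = a.copy()
    if n == 0:
        return out
    sel = sel.reshape(n, k * k).astype(np.intp)
    nlabels = int(a.max()) + 1
    # Per-window offset so a single bincount counts every window separately.
    ids = sel + nlabels * np.arange(n)[:, None]
    counts = np.bincount(ids.ravel(), minlength=n * nlabels).reshape(n, nlabels)
    counts[:, 0] = 0  # ignore zeros, including the padding
    out[mask] = counts.argmax(axis=1)
    return out


# The corner zero sees labels 1, 1 and 2 in its padded window, so it becomes 1.
print(fill_zero_regions_padded(np.array([[0, 1], [1, 2]])))
```

Zeros that sit deeper than k//2 inside a large zero-region still survive one pass; calling the function repeatedly until no zero changes is one possible way to close them.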