i am currently doing a case study on the improved performance of a separable filter vs that of a square filter. I understand the mathematics behind the time complexity difference, however i have run into a problem with the real world implementation.
so basically what i have done is write a loop which implements my filter image function given by:
function imOut = FilterImage(imIn, kernel, boundFill, outputSize)
VkernelOffset = floor(size(kernel,1)/2);
HkernelOffset = floor(size(kernel,2)/2);
imIn = padarray(imIn, [VkernelOffset HkernelOffset], boundFill);
imInPadded = padarray(imIn, [VkernelOffset HkernelOffset], boundFill);
imOut = zeros(size(imIn));
kernelVector = reshape(kernel,1, []);
kernelVector3D = repmat(kernelVector, 1, 1, size(imIn,3));
for row = 1:size(imIn,1)
Vwindow = row + size(kernel,1)-1;
for column = 1:size(imIn,2)
Hwindow = column + size(kernel,2)-1;
imInWindowVector = reshape( ...
imInPadded(row:Vwindow, column:Hwindow, :),1,[],size(imIn,3));
imOut(row,column, :) = sum((imInWindowVector.*kernelVector3D),2);
end
end
ouputSize = lower(outputSize);
if strcmp(outputSize, 'same')
imOut = imOut((1+VkernelOffset):(size(imOut,1)-VkernelOffset), ...
(1+HkernelOffset):(size(imOut,2)-HkernelOffset), : );
elseif strcmp(outputSize, 'valid')
imOut = imOut((1+VkernelOffset*2):(size(imOut,1)-VkernelOffset*2), ...
(1+HkernelOffset*2):(size(imOut,2)-HkernelOffset*2), : );
end
end
I wrote another script which carries out the following two sets of commands on a 740x976 greyscale image and logs their processing time:
for n = 1:25
dim(n) = 6*n + 1;
h=fspecial('gaussian',dim(n), 4);
tic;
Im = FilterImage(I,h,0,'full');
tM(n) = toc;
h1 = fspecial('gaussian', [dim(n) 1], 4);
h2 = fspecial('gaussian', [1 dim(n)], 4);
tic;
It = FilterImage(I,h1,0,'full');
Is = FilterImage(It,h2,0,'full');
tS(n) = toc;
end
after plotting the respective time required i get the following result:
My problem is, Why is the separable method slower up to kernel matrices of size 49x49, and only shows improved speed from kernel sizes of 55x55 upwards, is something wrong with my image filter code?
p.s. the image filter code was designed for 3D images to take into account colour depth, however for the speed test i am using a greyscale image converted to double using im2double.
p.s.2 so as mentioned below, for comparison i carried out the same process using MATLAB's native conv2 function, and the results where as you'd expect, and also incredibly faster...
thanks
It seems like an optimization error.
I'd use the function
conv2
instead.Let's write a sample code:
Try those in a loop where the length of
vFilterCoeff
(Row Vector!!!) is getting bigger.update us what are the result now.