I'm trying to execute a simple depthwise convolution kernel with Metal Performance Shaders on macOS and am running into a problem. First I initialize an MPSImage (called debugInputImage) of the proper size, filled with some constant, say 1.0. Then I create my convolution kernel:
convolution_depthwise_0 = MPSCNNConvolution(device: device,
                                            weights: datasource_depthwise_0)
where datasource_depthwise_0 is an object conforming to the MPSCNNConvolutionDataSource protocol, with the following descriptor:
func descriptor() -> MPSCNNConvolutionDescriptor {
    let desc = MPSCNNDepthWiseConvolutionDescriptor(kernelWidth: 3,
                                                    kernelHeight: 3,
                                                    inputFeatureChannels: 32,
                                                    outputFeatureChannels: 32)
    return desc
}
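For context, a minimal data source along these lines reproduces the setup (the class name and the constant dummy weights with no bias terms are placeholders for illustration, not necessarily relevant to the crash):

import MetalPerformanceShaders

class DepthwiseDataSource: NSObject, MPSCNNConvolutionDataSource {

    // 3x3 depthwise kernel over 32 channels (channel multiplier 1):
    // 3 * 3 * 32 = 288 weights. With identical dummy values the exact
    // depthwise weight layout does not matter.
    private let weightCount = 3 * 3 * 32
    private lazy var weightBuffer: UnsafeMutablePointer<Float> = {
        let p = UnsafeMutablePointer<Float>.allocate(capacity: weightCount)
        p.initialize(repeating: 1.0, count: weightCount)
        return p
    }()

    func dataType() -> MPSDataType {
        return .float32
    }

    func descriptor() -> MPSCNNConvolutionDescriptor {
        return MPSCNNDepthWiseConvolutionDescriptor(kernelWidth: 3,
                                                    kernelHeight: 3,
                                                    inputFeatureChannels: 32,
                                                    outputFeatureChannels: 32)
    }

    func weights() -> UnsafeMutableRawPointer {
        return UnsafeMutableRawPointer(weightBuffer)
    }

    func biasTerms() -> UnsafeMutablePointer<Float>? {
        // No bias in this dummy source.
        return nil
    }

    func load() -> Bool {
        // Dummy weights are already in memory.
        return true
    }

    func purge() {
        // Nothing to release for the dummy weights.
    }

    func label() -> String? {
        return "depthwise_0"
    }

    func copy(with zone: NSZone? = nil) -> Any {
        // Returning self is enough for a stateless dummy source.
        return self
    }

    deinit {
        weightBuffer.deallocate()
    }
}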
This is how I initialise the input image:
let imageDescriptor = MPSImageDescriptor(channelFormat: .float16,
                                         width: 256,
                                         height: 256,
                                         featureChannels: 32)
debugInputImage = MPSImage(device: device, imageDescriptor: imageDescriptor)

var arrayOfOnes = Array(repeating: Float(1.0),
                        count: imageDescriptor.width * imageDescriptor.height
                                 * imageDescriptor.featureChannels)
let arrayOfOnes16 = toFloat16(&arrayOfOnes, size: arrayOfOnes.count)

debugInputImage.writeBytes(arrayOfOnes16,
                           dataLayout: MPSDataLayout.HeightxWidthxFeatureChannels,
                           imageIndex: 0)
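The toFloat16 helper just converts Float32 values to half-precision bit patterns for the .float16 image; a sketch of such a helper using Accelerate's vImage is below (an assumed implementation, the exact one isn't important for the problem):

import Accelerate

// Converts an array of Float32 values to IEEE half-precision stored in UInt16,
// which is what writeBytes expects for a .float16 MPSImage.
func toFloat16(_ input: inout [Float], size: Int) -> [UInt16] {
    var output = [UInt16](repeating: 0, count: size)
    input.withUnsafeMutableBufferPointer { inPtr in
        output.withUnsafeMutableBufferPointer { outPtr in
            var src = vImage_Buffer(data: UnsafeMutableRawPointer(inPtr.baseAddress!),
                                    height: 1,
                                    width: vImagePixelCount(size),
                                    rowBytes: size * MemoryLayout<Float>.stride)
            var dst = vImage_Buffer(data: UnsafeMutableRawPointer(outPtr.baseAddress!),
                                    height: 1,
                                    width: vImagePixelCount(size),
                                    rowBytes: size * MemoryLayout<UInt16>.stride)
            _ = vImageConvert_PlanarFtoPlanar16F(&src, &dst, vImage_Flags(kvImageNoFlags))
        }
    }
    return output
}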
When I run all of this:
let commandBuffer = commandQueue.makeCommandBuffer()!
let outImage = convolution_depthwise_0.encode(commandBuffer: commandBuffer,
                                              sourceImage: debugInputImage)
I get this error at the let outImage = convolution_depthwise_0.encode(...) line:
validateComputeFunctionArguments:860: failed assertion `Compute
Function(depthwiseConvolution): missing threadgroupMemory binding
at index 0 for lM[0].'
Everything works fine with a regular convolution; I only get this problem with the depthwise one.
What could be the reason for this error?
System: macOS 10.14, Xcode 10.1 beta 3
Only MPSCNNDepthWiseConvolutionDescriptor doesn't work; I have no problems with MPSCNNConvolutionDescriptor. I also have no problems on iOS, only on macOS.
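To be explicit, the only difference between the working and the failing setup is which descriptor class the data source returns, roughly:

// Works on macOS: plain convolution descriptor.
let regularDesc = MPSCNNConvolutionDescriptor(kernelWidth: 3,
                                              kernelHeight: 3,
                                              inputFeatureChannels: 32,
                                              outputFeatureChannels: 32)

// Triggers the "missing threadgroupMemory binding" assertion on macOS 10.14.
let depthwiseDesc = MPSCNNDepthWiseConvolutionDescriptor(kernelWidth: 3,
                                                         kernelHeight: 3,
                                                         inputFeatureChannels: 32,
                                                         outputFeatureChannels: 32)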