I'm getting the depth data from the TrueDepth camera and converting it to a grayscale image. (I realize I could pass the AVDepthData to a CIImage constructor; however, for testing purposes, I want to make sure my array is populated correctly, and constructing the image manually lets me verify that.)
When I convert the data to a grayscale image, I get weird results. Namely, the image appears in the top half, and the bottom half is distorted (sometimes showing the image twice, other times showing nonsense).
For example:
Expected output (i.e. CIImage(depthData: depthData)):
Actual output (20% of the time):
Actual output (80% of the time):
I started with Apple's sample code and tried to extract the pixel in the CVPixelBuffer.
let depthDataMap: CVPixelBuffer = ...
let width = CVPixelBufferGetWidth(depthDataMap) // 640
let height = CVPixelBufferGetHeight(depthDataMap) // 480
let bytesPerRow = CVPixelBufferGetBytesPerRow(depthDataMap) // 1280
let baseAddress = CVPixelBufferGetBaseAddress(depthDataMap)
assert(kCVPixelFormatType_DepthFloat16 == CVPixelBufferGetPixelFormatType(depthDataMap))
let byteBuffer = unsafeBitCast(baseAddress, to: UnsafeMutablePointer<Float16>.self)
var pixels = [Float]()
for row in 0..<height {
    for col in 0..<width {
        let byteBufferIndex = col + row * bytesPerRow
        let distance = byteBuffer[byteBufferIndex]
        pixels += [distance]
    }
}
// TODO: render pixels as a grayscale image
Any idea what is wrong here?
TL;DR
You should always unwrap the result of CVPixelBufferGetBaseAddress so that you don't miss important warnings.
It turns out the problem is how the values in byteBuffer are accessed. If, instead of using unsafeBitCast(), you use the method Apple uses in its example (assumingMemoryBound), you get the correct results.
Although it looks like:
... should behave the same as:
... the two are in fact very different, with the former producing the bad results mentioned above, and the latter producing good results.
The final (fixed) code should look like this:
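The listing itself is not reproduced here, so the following is only a minimal sketch of the approach described above, not the original snippet. It assumes a kCVPixelFormatType_DepthFloat16 buffer and a platform where Swift's Float16 type is available (for example iOS 14 or later); the helper name depthValues(from:) is hypothetical.

```swift
import CoreVideo

// Sketch only: `depthValues(from:)` is a hypothetical helper name.
// Assumes a DepthFloat16 pixel buffer and a platform with Float16 support.
@available(iOS 14.0, macOS 11.0, *)
func depthValues(from depthDataMap: CVPixelBuffer) -> [Float] {
    precondition(CVPixelBufferGetPixelFormatType(depthDataMap) == kCVPixelFormatType_DepthFloat16)

    // The base address is only valid while the buffer is locked.
    CVPixelBufferLockBaseAddress(depthDataMap, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(depthDataMap, .readOnly) }

    // Unwrap the optional pointer instead of bit-casting it.
    guard let baseAddress = CVPixelBufferGetBaseAddress(depthDataMap) else { return [] }

    let width = CVPixelBufferGetWidth(depthDataMap)
    let height = CVPixelBufferGetHeight(depthDataMap)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(depthDataMap)

    var pixels = [Float]()
    pixels.reserveCapacity(width * height)
    for row in 0..<height {
        // baseAddress is a raw pointer, so this offset is measured in bytes.
        let rowData = (baseAddress + row * bytesPerRow).assumingMemoryBound(to: Float16.self)
        for col in 0..<width {
            pixels.append(Float(rowData[col]))
        }
    }
    return pixels
}
```

The row offset is applied to the raw pointer before it is viewed as Float16, so bytesPerRow is used in the unit it is reported in.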
I'm actually not sure why this is the case because we know:
This seems to imply that there are no extra bytes at the end of a row, and we should be able to read it as one giant array.
If anyone knows why the former fails, please share!
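One possible reading of the numbers in the question, offered as a sketch rather than a confirmed diagnosis: bytesPerRow (1280) is a byte count, while subscripting an UnsafeMutablePointer<Float16> advances in 2-byte elements. An index of col + row * bytesPerRow on the typed pointer would therefore skip two rows per step, which fits an image squeezed into the top half followed by out-of-bounds reads. The example below uses a tiny UInt16 array as a stand-in for the depth buffer; readGrid and demoGrid are hypothetical names.

```swift
// Hypothetical helper: reads a width x height grid from a buffer whose rows
// start `bytesPerRow` bytes apart (rows may end with padding).
func readGrid<T>(_ base: UnsafeRawPointer, width: Int, height: Int,
                 bytesPerRow: Int, as _: T.Type) -> [T] {
    var values = [T]()
    values.reserveCapacity(width * height)
    for row in 0..<height {
        // Offset the raw pointer in bytes first, then view the row as T.
        let rowStart = (base + row * bytesPerRow).assumingMemoryBound(to: T.self)
        for col in 0..<width {
            values.append(rowStart[col])
        }
    }
    return values
}

// Stand-in data: 2 rows of 3 UInt16 values, 8 bytes per row
// (6 bytes of data plus one 2-byte padding element, 0xFFFF).
func demoGrid() -> [UInt16] {
    let storage: [UInt16] = [10, 11, 12, 0xFFFF, 20, 21, 22, 0xFFFF]
    return storage.withUnsafeBytes { raw in
        readGrid(raw.baseAddress!, width: 3, height: 2, bytesPerRow: 8, as: UInt16.self)
    }
}
```

If you prefer to keep a typed pointer, the equivalent index would be col + row * (bytesPerRow / MemoryLayout<T>.stride), since the row stride then has to be expressed in elements rather than bytes.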
Update:
If you force unwrap the result of CVPixelBufferGetBaseAddress:
... things start to make a bit more sense.
Namely, you will see a warning on this line:
I guess the results I was seeing were related to the "undefined behavior" warning.
The lesson, therefore, is that you should always unwrap the result of CVPixelBufferGetBaseAddress before attempting to use it (e.g. in unsafeBitCast).