Do I have to lock a CVPixelBuffer produced from AVCaptureVideoDataOutput


I have an AVCaptureVideoDataOutput producing CMSampleBuffer instances that are passed into my AVCaptureVideoDataOutputSampleBufferDelegate function. I want to efficiently convert the pixel buffers into CGImage instances for use elsewhere in my app.

I have to be careful not to retain any references to these pixel buffers, or the capture session will start dropping frames for reason OutOfBuffers. Also, if the conversion takes too long, then frames will be discarded for reason FrameWasLate.
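(For reference, the drop reason can be read off the dropped sample buffer in the didDrop delegate callback. This is only a minimal sketch; the CMGetAttachment lookup is the standard way to get the reason, and the logging is purely illustrative.)

// Sketch: inspect why the session dropped a frame (e.g. OutOfBuffers, FrameWasLate)
public func captureOutput(_ output: AVCaptureOutput, didDrop sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
    // The drop reason is attached to the (data-less) dropped sample buffer
    if let reason = CMGetAttachment(sampleBuffer,
                                    key: kCMSampleBufferAttachmentKey_DroppedFrameReason,
                                    attachmentModeOut: nil) {
        print("Dropped frame, reason: \(reason)")
    }
}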

Previously I tried using a CIContext to render the CGImage, but that proved too slow when capturing above 30 FPS, and I want to capture at 60 FPS. In my testing I got up to about 38 FPS before frames started getting dropped.
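That earlier attempt looked roughly like this (a sketch rather than my exact code; ciContext is assumed to be a single CIContext instance property reused across frames):

// Wrap the pixel buffer and let Core Image produce the CGImage (too slow above ~38 FPS)
guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
let ciImage = CIImage(cvPixelBuffer: imageBuffer)
let cgImage = ciContext.createCGImage(ciImage, from: ciImage.extent)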

Now I am attempting to use a CGContext and the results are better. I'm still dropping frames, but significantly less frequently.

public func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {

    // Capture at 60 FPS but only process at 4 FPS, ignoring all other frames
    let timestamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer)
    guard timestamp - lastTimestamp >= CMTimeMake(value: 1, timescale: 4) else { return }
    lastTimestamp = timestamp

    // Extract pixel buffer
    guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }

    // Lock pixel buffer before accessing base address
    guard kCVReturnSuccess == CVPixelBufferLockBaseAddress(imageBuffer, .readOnly) else { return }
    defer { CVPixelBufferUnlockBaseAddress(imageBuffer, .readOnly) }

    // Use CGContext to render CGImage from pixel buffer
    // (cgColorSpace and cgBitmapInfo are properties defined elsewhere to match the buffer's pixel format)
    guard let cgimage = CGContext(data: CVPixelBufferGetBaseAddress(imageBuffer),
                                  width: CVPixelBufferGetWidth(imageBuffer),
                                  height: CVPixelBufferGetHeight(imageBuffer),
                                  bitsPerComponent: 8,
                                  bytesPerRow: CVPixelBufferGetBytesPerRow(imageBuffer),
                                  space: cgColorSpace,
                                  bitmapInfo: cgBitmapInfo)?.makeImage() else { return }

    // Do something with cgimage...
}

I was curious and next tried this without locking the pixel buffer's base address. When I comment out those two lines, I stop dropping frames entirely, with no noticeable repercussions. It seems the lock was taking long enough that frames were being dropped; removing it significantly reduced the function's running time and allowed every frame to be handled.

Apple's documentation explicitly states that calling CVPixelBufferLockBaseAddress is required prior to CVPixelBufferGetBaseAddress. However, because the AVCaptureVideoDataOutput is using a pre-defined pool of memory for its sample buffers, perhaps the base address isn't subject to change as it normally would be.

Can I skip locking the base address here? What is the worst that could happen if I don't lock the base address in this specific scenario?


There are 2 answers

willbattel (BEST ANSWER)

This question was ill-founded from the start because I neglected to check the actual image produced when skipping the lock. As stated in the question, when I lock the base address prior to initializing the CGContext, the makeImage render takes approximately 17 milliseconds. If I skip the locking and go straight to the CGContext, then makeImage takes about 0.3 milliseconds.

I had wrongly interpreted this speed difference to mean that the rendering was being accelerated by the GPU in the latter case. However, what was actually happening was that CVPixelBufferGetBaseAddress was returning nil and makeImage was rendering no data, producing a purely white CGImage.

So, in short, the answer to my question is yes: the base address must be locked.

Now I am off to figure out how to speed this up. I am capturing at 60 FPS, which means I want my rendering to take less than roughly 16 milliseconds so that I can drop the CMSampleBuffer reference before the next one arrives.
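For anyone reproducing these numbers, a simple way to time the render is plain wall-clock measurement around makeImage (cgContext here stands for the CGContext built from the pixel buffer, as in the question):

let start = CFAbsoluteTimeGetCurrent()
let cgimage = cgContext?.makeImage()
// At 60 FPS the per-frame budget is about 16.7 ms, so this needs to stay well under that
print("makeImage took \((CFAbsoluteTimeGetCurrent() - start) * 1000) ms")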

Frank Rupprecht

From what you describe you really don't need to convert to CGImage at all. You can do all processing within a Core Image + Vision pipeline:

  1. Create a CIImage from the camera's pixel buffer with CIImage(cvPixelBuffer:).
  2. Apply filters to the CIImage.
  3. Use a CIContext to render the filtered image into a new CVPixelBuffer. For best performance use a CVPixelBufferPool for creating those target pixel buffers.
  4. Pass the pixel buffer to Vision for analysis.
  5. If Vision decides to keep the image, use the same CIContext to render the pixel buffer (wrapped into a CIImage again, as in step 1) into a target format of your choice, for instance with context.writeHEIFRepresentation(of:...).

Only in the end will the image data be transferred to the CPU side.
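A rough sketch of what such a pipeline could look like (assuming a BGRA pixel format; the noise-reduction filter and the face-detection request are only placeholders to keep the example self-contained, not something the answer prescribes):

import CoreImage
import CoreVideo
import Foundation
import Vision

final class FramePipeline {

    // One CIContext, reused for every frame
    private let context = CIContext()
    private var pool: CVPixelBufferPool?

    func process(_ cameraBuffer: CVPixelBuffer) {
        // 1. Wrap the camera's pixel buffer without copying
        let input = CIImage(cvPixelBuffer: cameraBuffer)

        // 2. Apply filters (noise reduction is just a placeholder)
        let filtered = input.applyingFilter("CINoiseReduction")

        // 3. Render the filtered image into a pooled target pixel buffer
        let width = CVPixelBufferGetWidth(cameraBuffer)
        let height = CVPixelBufferGetHeight(cameraBuffer)
        if pool == nil {
            let attrs: [CFString: Any] = [kCVPixelBufferPixelFormatTypeKey: kCVPixelFormatType_32BGRA,
                                          kCVPixelBufferWidthKey: width,
                                          kCVPixelBufferHeightKey: height]
            CVPixelBufferPoolCreate(nil, nil, attrs as CFDictionary, &pool)
        }
        guard let pool = pool else { return }
        var target: CVPixelBuffer?
        CVPixelBufferPoolCreatePixelBuffer(nil, pool, &target)
        guard let rendered = target else { return }
        context.render(filtered, to: rendered)

        // 4. Pass the rendered buffer to Vision (face detection is just a placeholder request)
        let request = VNDetectFaceRectanglesRequest()
        try? VNImageRequestHandler(cvPixelBuffer: rendered, options: [:]).perform([request])

        // 5. Only if Vision found something, write the image out (first CPU-side transfer)
        guard let results = request.results, !results.isEmpty else { return }
        let url = FileManager.default.temporaryDirectory.appendingPathComponent("frame.heif")
        try? context.writeHEIFRepresentation(of: CIImage(cvPixelBuffer: rendered),
                                             to: url,
                                             format: .BGRA8,
                                             colorSpace: CGColorSpace(name: CGColorSpace.sRGB)!)
    }
}

The single CIContext is reused for every frame, and, as noted above, the image data is only transferred to the CPU side at the final write.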