CIDepthBlurEffect: setting inputFocusRect

I can't manage to find the correct way to set the "inputFocusRect" value from a touch point. I'm grabbing the touch point using

@IBAction func imageTapped(_ sender: UITapGestureRecognizer) {
    let touchPoint = sender.location(in: self.imageView)
    imageTapCoordinates = CIVector(cgPoint: touchPoint)
}

I pass that to the CIFilter in a separate blur function.

filter?.setValue(imageTapCoordinates, forKey: "inputFocusRect")

However, it seems like the coordinate systems don't match up! What's the best way of determining the "inputFocusRect"? Apple doesn't have much documentation for it on their site...

Thanks!

1 Answer

Answered by AudioBubble:

Posting this as an answer because of the length and better formatting.

TL;DR:

The likeliest issue seems to be that this filter expects two (or more) inputs: an input image and the output of a CoreML or (more likely) Vision framework facial-recognition pass.

My thought is that inputFocusRect, a CIVector (with flipped coordinates, where the origin is in the lower left), is the region where CoreML or the Vision framework recognized a face within the larger image.
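
To illustrate the coordinate mismatch you're hitting: a UIKit touch point has its origin at the top left of the view, in points, while Core Image's origin is at the bottom left of the image, in pixels. Here's a rough sketch of the conversion - assuming (and these are only my assumptions) that the image view shows the full image edge-to-edge and that an arbitrary 200-pixel box around the tap is what you want; the function name is mine:

import UIKit
import CoreImage

func focusRect(for touchPoint: CGPoint, in imageView: UIImageView, imageExtent: CGRect) -> CIVector {
    // Scale from view points to image pixels (assumes the image fills the view exactly).
    let scaleX = imageExtent.width / imageView.bounds.width
    let scaleY = imageExtent.height / imageView.bounds.height

    // Flip the y-axis: UIKit's origin is top left, Core Image's is bottom left.
    let imageX = touchPoint.x * scaleX
    let imageY = (imageView.bounds.height - touchPoint.y) * scaleY

    // Build an arbitrary 200 x 200 pixel box centered on the tap.
    let boxSize: CGFloat = 200
    let rect = CGRect(x: imageX - boxSize / 2,
                      y: imageY - boxSize / 2,
                      width: boxSize,
                      height: boxSize)
    return CIVector(cgRect: rect)
}

That said, I don't think a touch point is what this input was designed for - see below.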

Explanation:

CIDepthBlurEffect is a new filter available in iOS 11 and macOS 10.13. In fact, it's so new that Apple hasn't yet updated their CI Filter Reference for it.

I had to resort to dumping what I could find. It's rather simple code: create an instance of the filter and print out its attributes.

let filter = CIFilter(name: "CIDepthBlurEffect")
print(filter?.attributes)

Understand, there are better ways to do this - query the device for the available filters (either all of them or a specific category), use a specific attribute to get each filter's display name, and more; a rough sketch of that follows the dump below. My intent here was to provide what I could for this question. The "dump" is a dictionary of type [String: Any] and is unformatted here. Manually formatting it yields this:

[
    "inputRightEyePositions": {
        CIAttributeClass = CIVector;
        CIAttributeDisplayName = "Right Eye Positions";
        CIAttributeType = CIAttributeTypePosition;
    }, 
    "inputCalibrationData": {
        CIAttributeClass = AVCameraCalibrationData;
        CIAttributeDisplayName = CalibrationData;
    }, 
    "inputImage": {
        CIAttributeClass = CIImage;
        CIAttributeDescription = "The image to use as an input image. For filters that also use a background image, this is the foreground image.";
        CIAttributeDisplayName = Image;
        CIAttributeType = CIAttributeTypeImage;
    }, 
    "inputChinPositions": {
        CIAttributeClass = CIVector;
        CIAttributeDisplayName = "Chin Positions";
        CIAttributeType = CIAttributeTypePosition;
    }, 
    "inputLeftEyePositions": {
        CIAttributeClass = CIVector;
        CIAttributeDisplayName = "Left Eye Positions";
        CIAttributeType = CIAttributeTypePosition;
    }, 
    "inputAuxDataMetadata": {
        CIAttributeClass = NSDictionary;
        CIAttributeDisplayName = AuxDataMetadata;
    }, 
    "CIAttributeFilterAvailable_Mac": 10.13, 
    "CIAttributeFilterName": CIDepthBlurEffect,
    "CIAttributeReferenceDocumentation": http://developer.apple.com/library/ios/documentation/GraphicsImaging/Reference/CoreImageFilterReference/index.html#//apple_ref/doc/filter/ci/CIDepthBlurEffect, 
    "inputAperture": {
        CIAttributeClass = NSNumber;
        CIAttributeDefault = 0;
        CIAttributeDisplayName = Aperture;
        CIAttributeMax = 22;
        CIAttributeMin = 0;
        CIAttributeSliderMax = 22;
        CIAttributeSliderMin = 1;
        CIAttributeType = CIAttributeTypeScalar;
    }, 
    "CIAttributeFilterDisplayName": Depth Blur Effect,
    "CIAttributeFilterAvailable_iOS": 11, 
    "inputNosePositions": {
        CIAttributeClass = CIVector;
        CIAttributeDisplayName = "Nose Positions";
        CIAttributeType = CIAttributeTypePosition;
    }, 
    "inputLumaNoiseScale": {
        CIAttributeClass = NSNumber;
        CIAttributeDefault = 0;
        CIAttributeDisplayName = "Luma Noise Scale";
        CIAttributeMax = "0.1";
        CIAttributeMin = 0;
        CIAttributeSliderMax = "0.1";
        CIAttributeSliderMin = 0;
        CIAttributeType = CIAttributeTypeScalar;
    }, 
    "inputScaleFactor": {
        CIAttributeClass = NSNumber;
        CIAttributeDefault = 1;
        CIAttributeDisplayName = "Scale Factor";
        CIAttributeSliderMax = 1;
        CIAttributeSliderMin = 0;
        CIAttributeType = CIAttributeTypeScalar;
    }, 
    "inputFocusRect": {
        CIAttributeClass = CIVector;
        CIAttributeDisplayName = "Focus Rectangle";
        CIAttributeIdentity = "[-8.98847e+307 -8.98847e+307 1.79769e+308 1.79769e+308]";
        CIAttributeType = CIAttributeTypeRectangle;
    }, 
    "inputDisparityImage": {
        CIAttributeClass = CIImage;
        CIAttributeDisplayName = DisparityImage;
    }, 
    "CIAttributeFilterCategories": <__NSArrayI 0x1c46588a0>(
        CICategoryBlur,
        CICategoryVideo,
        CICategoryStillImage,
        CICategoryBuiltIn
    )
]
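
As mentioned above, a tidier way to explore filters is to ask Core Image for the registered filter names in a category and then pull each one's localized display name and attributes. A quick sketch - the category choice here is just an example:

import CoreImage

// Every filter registered in the blur category on this device/OS.
let blurFilters = CIFilter.filterNames(inCategory: kCICategoryBlur)

for name in blurFilters {
    // Human-readable display name for each filter.
    let displayName = CIFilter.localizedName(forFilterName: name) ?? name
    print("\(name): \(displayName)")
}

// Inspect just the inputs of a single filter via its attributes dictionary.
if let depthBlur = CIFilter(name: "CIDepthBlurEffect") {
    for (key, value) in depthBlur.attributes where key.hasPrefix("input") {
        print(key, value)
    }
}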

I count 12 possible inputs - 2 CIImages, 5 CIVectors, 3 NSNumbers, 1 NSDictionary, and 1 AVCameraCalibrationData.

Some of these are pretty easily deciphered... they deal with the camera, image, image metadata, and face components. Some are less so - but maybe someone more deeply into the Vision framework knows more. (If not, maybe that's a good place to look!)

Specifically, my questions are:

  • What exactly is an inputDisparityImage?
  • What are the various facial-feature inputs expecting if they're nil, or if a chin position isn't found?

This isn't an answer by many SO standards and if down-voted I'll happily delete it. But I wanted to give you an idea on how to get at what a filter's attributes are, and try to be of help.

My knowledge of CoreML and the Vision framework is minimal. I've dabbled in both and understand the concept behind them: look at an image (or video), detect/recognize where something is, and (usually) give a confidence factor.

Since the other facial-feature inputs are also CIVectors, I believe inputFocusRect is the rectangle where an entire face has been detected. It would make sense for the Vision framework to provide exactly that output, tightly integrated with this filter.
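
If that guess is right, the rectangle would come from something like the following - purely a sketch of how Vision could produce a value for inputFocusRect, not the filter's documented usage, and the function name is mine. Note that VNImageRectForNormalizedRect converts Vision's normalized, lower-left-origin bounding box into pixel coordinates, which already match Core Image's orientation:

import CoreImage
import Vision

// Detect the first face in a CIImage and express its bounding box as a
// CIVector rectangle in image (pixel) coordinates.
func faceFocusRect(in ciImage: CIImage) -> CIVector? {
    let request = VNDetectFaceRectanglesRequest()
    let handler = VNImageRequestHandler(ciImage: ciImage, options: [:])
    try? handler.perform([request])   // a real app would handle errors and run this off the main thread

    guard let face = (request.results as? [VNFaceObservation])?.first else { return nil }

    // boundingBox is normalized with a lower-left origin; convert to pixels.
    let pixelRect = VNImageRectForNormalizedRect(face.boundingBox,
                                                 Int(ciImage.extent.width),
                                                 Int(ciImage.extent.height))
    return CIVector(cgRect: pixelRect)
}

The result could then be handed straight to filter?.setValue(_:forKey:) for "inputFocusRect".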

Finally, I did try the documentation link within the attributes and it came up empty.