How do I get bounding boxes for VNContours?

I'm trying to get the bounding boxes of contours generated with the Vision framework in Swift. In Python with OpenCV I can do it like this:

contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)
    # do stuff with contours

However, there doesn't seem to be a simple way of doing this in Swift.

Here is my current code (I know the error handling is atrocious, but that shouldn't be relevant here):

import AppKit
import Cocoa
import CoreImage
import CoreImage.CIFilterBuiltins
import Vision
import Foundation
let context = CIContext()



if let sourceImage = NSImage.init(named:"example3.png") {
    let inputImage = CIImage.init(cgImage: (sourceImage.cgImage(forProposedRect: nil, context: nil, hints: nil)!))
    
    
    let noiseReductionFilter = CIFilter.gaussianBlur()
    noiseReductionFilter.radius = 1
    noiseReductionFilter.inputImage = inputImage
    // cropped(to:) returns a new image, so keep the result instead of discarding it
    let blurred = noiseReductionFilter.outputImage!.cropped(to: inputImage.extent)
    
    let thresholdFilter = CIFilter.colorThresholdOtsu()
    thresholdFilter.inputImage = blurred
    let threshold = thresholdFilter.outputImage!
    
    let contourRequest = VNDetectContoursRequest.init()
    
    contourRequest.detectsDarkOnLight = true
    contourRequest.maximumImageDimension = 512
    
    let requestHandler = VNImageRequestHandler.init(ciImage:threshold, options: [:])
    
    try requestHandler.perform([contourRequest])
    let contoursObservation = contourRequest.results?.first as VNContoursObservation?
    print(contoursObservation!.contourCount)
    
    var brackets = [CGRect]()
    for i in 0..<contoursObservation!.contourCount {
        let contour = try! contoursObservation?.contour(at:i)
        let boundingRect = contour!.normalizedPath.boundingBox
        print(boundingRect.width)
        print(boundingRect.height)
        print("aspect ratio", (boundingRect.width/boundingRect.height))
        if (boundingRect.width/boundingRect.height) < 0.2 {
            brackets.append(boundingRect)
        }
    }
    for rect in brackets {
        let cropped = inputImage.cropped(to: rect)
    }

}

I was expecting this to give me an image cropped to the bounding box of each contour whose aspect ratio (width/height) was less than 0.2, which should be two contours. However, it produced a 1x1 rect instead. I think the issue is the normalized path, but I can't find a simple way of "un-normalizing" the path back to the coordinates used by sourceImage. Additionally, I had to compute the aspect ratio by hand like this because the built-in one gave me exactly the same value for every contour.
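One possible way to map a normalized bounding box back into image coordinates is sketched below. It assumes the rect comes from `normalizedPath.boundingBox`, so its values lie in 0...1 with a lower-left origin, which matches the coordinate space of `CIImage`. The name `imageRect(forNormalized:in:)` is a hypothetical helper, not part of Vision. Vision also provides `VNImageRectForNormalizedRect(_:_:_:)`, which does the same kind of scaling given a pixel width and height.

```swift
import Foundation

/// Hypothetical helper (not a Vision API): scales a rect from normalized
/// 0...1 space into the coordinate space described by `extent`.
/// Assumes both spaces share a lower-left origin, as Vision and CIImage do.
func imageRect(forNormalized rect: CGRect, in extent: CGRect) -> CGRect {
    return CGRect(
        x: extent.origin.x + rect.origin.x * extent.width,
        y: extent.origin.y + rect.origin.y * extent.height,
        width: rect.width * extent.width,
        height: rect.height * extent.height
    )
}

// Example with made-up numbers: an 800x200 image and a tall, thin contour box.
let exampleExtent = CGRect(x: 0, y: 0, width: 800, height: 200)
let exampleNormalized = CGRect(x: 0.25, y: 0.125, width: 0.0625, height: 0.75)
let examplePixels = imageRect(forNormalized: exampleNormalized, in: exampleExtent)
```

With the code from the question, this would be used as `inputImage.cropped(to: imageRect(forNormalized: rect, in: inputImage.extent))`. Note that an aspect ratio computed from the normalized box is in normalized units, so for a non-square image it differs from the aspect ratio in pixels; computing it from the converted rect may be the safer choice for the 0.2 test.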

Here is the example image I've used. The end goal is to go from a screenshot of a matrix to an actual matrix; I already have a proof of concept working in Python:

example image

I'm all ears if someone has a better pipeline than what I've come up with in Python, which is:

  1. get all the contours
  2. find the two brackets (only the very tall contours), and crop the image to the region between them
  3. get all the contours of the cropped image
  4. run OCR on each bounding box to get a character for each
  5. merge the characters that belong to the same number (e.g. negative numbers, multi-digit numbers)
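For step 5, a minimal Swift sketch of one way the merging could work is below. It is not the Python code mentioned above: `mergeAdjacentBoxes` and the `maxGap` threshold are hypothetical, and the approach simply assumes that characters of the same number sit close together horizontally and overlap vertically.

```swift
import Foundation

/// Hypothetical helper: merges boxes that overlap vertically and whose
/// horizontal gap is at most `maxGap`, scanning from left to right.
func mergeAdjacentBoxes(_ boxes: [CGRect], maxGap: CGFloat) -> [CGRect] {
    let sorted = boxes.sorted { $0.minX < $1.minX }
    var merged: [CGRect] = []
    for box in sorted {
        // Look for an already merged group that this box is close enough to.
        let match = merged.firstIndex { group in
            box.minX - group.maxX <= maxGap      // close enough horizontally
                && box.minY < group.maxY         // and overlapping vertically,
                && group.minY < box.maxY         // so other rows are not joined
        }
        if let index = match {
            merged[index] = merged[index].union(box)
        } else {
            merged.append(box)
        }
    }
    return merged
}

// Example with made-up boxes: a minus sign, two digits, and a distant digit.
let characterBoxes = [
    CGRect(x: 0, y: 4, width: 4, height: 2),    // "-"
    CGRect(x: 5, y: 0, width: 4, height: 10),   // first digit
    CGRect(x: 10, y: 0, width: 4, height: 10),  // second digit
    CGRect(x: 30, y: 0, width: 4, height: 10),  // next matrix entry
]
let tokenBoxes = mergeAdjacentBoxes(characterBoxes, maxGap: 2)
```

Here the first three boxes end up in one group and the distant digit stays on its own. A suitable `maxGap` would depend on the font size and spacing in the screenshot, so it would likely need tuning.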

I can also provide my Python code if that would be helpful.
