I'm trying to get the bounding boxes of the contours generated by the Vision framework in Swift. In OpenCV in Python I can do this like:

```python
import cv2

# `thresh` is a binarized (thresholded) grayscale image
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)
    # do stuff with each contour's bounding box
```
However, there doesn't seem to be a simple way of doing this in Swift.
Here is my current code (I know the error handling is atrocious; that shouldn't be relevant here):
```swift
import AppKit
import Cocoa
import CoreImage
import CoreImage.CIFilterBuiltins
import Foundation
import Vision

let context = CIContext()

if let sourceImage = NSImage(named: "example3.png") {
    let inputImage = CIImage(cgImage: sourceImage.cgImage(forProposedRect: nil, context: nil, hints: nil)!)

    // Light blur to reduce noise before thresholding
    let noiseReductionFilter = CIFilter.gaussianBlur()
    noiseReductionFilter.radius = 1
    noiseReductionFilter.inputImage = inputImage
    let blurred = noiseReductionFilter.outputImage!.cropped(to: inputImage.extent)

    // Otsu threshold to binarize the image
    let thresholdFilter = CIFilter.colorThresholdOtsu()
    thresholdFilter.inputImage = blurred
    let threshold = thresholdFilter.outputImage!

    let contourRequest = VNDetectContoursRequest()
    contourRequest.detectsDarkOnLight = true
    contourRequest.maximumImageDimension = 512

    let requestHandler = VNImageRequestHandler(ciImage: threshold, options: [:])
    try requestHandler.perform([contourRequest])

    let contoursObservation = contourRequest.results?.first as VNContoursObservation?
    print(contoursObservation!.contourCount)

    // Collect the bounding boxes of very tall, narrow contours (the brackets)
    var brackets = [CGRect]()
    for i in 0..<contoursObservation!.contourCount {
        let contour = try! contoursObservation?.contour(at: i)
        let boundingRect = contour!.normalizedPath.boundingBox
        print(boundingRect.width)
        print(boundingRect.height)
        print("aspect ratio", boundingRect.width / boundingRect.height)
        if boundingRect.width / boundingRect.height < 0.2 {
            brackets.append(boundingRect)
        }
    }

    for rect in brackets {
        // rect is still in normalized (0...1) coordinates here
        let cropped = inputImage.cropped(to: rect)
    }
}
```
I was expecting this to give me an image cropped to the bounding box of each contour whose aspect ratio was less than 0.2 (two contours match). However, it produced a 1x1 rect. I think the issue is that the bounding box comes from the normalized path, but there's no simple way of "un-normalizing" it back to the pixel coordinates used by sourceImage. Additionally, I had to compute the aspect ratio by hand because the built-in aspect ratio gave me the exact same value for every contour.
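For reference, this is roughly what I imagine the "un-normalizing" step should look like using `VNImageRectForNormalizedRect` (a sketch, not tested; it assumes `inputImage` is the original CIImage, `rect` is a normalized bounding box from a contour's path, and that no Y-flip is needed since both Vision and CIImage use a lower-left origin):

```swift
import Vision

// Convert a normalized (0...1) bounding box back to pixel coordinates
// using the original image's dimensions, then crop.
let width = Int(inputImage.extent.width)
let height = Int(inputImage.extent.height)
let pixelRect = VNImageRectForNormalizedRect(rect, width, height)
let cropped = inputImage.cropped(to: pixelRect)
```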
Here is the example image I've used (the end goal is to go from a screenshot of a matrix to an actual matrix; I got a proof of concept working in Python):
I'm all ears if someone has a better pipeline than the one I've come up with in Python, which is:
- get all the contours
- find the two brackets (the only super-tall contours) and crop the image to the region around them
- get all the contours of the cropped image
- run OCR on each bounding box to get one character per box
- merge adjacent characters into single numbers (e.g. negative numbers, multi-digit numbers)
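To illustrate the last step, here's a minimal sketch of how I'd merge character boxes in Swift (the function name and `maxGap` threshold are made up; it assumes the boxes are all on the same text row):

```swift
import CoreGraphics

// Sketch of the merge step: group character boxes whose horizontal gap
// is small into one box per number. `maxGap` is a placeholder threshold
// in the same units as the rects.
func mergeBoxes(_ boxes: [CGRect], maxGap: CGFloat = 5) -> [CGRect] {
    let sorted = boxes.sorted { $0.minX < $1.minX }
    var merged = [CGRect]()
    for box in sorted {
        if let last = merged.last, box.minX - last.maxX <= maxGap {
            // Close enough to the previous box: fuse them into one rect
            merged[merged.count - 1] = last.union(box)
        } else {
            merged.append(box)
        }
    }
    return merged
}
```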
I can also provide my python code if that'd be helpful.
