I was able to identify squares from a images using VNDetectRectanglesRequest. Now I want those rectangles to store as separate images (UIImage or cgImage). Below is what I tried.
let rectanglesDetection = VNDetectRectanglesRequest { request, error in
rectangles = request.results as! [VNRectangleObservation]
rectangles.sort{$0.boundingBox.origin.y > $1.boundingBox.origin.y}
for rectangle in rectangles {
let rect = rectangle.boundingBox
let imageRef = cgImage.cropping(to: rect)
let image = UIImage(cgImage: imageRef!, scale: image!.scale, orientation: image!.imageOrientation)
checkBoxImages.append(image)
}
Can anybody point out what's wrong or what should be the best approach?
Update 1
At this stage, I'm testing with an image that I added to the assets.
With this image I get 7 rectangles as observations as each for each cell and one for the table margin.
My task is to identify the text inside in each rectangle and my approach is to send VNRecognizeTextRequest for each rectangle that has been identified. My real scenario is little complicated than this but I want to at least achieve this before going forward.
Update 2
for rectangle in rectangles {
let trueX = rectangle.boundingBox.minX * image!.size.width
let trueY = rectangle.boundingBox.minY * image!.size.height
let width = rectangle.boundingBox.width * image!.size.width
let height = rectangle.boundingBox.height * image!.size.height
print("x = " , trueX , " y = " , trueY , " width = " , width , " height = " , height)
let cropZone = CGRect(x: trueX, y: trueY, width: width, height: height)
guard let cutImageRef: CGImage = image?.cgImage?.cropping(to:cropZone)
else {
return
}
let croppedImage: UIImage = UIImage(cgImage: cutImageRef)
croppedImages.append(croppedImage)
}
My image width and height is
width = 406.0 height = 368.0
I've taken my debug interface for you to get a proper understand.
As @Lasse mentioned, this is my actual issue with screenshots.


This is just a guess since you didn't state what the actual problem is, but probably you're getting a zero-sized image for each VNRectangleObservation.
The reason is: Vision uses a normalized coordinate space from 0.0 to 1.0 with lower left origin.
So in order to get the correct rectangle of your original image, you need to convert the rect from Normalized Space to Image Space. Luckily there is VNImageRectForNormalizedRect(::_:) to do just that.