How to extract table data from an image using Vision Framework in iOS?

Question

Asked by Vidhya Sri on 13 May 2021 at 08:15 · 647 views

With the iOS Vision framework, I am able to perform OCR and fetch the recognized text from an image using `VNRecognizedTextObservation`.

Now let's say I have an image that contains a paragraph of text along with a table. The table has many columns, each with associated rows (see the image below). Is it possible to recognize a particular column's key and values from the table using Vision?

For example, I want to fetch only the 2014 Retail Sales numbers from the image below using Vision. How can I do this? Can Vision and CoreML be used together for this?

There is 1 answer

**Surajkumbhar904** · Answer 1 · 2024-02-09T14:01:36+00:00

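Yes, it is possible using Vision. Run a `VNRecognizeTextRequest` and convert each observation's normalized bounding box to image coordinates: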

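    // Requires `import UIKit` and `import Vision` at the top of the file.
    guard let cgImage = image.cgImage else { return }
    // `.right` is hard-coded here; pass the orientation that matches your image.
    let imageRequestHandler = VNImageRequestHandler(cgImage: cgImage, orientation: .right)

    // Size in pixels, taken from `cgImage`; this assumes the image has already been rotated.
    let size = CGSize(width: cgImage.width, height: cgImage.height)
    let bounds = CGRect(origin: .zero, size: size)

    // Create a new request to recognize text.
    let request = VNRecognizeTextRequest { [self] request, error in
        guard
            let results = request.results as? [VNRecognizedTextObservation],
            error == nil
        else { return }

        // Bounding boxes of all recognized text, converted to pixel coordinates.
        let rects = results.map {
            convert(boundingBox: $0.boundingBox, to: bounds)
        }

        // The column lookup shown further down goes here, inside this completion handler.
    }

    // The completion handler only runs once the request is performed.
    do {
        try imageRequestHandler.perform([request])
    } catch {
        print("Text recognition failed: \(error)")
    }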

    /// Converts a normalized Vision bounding box (origin at the bottom left)
    /// into a rectangle inside `bounds` with the origin at the top left.
    func convert(boundingBox: CGRect, to bounds: CGRect) -> CGRect {
        let imageWidth = bounds.width
        let imageHeight = bounds.height

        // Begin with input rect.
        var rect = boundingBox

        // Reposition origin, flipping the y-axis.
        rect.origin.x *= imageWidth
        rect.origin.x += bounds.minX
        rect.origin.y = (1 - rect.maxY) * imageHeight + bounds.minY

        // Rescale normalized coordinates.
        rect.size.width *= imageWidth
        rect.size.height *= imageHeight

        return rect
    }
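
As a quick sanity check of the conversion, here is a minimal sketch that needs only Foundation, so it can run outside an app. It uses the same arithmetic as `convert(boundingBox:to:)`; the image size and the normalized box are made-up values chosen for illustration.

```swift
import Foundation

/// Same arithmetic as the answer's `convert(boundingBox:to:)`:
/// scale a normalized box to `bounds` and flip the y-axis.
func convert(boundingBox: CGRect, to bounds: CGRect) -> CGRect {
    CGRect(
        x: boundingBox.minX * bounds.width + bounds.minX,
        y: (1 - boundingBox.maxY) * bounds.height + bounds.minY,
        width: boundingBox.width * bounds.width,
        height: boundingBox.height * bounds.height
    )
}

// Hypothetical 1000 x 800 pixel image.
let imageBounds = CGRect(x: 0, y: 0, width: 1000, height: 800)

// Hypothetical normalized box as Vision would report it (origin at the bottom left).
let normalized = CGRect(x: 0.25, y: 0.5, width: 0.5, height: 0.25)

let pixelRect = convert(boundingBox: normalized, to: imageBounds)
print(pixelRect.origin.x, pixelRect.origin.y, pixelRect.size.width, pixelRect.size.height)
// prints "250.0 200.0 500.0 200.0"
```

The box's top edge sits at 0.75 of the height measured from the bottom, which becomes 0.25 of the height (200 px) measured from the top.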

You can then find the bounding box of the column's key (its header) and extend the height of that box downward so that it covers the values in the column:

        // Find the column header and stretch its box down past the bottom of the image.
        var targetBoundingBox: CGRect?
        let targetWord = "Retailer"

        for result in results {
            if let candidate = result.topCandidates(1).first,
               candidate.string.lowercased().contains(targetWord.lowercased()) {
                var box = convert(boundingBox: result.boundingBox, to: bounds)
                // Adding the full image height guarantees the box reaches past the last row.
                box.size.height += bounds.maxY
                targetBoundingBox = box
                break
            }
        }

        if let targetBoundingBox = targetBoundingBox {
            print("Bounding box of '\(targetWord)': \(targetBoundingBox)")

            // Collect every recognized string whose box overlaps the column strip
            // (this includes the header itself).
            var textInsideTargetBox = ""
            for result in results {
                let boundingBox = convert(boundingBox: result.boundingBox, to: bounds)
                if targetBoundingBox.intersects(boundingBox),
                   let text = result.topCandidates(1).first?.string {
                    textInsideTargetBox += "\(text) "
                }
            }
            print("Text in column: \(textInsideTargetBox)")

            // Draw the column strip on top of the image for visual verification.
            let format = UIGraphicsImageRendererFormat()
            format.scale = 1
            let final = UIGraphicsImageRenderer(bounds: bounds, format: format).image { _ in
                image.draw(in: bounds)
                UIColor.green.setStroke()
                let path = UIBezierPath(rect: targetBoundingBox)
                path.lineWidth = 9
                path.stroke()
            }
            // `resultImage` is presumably a UIImageView; UI updates belong on the main queue.
            DispatchQueue.main.async { [self] in
                resultImage.image = final
            }
        } else {
            print("Bounding box of '\(targetWord)' not found")
        }
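
The `intersects` test above also picks up any text that merely overlaps the column strip, such as a wide cell from a neighbouring column. A stricter variant, sketched here under assumptions, keeps only boxes that lie below the header and whose horizontal centre falls within the header's x-range, then orders them top to bottom. `TextBox`, `columnValues(under:in:tolerance:)`, and the sample data are hypothetical names and values for illustration, not part of Vision; the sketch also assumes each table cell is reported as its own observation and that the rectangles have already been converted to top-left-origin image coordinates.

```swift
import Foundation

/// Hypothetical value type: a recognized string plus its rectangle
/// in top-left-origin image coordinates.
struct TextBox {
    let text: String
    let rect: CGRect
}

/// Returns the texts located below the first box whose text contains `header`,
/// keeping only boxes whose horizontal centre lies within the header's
/// x-range (widened by `tolerance` on each side), ordered top to bottom.
func columnValues(under header: String, in boxes: [TextBox], tolerance: CGFloat = 0) -> [String] {
    guard let headerBox = boxes.first(where: {
        $0.text.lowercased().contains(header.lowercased())
    }) else { return [] }

    let xRange = (headerBox.rect.minX - tolerance)...(headerBox.rect.maxX + tolerance)

    return boxes
        .filter { $0.rect.minY >= headerBox.rect.maxY && xRange.contains($0.rect.midX) }
        .sorted { $0.rect.minY < $1.rect.minY }
        .map { $0.text }
}

// Made-up sample data standing in for converted Vision results.
let boxes = [
    TextBox(text: "Some paragraph text", rect: CGRect(x: 20, y: 40, width: 300, height: 20)),
    TextBox(text: "Retailer", rect: CGRect(x: 20, y: 100, width: 120, height: 20)),
    TextBox(text: "2014 Retail Sales", rect: CGRect(x: 200, y: 100, width: 160, height: 20)),
    TextBox(text: "Store B", rect: CGRect(x: 20, y: 160, width: 80, height: 20)),
    TextBox(text: "200", rect: CGRect(x: 250, y: 160, width: 40, height: 20)),
    TextBox(text: "Store A", rect: CGRect(x: 20, y: 130, width: 80, height: 20)),
    TextBox(text: "100", rect: CGRect(x: 250, y: 130, width: 40, height: 20)),
]

let sales = columnValues(under: "2014 Retail Sales", in: boxes)
print(sales) // prints ["100", "200"]
```

In the completion handler, the `TextBox` values could be built from each observation's `topCandidates(1).first?.string` and its converted `boundingBox`. A small positive `tolerance` may help when cell values are wider than, or slightly offset from, the header text.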
