I am trying to read a PDF file. The callbacks below fire and print their messages, but I can't get any text out of the PDF.
    let pdfBundlePath = Bundle.main.path(forResource: "sample", ofType: "pdf")
    let pdfURL = URL.init(fileURLWithPath: pdfBundlePath!)
    let pdf = CGPDFDocument(pdfURL as CFURL)        
    let operatorTableRef = CGPDFOperatorTableCreate()
    CGPDFOperatorTableSetCallback(operatorTableRef!, "BT") { (scanner, info) in
        print("Begin text object")
    }
    CGPDFOperatorTableSetCallback(operatorTableRef!, "ET") { (scanner, info) in
        print("End text object")
    }
    CGPDFOperatorTableSetCallback(operatorTableRef!, "Tf") { (scanner, info) in
        print("Select font")
    }
    CGPDFOperatorTableSetCallback(operatorTableRef!, "Tj") { (scanner, info) in
        print("Show text")
    }
    CGPDFOperatorTableSetCallback(operatorTableRef!, "TJ") { (scanner, info) in
        print("Show text, allowing individual glyph positioning")
    }
    let page = pdf!.page(at: 1)
    let stream = CGPDFContentStreamCreateWithPage(page!)
    let scanner = CGPDFScannerCreate(stream, operatorTableRef, nil)
    CGPDFScannerScan(scanner)
    CGPDFScannerRelease(scanner)
    CGPDFContentStreamRelease(stream)
Output:

    Begin text object
    Select font
    Show text, allowing individual glyph positioning
    End text object
    // the same output repeats at least 10 more times

But I am not sure how to get the actual string out of this. Any suggestions would be appreciated.
 
                        
I have a PDF containing the text "hello, world" (created with Export as PDF from TextEdit).
With a callback function for the TJ operator that reads the strings from its array operand, the text is printed to the console.
I think this demonstrates that getting the String is possible :-), at least for the Latin alphabet.
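A minimal sketch of what such a TJ callback might look like. The function name `registerTJCallback` is made up for illustration, and the sketch assumes a simple Latin encoding, since `CGPDFStringCopyTextString` does not consult the font. The TJ operand is an array that mixes strings with numeric glyph-position adjustments, so only the string elements are collected.

```swift
import CoreGraphics
import Foundation

// Hypothetical helper: registers a TJ callback that prints the text it finds.
// Assumption: the strings can be decoded without the font's encoding,
// which generally only holds for plain Latin text.
func registerTJCallback(on table: CGPDFOperatorTableRef) {
    CGPDFOperatorTableSetCallback(table, "TJ") { scanner, _ in
        var array: CGPDFArrayRef?
        guard CGPDFScannerPopArray(scanner, &array), let array = array else { return }

        var text = ""
        for index in 0..<CGPDFArrayGetCount(array) {
            var pdfString: CGPDFStringRef?
            // CGPDFArrayGetString returns false for the numeric
            // (positioning) elements, so those are skipped.
            if CGPDFArrayGetString(array, index, &pdfString),
               let pdfString = pdfString,
               let piece = CGPDFStringCopyTextString(pdfString) {
                text += piece as String
            }
        }
        print(text)
    }
}
```

The closure is converted to a C function pointer, so it cannot capture local variables. To collect the text instead of printing it, one option is to pass a pointer to your own storage as the `info` argument of `CGPDFScannerCreate` and read it back from the second closure parameter.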
For the Tj operator, which takes a single string operand, the callback function can be even simpler.
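A minimal sketch of a Tj callback along those lines. The name `registerTjCallback` is hypothetical, and as before the decoding ignores the font, so treat it as a starting point for Latin text only.

```swift
import CoreGraphics
import Foundation

// Hypothetical helper: registers a Tj callback that prints its string operand.
func registerTjCallback(on table: CGPDFOperatorTableRef) {
    CGPDFOperatorTableSetCallback(table, "Tj") { scanner, _ in
        var pdfString: CGPDFStringRef?
        // Tj has exactly one operand: the string to show.
        guard CGPDFScannerPopString(scanner, &pdfString),
              let pdfString = pdfString,
              let text = CGPDFStringCopyTextString(pdfString) else { return }
        print(text as String)
    }
}
```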
WARNING! To show all characters properly, it is necessary to use the font information, but that is a different story. For Latin characters this solution should work as is.
To be able to 'extract' all strings, all text-showing operators (Tj, TJ, ' and ") must be implemented.
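As a rough sketch, the single-string operators could share one helper. The names `printPoppedString` and `registerStringOperators` are invented for this example. In the PDF operand order, the `"` operator takes two spacing numbers followed by the string, so the string is on top of the stack and is popped first.

```swift
import CoreGraphics
import Foundation

// Hypothetical helper: pops one string operand and prints it.
// Assumption: Latin text that decodes without font information.
func printPoppedString(_ scanner: CGPDFScannerRef) {
    var pdfString: CGPDFStringRef?
    if CGPDFScannerPopString(scanner, &pdfString),
       let pdfString = pdfString,
       let text = CGPDFStringCopyTextString(pdfString) {
        print(text as String)
    }
}

// Hypothetical helper: registers the operators that show a single string.
// TJ takes an array of strings and numbers and needs its own callback.
func registerStringOperators(on table: CGPDFOperatorTableRef) {
    // Tj: show a string.
    CGPDFOperatorTableSetCallback(table, "Tj") { scanner, _ in
        printPoppedString(scanner)
    }
    // ': move to the next line, then show a string.
    CGPDFOperatorTableSetCallback(table, "'") { scanner, _ in
        printPoppedString(scanner)
    }
    // ": set word and character spacing, move to the next line, show a string.
    CGPDFOperatorTableSetCallback(table, "\"") { scanner, _ in
        printPoppedString(scanner)
        // Pop the two spacing operands as well; their values are not used here.
        var spacing: CGPDFReal = 0
        _ = CGPDFScannerPopNumber(scanner, &spacing)
        _ = CGPDFScannerPopNumber(scanner, &spacing)
    }
}
```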
UPDATE: Because PDFKit is available on both Apple platforms (on iOS since iOS 11), I suggest using it for text extraction. The process is much more straightforward.
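A minimal sketch of the PDFKit route. The function name `extractText(from:)` is made up for this example; `PDFDocument(url:)` and the `string` property are the PDFKit pieces doing the work.

```swift
import Foundation
import PDFKit

// Hypothetical helper: returns the text of the whole document,
// or nil if the file cannot be opened as a PDF.
func extractText(from url: URL) -> String? {
    guard let document = PDFDocument(url: url) else { return nil }
    // `string` is the text of all pages; use
    // document.page(at: 0)?.string for a single page (pages are 0-based here).
    return document.string
}
```

With the bundled file from the question, usage could look like `Bundle.main.url(forResource: "sample", withExtension: "pdf").flatMap(extractText(from:))`.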