iOS: PDF scanner: get coordinates of text

I am using CGPDFScanner to scan a PDF. Should I use the Td operator to find the positions of text? Could I have an example of how to use this operator to get text positions? Currently I use the Tj and TJ operators to extract the text. Now I would like to know the position of each word on a single page of the PDF. How can I do that?

Thanks

There are 2 answers

user1118321 On

To get the coordinates of the text, you need to keep track of the text matrix as the content stream is processed. See section 5.3.1, "Text-Positioning Operators", of the PDF 1.4 Reference. (I'm not sure whether later versions of the reference number things the same way.) Td is only one of several operators that affect the text matrix and the rest of the text state, so handling Td alone is not enough. Td moves to the start of the next line, offset from the start of the current line by its x and y operands. TD does the same and also sets the leading to the negative of its y operand. T* moves to the start of the next line using the current leading. Tm sets the text matrix (and the text line matrix) directly.
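As a rough illustration of that bookkeeping, here is a minimal sketch in plain C. The `TextState` struct and the `op_*` function names are made up for this example and are not part of Core Graphics. It assumes you call each function from the matching CGPDFScanner operator callback (registered with `CGPDFOperatorTableSetCallback`) after popping the operands with `CGPDFScannerPopNumber`, which returns them in reverse order. It also ignores the current transformation matrix (`cm`) and the advance caused by showing glyphs with Tj/TJ, both of which you would need to account for to get per-word positions in page space.

```c
/* PDF matrix [a b c d e f]; (e, f) is the translation part. */
typedef struct { double a, b, c, d, e, f; } Matrix;

/* Hypothetical tracker state; not a Core Graphics type. */
typedef struct {
    Matrix tm;       /* text matrix */
    Matrix tlm;      /* text line matrix (start of the current line) */
    double leading;  /* set by TL, or by TD as a side effect */
} TextState;

/* BT: begin a text object; both matrices reset to identity. */
void op_BT(TextState *s) {
    Matrix identity = {1, 0, 0, 1, 0, 0};
    s->tm = identity;
    s->tlm = identity;
}

/* Td: move to the start of the next line, offset by (tx, ty) from
   the start of the current line. Equivalent to translate(tx, ty) x Tlm. */
void op_Td(TextState *s, double tx, double ty) {
    Matrix m = s->tlm;
    m.e = tx * s->tlm.a + ty * s->tlm.c + s->tlm.e;
    m.f = tx * s->tlm.b + ty * s->tlm.d + s->tlm.f;
    s->tlm = m;
    s->tm = m;
}

/* TL: set the leading. */
void op_TL(TextState *s, double leading) {
    s->leading = leading;
}

/* TD: like Td, but also sets the leading to -ty. */
void op_TD(TextState *s, double tx, double ty) {
    s->leading = -ty;
    op_Td(s, tx, ty);
}

/* T*: move to the start of the next line using the current leading. */
void op_Tstar(TextState *s) {
    op_Td(s, 0, -s->leading);
}

/* Tm: set the text matrix and text line matrix directly. */
void op_Tm(TextState *s, double a, double b, double c,
           double d, double e, double f) {
    Matrix m = {a, b, c, d, e, f};
    s->tm = m;
    s->tlm = m;
}
```

With this in place, when your Tj or TJ callback fires, `tm.e` and `tm.f` hold the origin of the text about to be shown, in the coordinate system established by the text matrix alone. To locate individual words within one string you would additionally have to advance the position by the glyph widths from the font, which this sketch leaves out.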

Kappe On

Take a look at this library, which searches for and highlights text: https://github.com/KurtCode/PDFKitten/