Using PyPDF Rect as area input to Tabula

I'm trying to use the coordinates of the Rect object that PyPDF gives me as the area input to Tabula, but the coordinates don't seem to match, and I can't find any information about the units of measurement these libraries use.

For example, I can find the same text rectangle in PyPDF and Tabula using the following coordinates:

PyPDF: Upper left = x: 208, y: 574; Lower right = x: 494, y: 585

Tabula: Upper left = x: 200, y: 240; Lower right = x: 510, y: 270

But I can't find a way to convert a PyPDF rectangle to a Tabula area. Can anyone help me?

There is 1 answer

Wallys Ferreira On

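I figured it out. It seems both libraries use the same unit of measurement; the difference is that PyPDF measures y from the bottom of the page while Tabula measures it from the top, so I just needed to invert the y-axis using the page height.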
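
A minimal sketch of that conversion, in case it helps someone else. The function name `rect_to_tabula_area` is my own, not part of either library. It assumes the page is not rotated, that its media box starts at (0, 0), and that the result is passed to Tabula in its `[top, left, bottom, right]` order. The page height of 842 is only an assumption (an A4 page); in practice you would read it from the page itself.

```python
def rect_to_tabula_area(x0, y0, x1, y1, page_height):
    """Convert a PyPDF-style rectangle (origin at bottom-left, y grows upward)
    into a Tabula-style area [top, left, bottom, right] (origin at top-left,
    y grows downward). Hypothetical helper, not part of PyPDF or Tabula."""
    left, right = min(x0, x1), max(x0, x1)
    # The larger PDF y value is the edge closest to the top of the page.
    top = page_height - max(y0, y1)
    bottom = page_height - min(y0, y1)
    return [top, left, bottom, right]


# Coordinates from the question; 842 is an assumed page height (A4 in points).
area = rect_to_tabula_area(208, 574, 494, 585, page_height=842)
print(area)  # [257, 208, 268, 494]
```

With that assumed height, the result falls inside the Tabula rectangle from the question (y between 240 and 270), which matches what I saw. The list should then be usable as the `area` argument of `tabula.read_pdf`, with the page height taken from something like `page.mediabox.height` in PyPDF.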