Discrepancy between PDF cropbox and SVG created from a PDF page

I am trying to extract the background image of a PDF page to an SVG (using the xpdf library). The problem I am facing is that the PDF contains additional images/graphics (presumably outside the cropbox) that are not rendered by PDF readers, but that do appear in the corresponding SVG.

I tried setting the viewBox attribute of the SVG to the cropbox bounds of that PDF page, but the resulting SVG still displays some of the graphics objects that PDF readers do not render. I also tried adding a clip path to the SVG, a rectangular clipping region with bounds corresponding to the PDF cropbox, but this did not eliminate all of the additional graphics elements either.

Any idea what the problem could be? What is the right way to carry the PDF cropbox over to SVG?

Btw, the SVGs generated in both cases (the viewBox and clipping region approaches) were fairly close in dimensions to the viewable area of the PDF page, and the additional elements were seen only close to the edges. Is it that the cropbox dimensions obtained from the PDF should not be used directly in the SVG?

There is 1 answer

so2 On BEST ANSWER

It turns out the problem was that my code did not transform the PDF cropbox (as given by xpdf) to user coordinates using the CTM, the current transformation matrix, which is also obtainable through xpdf. After applying the transformation, the resulting SVG matches the rendered portion of the PDF page.
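For anyone hitting the same thing, here is a minimal sketch of the idea in Python rather than the actual xpdf code. It assumes the CTM is available as six numbers `[a, b, c, d, e, f]` in the usual PDF convention (x' = a*x + c*y + e, y' = b*x + d*y + f) and that the cropbox is given as `(x1, y1, x2, y2)`. The function names `apply_ctm`, `transform_box` and `svg_viewbox` are made up for illustration and are not part of xpdf.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]


def apply_ctm(ctm: Sequence[float], x: float, y: float) -> Tuple[float, float]:
    """Map one point through a PDF-style affine matrix [a, b, c, d, e, f]."""
    a, b, c, d, e, f = ctm
    return (a * x + c * y + e, b * x + d * y + f)


def transform_box(ctm: Sequence[float], box: Box) -> Box:
    """Transform all four corners of a box and return the axis-aligned bounds.

    All four corners are used (not just two) so that a rotating or
    flipping matrix still yields correct min/max values.
    """
    x1, y1, x2, y2 = box
    corners = [(x1, y1), (x1, y2), (x2, y1), (x2, y2)]
    points = [apply_ctm(ctm, x, y) for x, y in corners]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def svg_viewbox(box: Box) -> str:
    """Format transformed bounds as an SVG viewBox value: min-x min-y width height."""
    x_min, y_min, x_max, y_max = box
    return f"{x_min:g} {y_min:g} {x_max - x_min:g} {y_max - y_min:g}"


if __name__ == "__main__":
    # Hypothetical values: a matrix that flips the y axis on a 792pt-high page,
    # and a cropbox slightly inset from the page edges.
    ctm = [1, 0, 0, -1, 0, 792]
    cropbox = (10, 20, 600, 780)
    print(svg_viewbox(transform_box(ctm, cropbox)))
```

The same transformed bounds can be used for a rectangular clip path instead of (or in addition to) the viewBox. Using the untransformed cropbox values directly is what left the stray graphics near the edges in my case.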