I am trying to convert a PDF to images using PyMuPDF. The conversion works, but the output image does not have the same size as the input. I want the image to keep the dimensions of the input PDF page.
    import os

    import fitz  # PyMuPDF


    def split_pdf_mu(src):
        doc = fitz.open(src)  # open document
        file_name = os.path.basename(src)
        dest = os.path.expanduser("~") + '/tmp/splitted/'
        os.makedirs(dest, exist_ok=True)  # make sure the output folder exists
        split_paths = []
        for page_index, page in enumerate(doc):  # iterate through the pages
            # zoom = 300/72  # zoom factor
            # mat = fitz.Matrix(zoom, zoom)
            # pix = page.get_pixmap(matrix = mat)
            pix = page.get_pixmap()  # render page to an image
            dest_path = os.path.join(dest, f'{file_name}_page{page_index}.png')
            # pix.save(dest_path)
            pix.pil_save(dest_path, format="PNG", optimize=False)  # needs Pillow
            split_paths.append(dest_path)
        return split_paths
I have tried using a zoom factor, but that doesn't seem to help. Can anyone help me convert a PDF and produce images with the same dimensions as the original PDF document?
Using PyMuPDF, you have two options to control the dimensions of the output image:
1. using the dpi (dots per inch)
Now let's look at what dpi actually means.
In general, PDFs have physical dimensions rather than a pixel resolution (as images do). E.g. a PDF created at the standard A4 paper size measures 8.3 x 11.7 inches.
Now, what you are trying to do is translate this to pixels, so you need a conversion factor. That is what dpi means here. E.g. in the previous example, if we use 90 dpi, our A4-sized PDF will result in an image of size
8.3 * 90 x 11.7 * 90 pixels
that is, an image of 747 x 1053 pixels, thus keeping the original aspect ratio while scaling the dimensions of the PDF to the pixels of an image.
2. using zoom
With zoom we still need to know, based on the input PDF, what the image size will be. It is the same conversion as before, only this time the dpi stays at its default of 72 and the result is additionally multiplied by the zoom factor. With zoom set to 2, our output image will be
8.3 * 72 * 2 x 11.7 * 72 * 2 pixels
which results in an output image of about
1195 x 1685 pixels
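The two options above can be sketched roughly as follows. The helper names (`size_with_dpi`, `size_with_zoom`, `render_both`) are made up for illustration, and the `dpi` keyword of `get_pixmap` is assumed to be available, which is only the case in reasonably recent PyMuPDF releases.

```python
def size_with_dpi(width_in, height_in, dpi):
    """Expected pixel size of a page whose size is given in inches."""
    return round(width_in * dpi), round(height_in * dpi)


def size_with_zoom(width_in, height_in, zoom):
    """Expected pixel size at the 72 dpi default, scaled by zoom."""
    return round(width_in * 72 * zoom), round(height_in * 72 * zoom)


def render_both(pdf_path, page_number=0, dpi=90, zoom=2):
    """Render one page both ways and return the two (width, height) pairs."""
    import fitz  # PyMuPDF; imported here so the helpers above run without it

    with fitz.open(pdf_path) as doc:
        page = doc[page_number]
        pix_dpi = page.get_pixmap(dpi=dpi)  # option 1: dpi keyword
        pix_zoom = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))  # option 2
        return (pix_dpi.width, pix_dpi.height), (pix_zoom.width, pix_zoom.height)


# The A4 example from the text (8.3 x 11.7 inches):
print(size_with_dpi(8.3, 11.7, 90))  # (747, 1053)
print(size_with_zoom(8.3, 11.7, 2))  # (1195, 1685)
```

The sizes of the actual pixmaps may differ from these estimates by a pixel, since a real page is measured in points and the renderer rounds to whole pixels.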
So if you need a specific output image resolution, I would suggest going the dpi way, because you can adjust the dpi to hit very specific output image dimensions.
E.g. if you want to end up with an output image of 1700 x 2200 pixels from a PDF, you can work out the matching dpi by dividing the target size in pixels by the page size in inches.
Rounding that result gives the dpi that gets you as close as possible to the specific pixel dimensions you want.
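As a minimal sketch of that calculation: the helper names below are hypothetical, and the page is assumed to be US Letter size (8.5 x 11 inches), which the 1700 x 2200 target suggests but the text does not state. In PyMuPDF, `page.rect` is measured in points, so dividing by 72 gives inches.

```python
def page_size_inches(page):
    """Page size in inches; page.rect is in points (72 points per inch)."""
    return page.rect.width / 72, page.rect.height / 72


def dpi_for_target(page_w_in, page_h_in, target_w_px, target_h_px):
    """Whole-number dpi that comes closest to the target pixel size."""
    # Use the smaller of the two ratios so neither side overshoots the
    # target when the aspect ratios do not match exactly.
    dpi = min(target_w_px / page_w_in, target_h_px / page_h_in)
    return round(dpi)


# Assumed US Letter page, target of 1700 x 2200 pixels:
print(dpi_for_target(8.5, 11, 1700, 2200))  # 200
```

The resulting value can then be passed on as `page.get_pixmap(dpi=...)`. If the page and the target have different aspect ratios, no single dpi matches both sides, so one dimension will come out smaller than requested.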