Extract stamps from a PDF as JPG in Python


I have a problem. Some of the images I need to extract from my PDFs are not embedded as images but as stamps. I have a lot of PDFs, and I want to extract all stamps and all images from them (I already have a script that extracts the images, but not the stamps). I just want a JPG of each stamp, but I don't know how to parse these PDFs and extract every stamp in JPG format. I want to do this in Python 3.

Thanks a lot! Regards,


There is 1 answer

Erwan CUINET On

With PyMuPDF you can do something like this:

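#!/usr/bin/python

import fitz  # PyMuPDF

pdf_document = fitz.open("file.pdf")

for current_page in range(len(pdf_document)):
    for annot in pdf_document[current_page].annots():
        # annots() yields every annotation; keep only the stamps
        if annot.type[0] != fitz.PDF_ANNOT_STAMP:
            continue
        # render the stamp's appearance to a pixmap
        pix = annot.get_pixmap()
        # name each file after the page index and the annotation's xref
        pix.save("page%s-%s.png" % (current_page, annot.xref))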
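Since the question asks for JPG output across many PDFs, here is a minimal sketch of how the loop above might be wrapped into a reusable function. The names `stamp_filename` and `extract_stamps`, the file naming scheme, and the `zoom` parameter are all made up for illustration. It also assumes a PyMuPDF version whose `Pixmap.save` accepts a `.jpg` filename (1.22 or later, as far as I know); on older versions you would need to save as PNG and convert with another tool.

```python
from pathlib import Path


def stamp_filename(pdf_path, page_index, xref, ext="jpg"):
    """Build a name like 'file-page0-stamp12.jpg' (hypothetical naming scheme)."""
    return "%s-page%d-stamp%d.%s" % (Path(pdf_path).stem, page_index, xref, ext)


def extract_stamps(pdf_path, out_dir=".", zoom=2.0):
    """Render every stamp annotation of one PDF to a JPG; return the written paths."""
    import fitz  # PyMuPDF; imported here so stamp_filename works without it

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with fitz.open(str(pdf_path)) as doc:
        for page_index, page in enumerate(doc):
            for annot in page.annots():
                # Skip highlights, text notes and other non-stamp annotations.
                if annot.type[0] != fitz.PDF_ANNOT_STAMP:
                    continue
                # zoom > 1 renders the stamp at a higher resolution.
                pix = annot.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
                target = out_dir / stamp_filename(pdf_path, page_index, annot.xref)
                # Assumption: the .jpg extension is understood by this PyMuPDF version.
                pix.save(str(target))
                written.append(target)
    return written
```

To process a whole folder, you could call `extract_stamps(pdf, out_dir="stamps")` for each path returned by `Path(".").glob("*.pdf")`. Including the PDF's base name in each output file keeps stamps from different documents from overwriting one another.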