pdf2image on AWS Lambda - resulting PNG has wrong fonts

Asked by TaiT on 03 May 2021 at 21:16 (554 views)

I am using pdf2image's convert_from_bytes on my own PDFs to convert them to PNG. The environment is AWS Lambda with Python 3.8.

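from io import BytesIO
from pdf2image import convert_from_bytes

...
# infile holds the raw PDF bytes; DPI and FMT (e.g. "png") are defined elsewhere
images = convert_from_bytes(infile, dpi=DPI, fmt=FMT)

for page_num, image in enumerate(images):
    # One output key per page: "png/<name>-page<N>.<FMT>"
    location = "png/" + event.key.split('.')[0] + "-page" + str(page_num) + '.' + FMT

    buffer = BytesIO()
    image.save(buffer, FMT.upper())
    buffer.seek(0)  # rewind so the buffer can be read from the start
    ...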

Although the PNG is generated "correctly" (all the information and text are present), every paragraph from the PDF appears to be rendered in Times New Roman. The PDF itself displays with the right fonts, and I checked in its properties that the fonts are embedded. The problem happens only when I convert it to PNG. I am not using any fancy fonts either, only Courier-Bold and Helvetica.

Here is an example of a PDF (part of it):

And the resulting image:

What have I tried so far?

I converted my PDFs with some online tools to check whether the PDF itself was the issue. The resulting PNG was correct, with the right fonts.
I processed some random PDFs with my Lambda function and the generated PNGs had correct fonts as well, so the conversion seems to work on most PDFs.
I tried a few different fonts and got the same result.
I tried to embed the fonts in AWS Lambda, roughly following "Include custom fonts in AWS Lambda", but no luck.
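For reference, this is roughly what pointing fontconfig at bundled fonts could look like. It is only a sketch under assumptions: pdf2image runs poppler's pdftoppm as a subprocess, and poppler on Linux looks fonts up through fontconfig, which honours the FONTCONFIG_PATH and FONTCONFIG_FILE environment variables. The helper name write_fontconfig and the /var/task/fonts directory are hypothetical, and nothing here is confirmed to fix the problem described above.

```python
import os
from pathlib import Path


def write_fontconfig(font_dir, conf_dir="/tmp/fontconfig"):
    """Write a minimal fonts.conf that points fontconfig at font_dir.

    font_dir is assumed to contain font files shipped with the deployment
    package or a layer; conf_dir must be writable (/tmp is on Lambda).
    """
    conf_path = Path(conf_dir)
    conf_path.mkdir(parents=True, exist_ok=True)
    conf_file = conf_path / "fonts.conf"
    conf_file.write_text(
        '<?xml version="1.0"?>\n'
        '<!DOCTYPE fontconfig SYSTEM "fonts.dtd">\n'
        "<fontconfig>\n"
        f"  <dir>{font_dir}</dir>\n"
        f"  <cachedir>{conf_dir}/cache</cachedir>\n"
        "</fontconfig>\n"
    )
    # Child processes (such as pdftoppm) inherit these variables, so they
    # need to be set before the first call to convert_from_bytes.
    os.environ["FONTCONFIG_PATH"] = str(conf_path)
    os.environ["FONTCONFIG_FILE"] = str(conf_file)
    return str(conf_file)
```

Calling write_fontconfig("/var/task/fonts") at module import time, before any conversion, would be the natural place for it in a handler file.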

At this point I am clueless. Any idea how I can debug this?
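One possible way to narrow this down, sketched under assumptions: poppler ships a pdffonts tool that lists each font in a PDF along with whether it is embedded, and fontconfig ships fc-match, which reports the font it would substitute for a given family. Both binaries are assumed to be on PATH in the Lambda environment, which may not be the case. The function names below are hypothetical, and the parser assumes font names contain no spaces.

```python
import subprocess


def parse_pdffonts(output):
    """Parse the table printed by `pdffonts` into a list of dicts."""
    fonts = []
    for line in output.splitlines()[2:]:  # skip the header and the dashed rule
        tokens = line.split()
        if len(tokens) < 8:
            continue
        # Columns: name, type, encoding, emb, sub, uni, object number, generation.
        # The type can contain spaces ("Type 1"), so index from the right.
        fonts.append({
            "name": tokens[0],
            "type": " ".join(tokens[1:-6]),
            "encoding": tokens[-6],
            "embedded": tokens[-5] == "yes",
            "subset": tokens[-4] == "yes",
        })
    return fonts


def list_pdf_fonts(pdf_path):
    """Run pdffonts on a file and return the parsed font table."""
    result = subprocess.run(
        ["pdffonts", pdf_path], capture_output=True, text=True, check=True
    )
    return parse_pdffonts(result.stdout)


def fontconfig_match(family):
    """Return the font fontconfig would pick for the given family name."""
    result = subprocess.run(
        ["fc-match", family], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

If list_pdf_fonts reports a font with "embedded" set to False, the renderer has to find a substitute on the system, and fontconfig_match("Helvetica") or fontconfig_match("Courier") would show which one it picks inside the Lambda environment.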

EDIT: PDF font properties

EDIT2: I wrote a small Python program that generates one sentence per existing base font.
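The program itself is not shown here. As a stand-in, the following dependency-free sketch writes a minimal one-page PDF by hand with one line per standard base font. Note the assumptions: the function name build_base_font_pdf is hypothetical, and this sketch references the 14 standard fonts by name without embedding them, which may differ from the PDFs in the question.

```python
STANDARD_14 = [
    "Courier", "Courier-Bold", "Courier-Oblique", "Courier-BoldOblique",
    "Helvetica", "Helvetica-Bold", "Helvetica-Oblique", "Helvetica-BoldOblique",
    "Times-Roman", "Times-Bold", "Times-Italic", "Times-BoldItalic",
    "Symbol", "ZapfDingbats",
]


def build_base_font_pdf():
    """Return the bytes of a one-page PDF with one sentence per base font."""
    # Page content: one text line per font, top to bottom.
    lines = []
    for i, name in enumerate(STANDARD_14):
        y = 760 - 40 * i
        lines.append(f"BT /F{i} 14 Tf 50 {y} Td ({name}: sample sentence) Tj ET")
    content = "\n".join(lines).encode("ascii")

    # Indirect objects, numbered from 1: catalog, page tree, page, content, fonts.
    font_refs = " ".join(f"/F{i} {5 + i} 0 R" for i in range(len(STANDARD_14)))
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        (
            "<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
            f"/Contents 4 0 R /Resources << /Font << {font_refs} >> >> >>"
        ).encode("ascii"),
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
    ]
    for name in STANDARD_14:
        # Fonts are referenced by name only; no font program is embedded.
        objects.append(
            f"<< /Type /Font /Subtype /Type1 /BaseFont /{name} >>".encode("ascii")
        )

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object for the xref table
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

The returned bytes can be passed directly to convert_from_bytes, which makes it easy to compare how each base font is rendered locally and inside the Lambda function.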

When I pass that PDF through the Lambda function, I get this:


There are 0 answers
