Generating PDF file with PuppeteerSharp from webpage containing images linking to Azure Blob storage


I have a web page that contains href tags pointing to pictures stored in Azure Blob storage. The Azure container is private, and the link used to access each image is generated with an Azure SAS token. The format of an href link is similar to https://myblob.blob.core.windows.net/mycontainer/myfolder%2Fmyfile.jpeg?sv=2019-12-12&st=2020-10-13T18%3A52%3A48Z&se=2020-10-13T18%3A58%3A48Z&sr=b&sp=r&sig=P5JRdwKa4GkbIFF55sWywOe4vnPnWOCoSf29UHYmNPA%3D

When generating the PDF with PuppeteerSharp using WaitUntilNavigation.Networkidle0, the images are not retrieved and are missing from the resulting PDF.

I tested each generated SAS link individually and they all work without problems. I also replaced each href link with a Base64-encoded data URI of the image, and that works fine.

I tested PDF generation using an online Puppeteer service based on Node.js (https://try-puppeteer.appspot.com/) and it works like a charm. So there seems to be an issue with the PuppeteerSharp version I am using (v2.0.4).

Any idea what the issue could be?


There is 1 answer

Omid B.

After struggling for several hours, we finally located the problem. It is not related to Puppeteer, which works like a charm, but to the way a private blob storage container handles authentication. Our request contained an Authorization HTTP header with a bearer token required by our own app, and Chromium also sent this header when retrieving the remote images from the blob container. Unfortunately, the Azure service tried to handle that token and rejected our request.
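One possible way to avoid this is to stop the bearer token from reaching the blob host. The sketch below is not the exact fix we applied; it assumes the header was set page-wide (for example with SetExtraHttpHeadersAsync) and uses PuppeteerSharp request interception to drop Authorization only for hosts ending in .blob.core.windows.net. The names SasHeaderFilter, ForRequest and EnableAsync are hypothetical, and the Page type matches PuppeteerSharp 2.x (newer versions expose IPage instead).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using PuppeteerSharp;

public static class SasHeaderFilter
{
    // Assumption: blob URLs use the default Azure public-cloud host suffix.
    private const string BlobHostSuffix = ".blob.core.windows.net";

    // Returns a copy of the headers, without Authorization when the
    // request targets Azure Blob storage. Other requests are unchanged.
    public static Dictionary<string, string> ForRequest(
        string url, IDictionary<string, string> headers)
    {
        bool isBlob = Uri.TryCreate(url, UriKind.Absolute, out var uri)
            && uri.Host.EndsWith(BlobHostSuffix, StringComparison.OrdinalIgnoreCase);

        return headers
            .Where(h => !(isBlob && h.Key.Equals(
                "Authorization", StringComparison.OrdinalIgnoreCase)))
            .ToDictionary(h => h.Key, h => h.Value);
    }

    // Wires the filter into a page. Call this before GoToAsync.
    public static async Task EnableAsync(Page page)
    {
        await page.SetRequestInterceptionAsync(true);
        page.Request += async (sender, e) =>
        {
            var headers = ForRequest(e.Request.Url, e.Request.Headers);
            // Every intercepted request must be continued, or it will hang.
            await e.Request.ContinueAsync(new Payload { Headers = headers });
        };
    }
}
```

With this in place, the SAS token in the query string is the only credential sent to the blob container, while requests to our own app keep the bearer token.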

How did we identify that? By connecting a Chrome debugger to the Chromium instance and checking the logs. This is possible because Puppeteer can be launched with a remote debugging port.
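For reference, a minimal sketch of such a launch with PuppeteerSharp follows. It passes Chromium's --remote-debugging-port switch through LaunchOptions.Args and, as an optional extra that was not part of our original investigation, logs failed requests and HTTP error responses to the console. The port 9222, the class name PdfDebug and its method names are assumptions for illustration, and a Chromium build is assumed to be already available (downloaded with BrowserFetcher or set via ExecutablePath).

```csharp
using System;
using System.Threading.Tasks;
using PuppeteerSharp;

public static class PdfDebug
{
    // Builds the Chromium argument that opens a remote debugging port.
    public static string[] BuildArgs(int port) =>
        new[] { $"--remote-debugging-port={port}" };

    public static async Task RenderWithDiagnosticsAsync(
        string url, string pdfPath, int port = 9222)
    {
        var browser = await Puppeteer.LaunchAsync(new LaunchOptions
        {
            Headless = true,
            Args = BuildArgs(port)
        });
        try
        {
            var page = await browser.NewPageAsync();

            // Requests that never received a response (aborted, DNS failure...).
            page.RequestFailed += (sender, e) =>
                Console.WriteLine($"FAILED {e.Request.Url}");

            // Responses with an error status, e.g. a rejected blob request.
            page.Response += (sender, e) =>
            {
                var status = (int)e.Response.Status;
                if (status >= 400)
                {
                    Console.WriteLine($"{status} {e.Response.Url}");
                }
            };

            await page.GoToAsync(url, WaitUntilNavigation.Networkidle0);
            await page.PdfAsync(pdfPath);
        }
        finally
        {
            await browser.CloseAsync();
        }
    }
}
```

While the browser is running, a desktop Chrome can attach through chrome://inspect by adding localhost:9222 as a target. Since the browser closes as soon as the PDF is written, you may need to pause the process (for example with a breakpoint) before attaching.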