Is it possible to request the jpg version of an image when downloading with puppeteer?


I'm a noob here, be gentle with me :)

I'm using puppeteer to extract data from a supplier's website (they've given me permission to do this) and import it into WordPress / WooCommerce. I can get the product data no problem, but I'm hitting a wall with images.

I can extract the images fine. The problem I'm facing is that the website is serving some images in webp format. From what I understand, the server would/should have both .jpg and .webp images and if the browser supports it, it serves the webp image.

So the URL that I get the image from is something like "https://example.com/images/myimage.jpg" but it's actually giving me the webp image. I need to know at the point of getting the image from the site if I'm being given the jpg or webp version so I can save it appropriately and then work out what to do with it.

I'm planning to convert these images using sharp once I know what format I've actually got.

So I guess a few questions are:

  1. Is it possible to force puppeteer to NOT serve me webp format and give me just jpg? OR
  2. Is it possible when extracting the image to see what type it actually is before I save it so I know what extension to save it as?
  3. Is it possible for sharp to identify the image type before I try to convert?

Thanks, Dan


1 answer

Dan Lloyd (best answer)

It looks like puppeteer lets you set the user agent. If I set it to that of a browser that does not support webp images (Internet Explorer 11 in the example here), the site gives me the jpg images by default.

// Call this before page.goto() so the image requests are sent with this user agent.
await page.setUserAgent('Mozilla/5.0 CK={} (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko');
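
For the second question (working out what type an image really is before saving it), one option is to look at the first few bytes of the downloaded data instead of trusting the URL. The sketch below is only an illustration: `detectImageFormat` is a hypothetical helper name, not something provided by puppeteer or sharp, and it only knows about jpg, webp and png.

```javascript
// Hypothetical helper (not part of puppeteer or sharp): guess the image
// format from the file signature at the start of the data.
function detectImageFormat(data) {
  // Accept either a Buffer or a plain Uint8Array.
  const bytes = Buffer.from(data);

  // JPEG files start with the bytes FF D8 FF.
  if (bytes.length >= 3 && bytes[0] === 0xff && bytes[1] === 0xd8 && bytes[2] === 0xff) {
    return 'jpg';
  }

  // WebP files are RIFF containers: "RIFF", a 4-byte size, then "WEBP".
  if (
    bytes.length >= 12 &&
    bytes.toString('ascii', 0, 4) === 'RIFF' &&
    bytes.toString('ascii', 8, 12) === 'WEBP'
  ) {
    return 'webp';
  }

  // PNG files start with 89 50 4E 47 ("\x89PNG").
  if (
    bytes.length >= 4 &&
    bytes[0] === 0x89 &&
    bytes.toString('ascii', 1, 4) === 'PNG'
  ) {
    return 'png';
  }

  // Anything else is left for the caller to handle.
  return 'unknown';
}

// Example with hand-built data standing in for a downloaded image.
const sample = Buffer.concat([
  Buffer.from('RIFF'),
  Buffer.from([0x24, 0x00, 0x00, 0x00]), // container size (ignored here)
  Buffer.from('WEBPVP8 '),
]);
console.log(`Save this one as image.${detectImageFormat(sample)}`);
```

With puppeteer, the bytes would typically come from the response object, for example `const response = await page.goto(imageUrl)` followed by `await response.buffer()`, and `response.headers()['content-type']` can serve as a cross-check if the server sets it correctly. For the third question, sharp's `metadata()` call also reports a `format` field, so the same check could be done at conversion time instead.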