I'm a noob here, be gentle with me :)
I'm using puppeteer to extract data from a supplier's website (they've given me permission to do this) and import it into WordPress / WooCommerce. I can get the product data no problem, but I'm hitting a wall with images.
I can extract the images fine. The problem I'm facing is that the website is serving some images in webp format. From what I understand, the server would/should have both .jpg and .webp images and if the browser supports it, it serves the webp image.
So the URL that I get the image from is something like "https://example.com/images/myimage.jpg" but it's actually giving me the webp image. I need to know at the point of getting the image from the site if I'm being given the jpg or webp version so I can save it appropriately and then work out what to do with it.
I'm planning to convert these images using sharp once I know what format I've actually got.
So I guess a few questions are:
- Is it possible to set puppeteer up so the site does NOT serve me webp format and gives me just jpg? OR
- Is it possible when extracting the image to see what type it actually is before I save it so I know what extension to save it as?
- Is it possible for sharp to identify the image type before I try to convert?
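For the second and third questions, one option is to ignore the URL's extension and look at the first few bytes of the downloaded file, since JPEG, WebP and PNG each start with a fixed signature. The following is a minimal sketch, not a definitive implementation: `detectImageExtension` is a hypothetical helper name, and it assumes the image bytes are already in a Node `Buffer`.

```javascript
// Hypothetical helper: guesses the real image format from the file's
// leading bytes, regardless of what extension the URL claims.
// Returns 'jpg', 'webp', 'png', or null if the signature is not recognised.
function detectImageExtension(buffer) {
  if (!Buffer.isBuffer(buffer) || buffer.length < 12) return null;

  // JPEG files start with the bytes FF D8 FF.
  if (buffer[0] === 0xff && buffer[1] === 0xd8 && buffer[2] === 0xff) {
    return 'jpg';
  }

  // WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
  if (
    buffer.toString('ascii', 0, 4) === 'RIFF' &&
    buffer.toString('ascii', 8, 12) === 'WEBP'
  ) {
    return 'webp';
  }

  // PNG files start with a fixed 8-byte signature.
  const pngSignature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
  if (buffer.subarray(0, 8).equals(pngSignature)) {
    return 'png';
  }

  return null;
}
```

With puppeteer, the bytes would typically come from the image response's `buffer()` method, and the same response's `headers()['content-type']` is another hint worth comparing against. sharp can also report this itself: `sharp(buffer).metadata()` resolves to an object with a `format` field, so the check can happen just before converting.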
Thanks, Dan
It looks like puppeteer allows you to set a user agent. If I set this to that of a browser that does not support webp images, I'm given the jpg images by default.
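A rough sketch of that user agent approach, under some assumptions: the user agent string is Internet Explorer 11's (a browser without webp support), `fetchImageAsLegacyBrowser` is a hypothetical helper name, and `page` is a puppeteer `Page` created elsewhere. It only helps if the site picks the format based on the user agent, which is what seemed to happen here; other sites may decide from the `Accept` header instead.

```javascript
// Assumption: an Internet Explorer 11 user agent, i.e. a browser with no webp support.
const NON_WEBP_USER_AGENT =
  'Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko';

// Hypothetical helper: `page` is a puppeteer Page, `imageUrl` is the image address.
// Sets the user agent, loads the image, and returns its content type and bytes.
async function fetchImageAsLegacyBrowser(page, imageUrl) {
  await page.setUserAgent(NON_WEBP_USER_AGENT);

  const response = await page.goto(imageUrl);
  if (!response || !response.ok()) {
    throw new Error(`Could not load ${imageUrl}`);
  }

  return {
    // Header names from response.headers() are lower-case.
    contentType: response.headers()['content-type'] || null,
    body: await response.buffer(),
  };
}

// Example usage (requires puppeteer to be installed):
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch();
//   const page = await browser.newPage();
//   const image = await fetchImageAsLegacyBrowser(page, 'https://example.com/images/myimage.jpg');
//   console.log(image.contentType);
//   await browser.close();
```

Checking the returned `contentType` is still worthwhile, since it confirms whether the server really sent a jpg.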