Download blocked images with Python


I am trying to download the images on a web page with Python.

Sometimes I run into pages whose images I can't download because the server returns 403 errors. On some sites, setting the User-Agent or Referer header does not fix it, so I'd like some advice on how to solve this.

My code is as follows.

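import os
import random

import requests
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

url = "https://regbu.net/20693"
user_agents_list = [
    'Mozilla/5.0 (iPad; CPU OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148'
]
path = "chromedriver.exe"
# Selenium 4 style: the driver path is passed through a Service object
driver = webdriver.Chrome(service=Service(path))
driver.get(url)

# find_elements_by_tag_name was removed in Selenium 4
images = driver.find_elements(By.TAG_NAME, 'img')

os.makedirs("download", exist_ok=True)  # open() fails if the folder is missing

for i, img in enumerate(images, 1):
    img_url = img.get_attribute('src')
    print(i, img_url)
    if not img_url:
        continue  # skip <img> tags without a src attribute
    r = requests.get(img_url, headers={'User-Agent': random.choice(user_agents_list),
                                       'Referer': url})
    with open("download/{}.jpg".format(i), 'wb') as f:
        f.write(r.content)

driver.quit()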

If I open the page (https://regbu.net/20693) in a browser, I can see the images, but if I access an image URL (https://regbu.net/ftry/wp-content/uploads/2019/05/20693/1.jpg.webp) directly, I get a 403 error.
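
One thing that might be worth trying is to make the image request look more like the browser session that loaded the page: send the browser's own User-Agent, the page URL as Referer, an Accept header that advertises WebP, and the session cookies. The sketch below only builds those headers and a file name that keeps the real extension. The helper names `browser_like_headers` and `filename_for` are hypothetical, and nothing here is confirmed to get past this particular site's 403, since the server may check something else entirely.

```python
import os.path
from urllib.parse import urlparse


def browser_like_headers(page_url, user_agent, cookies):
    """Build request headers that mirror the browser session that loaded page_url.

    cookies is a list of dicts with 'name' and 'value' keys, which is the
    shape returned by Selenium's driver.get_cookies().
    """
    headers = {
        "User-Agent": user_agent,
        "Referer": page_url,
        # Advertise WebP support, since the image URL ends in .webp.
        "Accept": "image/webp,image/*,*/*;q=0.8",
    }
    if cookies:
        # Join the cookies into a single Cookie header value.
        headers["Cookie"] = "; ".join(
            "{}={}".format(c["name"], c["value"]) for c in cookies
        )
    return headers


def filename_for(img_url, index, default_ext=".jpg"):
    """Name the file after its index, keeping the last extension from the URL."""
    ext = os.path.splitext(urlparse(img_url).path)[1] or default_ext
    return "{}{}".format(index, ext)
```

With the Selenium driver from the code above, the inputs could come from `driver.get_cookies()` and `driver.execute_script("return navigator.userAgent")`, and the resulting dict could be passed as `headers=` to `requests.get`. Checking `r.status_code` before writing the file would also avoid saving a 403 error page as an image.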
