I am trying to download the images on a web page with Python.
Some sites return a 403 error when I try to download their images. Setting a User-Agent or Referer header fixes this for some sites but not for others, so I would like some advice on how to handle those cases.
My code is as follows.
from selenium import webdriver
import requests
import random
url = "https://regbu.net/20693"
user_agents_list = [
'Mozilla/5.0 (iPad; CPU OS 12_2 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148'
]
path = "chromedriver.exe"
driver = webdriver.Chrome(path)
driver.get(url)
images = driver.find_elements_by_tag_name('img')
for i, img in enumerate(images, 1):
    img_url = img.get_attribute('src')
    print(i, img_url)
    r = requests.get(img_url, headers={'User-Agent': random.choice(user_agents_list),
                                       'Referer': url})
    with open("download/{}.jpg".format(i), 'wb') as f:
        f.write(r.content)
If I open the page URL (https://regbu.net/20693), I can see the images, but if I access an image URL (https://regbu.net/ftry/wp-content/uploads/2019/05/20693/1.jpg.webp) directly, I get a 403 error.
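
To make the direct-access case easier to reproduce without Selenium, a minimal sketch along these lines might help. It is only an illustration: `build_headers` and `fetch_status` are hypothetical helper names, and it uses the standard library's `urllib` rather than `requests` so that it runs on its own. It sends the same two headers as the script above and reports the status code the server returns.

```python
import urllib.error
import urllib.request

PAGE_URL = "https://regbu.net/20693"
IMAGE_URL = "https://regbu.net/ftry/wp-content/uploads/2019/05/20693/1.jpg.webp"
USER_AGENT = ("Mozilla/5.0 (iPad; CPU OS 12_2 like Mac OS X) "
              "AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148")


def build_headers(referer, user_agent=USER_AGENT):
    """Hypothetical helper: the same two headers the script above sends."""
    return {"User-Agent": user_agent, "Referer": referer}


def fetch_status(image_url, headers):
    """Hypothetical helper: return the HTTP status code of a single GET."""
    request = urllib.request.Request(image_url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the code is what we want to see.
        return err.code
```

Calling `fetch_status(IMAGE_URL, build_headers(PAGE_URL))` and comparing the result with what the browser gets for the same image should show whether these two headers alone are enough for this site.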