I have a spider that gets cookies from a site in its first few steps. I would like to fetch the cookies, start the scrape, and, if the HTTP status of the current request == 302, loop back to the cookie part to refresh them. How can I capture the HTTP status as a variable in scrapy shell, so I can add an "if http_status == 302, break and go back to step 1"? Thank you!
If anyone comes across this: all you have to do is set your variable (in my case `http_response`) to `response.status`. So `http_response = response.status` returns 200 (or whatever the status of the current request is). Solved.