I am facing issues while scraping https://www.mca.gov.in/content/mca/global/en/home.html.
Whenever I try to open a page on this site with Selenium or undetected_chromedriver, it automatically redirects to the home page, while there is no problem if I open it in a normal browser.
I tried multiple solutions, like disabling anchor tags, but nothing worked. I just want to stay on the page I requested; I have already found a way to solve the captcha.
This is the code I have been running and testing in several variations:
import undetected_chromedriver as uc
import time
driver = uc.Chrome()
driver.get("https://www.mca.gov.in/content/mca/global/en/mca/master-data/MDS.html")
time.sleep(10)
driver.quit()
It is hard to tell exactly what's going on, but here are some general tips:
Are you running your Chrome driver in headless mode? Some websites are able to detect this. For example, if you go to this website with a headless browser, you will see that it detects the browser as being headless. A solution could therefore be to scrape the website with a headful browser.
You could also try a web scraping API that does all the heavy lifting for you, but those are not free.
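To narrow down the first tip, here is a minimal sketch that opens the page in a headful browser and reports whether a redirect happened and whether the browser looks automated. The helper names (was_redirected, looks_headless, check_page) are made up for illustration, and it assumes that comparing host and path is enough to spot the redirect. Headless Chrome normally puts "HeadlessChrome" in its user agent, which is one of the signals sites check; it is not confirmed that this particular site uses it.

```python
import time
from urllib.parse import urlsplit


def was_redirected(requested_url, current_url):
    """True if the browser ended up on a different host or path than requested."""
    req, cur = urlsplit(requested_url), urlsplit(current_url)
    return (req.netloc.lower(), req.path.rstrip("/")) != (
        cur.netloc.lower(),
        cur.path.rstrip("/"),
    )


def looks_headless(user_agent):
    """True if the user agent string advertises headless Chrome."""
    return "HeadlessChrome" in user_agent


def check_page(url, wait_seconds=10):
    """Open url in a headful browser and collect a few diagnostic values."""
    # Imported here so the helpers above work without the package installed.
    import undetected_chromedriver as uc

    driver = uc.Chrome()  # headful unless a headless option is passed
    try:
        driver.get(url)
        time.sleep(wait_seconds)  # give any client-side redirect time to fire
        user_agent = driver.execute_script("return navigator.userAgent")
        return {
            "final_url": driver.current_url,
            "redirected": was_redirected(url, driver.current_url),
            "headless_user_agent": looks_headless(user_agent),
            # Value of navigator.webdriver, another common automation signal.
            "webdriver_flag": driver.execute_script("return navigator.webdriver"),
        }
    finally:
        driver.quit()
```

Calling check_page("https://www.mca.gov.in/content/mca/global/en/mca/master-data/MDS.html") and printing the result shows which of these signals is set when the redirect occurs. If "redirected" is true even though the other two values look clean, the site is probably detecting automation some other way.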