I'm trying to download all the PDF files from the ratp.fr website. However, neither of the two website mirroring programs I tried (Cyotek and WinHTTrack) worked. No matter which parameters or user agent I choose, or whether I ignore robots.txt, the answer is always the same: NO. More precisely, the first program gives me an HTTP 403 error, and the second produces an empty copy.
The problem is specific to this website: I tried both programs on other sites and they worked perfectly. How should I proceed to save the website?
Thanks
I tried:
- every available user agent
- limiting the number of connections
- ignoring robots.txt
Each time, the result is the same: an empty copy or a 403 error.
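
In case it helps anyone reproduce this outside the mirroring tools, here is a minimal Python sketch of the user-agent test. The start URL, the two user-agent strings, and the function names are my own illustrative assumptions, not settings taken from Cyotek or WinHTTrack. It only reports the status code the server returns for each user agent.

```python
import urllib.error
import urllib.request

# Assumed values for illustration only; not copied from either tool.
START_URL = "https://www.ratp.fr/"
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:120.0) Gecko/20100101 Firefox/120.0",
]


def build_request(url, user_agent):
    """Build a GET request that carries the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code for one request, including error statuses."""
    request = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Error responses such as 403 arrive as exceptions; keep the code.
        return err.code


def check_all(url=START_URL, agents=USER_AGENTS, fetch=fetch_status):
    """Map each user agent to the status code obtained with it."""
    return {agent: fetch(url, agent) for agent in agents}


if __name__ == "__main__":
    for agent, status in check_all().items():
        print(status, agent)
```

If every user agent comes back with the same status code, the block is probably not based on the User-Agent header alone.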