I have a URL that redirects to a different domain name every time I open it in a browser. The content of each domain I am redirected to is shown ONLY if I arrive there from that main URL.
In other words: if I open one of the redirect target URLs directly in a new browser window, it displays a blank page.

I'm trying to create a small bot that can handle these 2 main tasks:
1) Get the domain names that the URL redirects to;
2) Get the real content of these redirect target domains;

I'm trying to do it with cURL, including:

curl_setopt($ch, CURLOPT_COOKIE, 'tmpfile.tmp');
curl_setopt($ch, CURLOPT_COOKIEJAR, 'tmpfile.tmp');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'tmpfile.tmp');

but all that I can get is the following source code of the main URL:

HTTP/1.1 200 OK
Date: Sat, 20 Apr 2019 22:38:21 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
X-Powered-By: PHP/5.4.16
Alt-Svc: h2=":443"; ma=60
Server: cloudflare
CF-RAY: 4caa9baab8cdbd98-AMS



<\title>Loading, please wait...<\title>

window.name = String(Math.floor(Math.random()*101)+100);
if (window.opener) { window.opener = null; }
window.location.replace("/cgi-bin/out.cgi?l=null");

Loading, please wait...


Please help me make a script that looks enough like a regular website visitor to be able to collect that data.

This project is for a very good cause and any help will be really appreciated!

1 Answer

hanshenrik On

"if I open one of all re-directed urls directly in new browser window, it will display me a blank page."

Then you shouldn't re-use cookies, because that's how the website checks whether it's the same user with a new browser window or a brand new browser. It looks like your code is trying to re-use cookies: it's using a static, hardcoded cookie file by the looks of it. If you need a temporary file, use tmpfile(), or just keep the cookies in RAM.
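A minimal sketch of both options, assuming PHP's curl extension is loaded. The function names new_visitor_handle() and new_cookie_jar() are made up for illustration. Note also that CURLOPT_COOKIE expects a literal cookie string such as "name=value", not a file name, so passing 'tmpfile.tmp' to it does not do what the question intends.

```php
<?php
// Hypothetical helper: a cURL handle whose cookies live only in memory.
// An empty CURLOPT_COOKIEFILE turns the cookie engine on without loading
// or saving any file, so every new handle starts as a brand new visitor.
function new_visitor_handle(string $url)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the response instead of printing it
        CURLOPT_HEADER         => true, // keep the response headers in the output
        CURLOPT_COOKIEFILE     => '',   // in-memory cookies only
    ]);
    return $ch;
}

// Hypothetical helper: a throwaway cookie jar on disk, for cases where a
// real file is needed. The file disappears when the handle is closed.
function new_cookie_jar(): array
{
    $handle = tmpfile();
    $path   = stream_get_meta_data($handle)['uri']; // path of the temp file
    return [$handle, $path];
}
```

With the file variant, pass the returned path to both CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE, and keep the handle in a variable until the run is finished so the file is not removed early.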

Also, that page is a very weird-looking and BROKEN JavaScript redirector page. Either you're not showing the full HTML of the main URL, or the main URL is not redirecting anyone anywhere: it doesn't put the redirecting JavaScript in a <script> tag, and hence the browser will not use it to redirect anywhere.
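If the real page does wrap that code in a <script> tag, keep in mind that cURL never executes JavaScript, so the bot would have to pull the target out of the HTML itself. Below is a rough sketch of that idea, not a definitive implementation. The helper names are hypothetical, the regex only covers the window.location.replace("...") form shown in the question, and sending the main URL as the Referer is an assumption about how the target sites decide what to show.

```php
<?php
// Pull the target out of a window.location.replace("...") call, or null if absent.
function extract_js_redirect(string $html): ?string
{
    if (preg_match('/window\.location\.replace\(\s*["\']([^"\']+)["\']\s*\)/', $html, $m)) {
        return $m[1];
    }
    return null;
}

// Resolve the target against the page URL. Simplification: anything that is
// not an absolute http(s) URL is treated as a path from the site root.
function resolve_target(string $pageUrl, string $target): string
{
    if (preg_match('#^https?://#i', $target)) {
        return $target;
    }
    $p = parse_url($pageUrl);
    $origin = $p['scheme'] . '://' . $p['host'] . (isset($p['port']) ? ':' . $p['port'] : '');
    return $origin . '/' . ltrim($target, '/');
}

// Request the extracted target on the SAME handle (so cookies from the first
// request are kept), then report where the HTTP redirects finally ended up.
function follow_js_redirect($ch, string $pageUrl, string $html): ?array
{
    $target = extract_js_redirect($html);
    if ($target === null) {
        return null; // no JavaScript redirect found in this page
    }
    curl_setopt_array($ch, [
        CURLOPT_URL            => resolve_target($pageUrl, $target),
        CURLOPT_REFERER        => $pageUrl, // assumption: the sites check the Referer
        CURLOPT_FOLLOWLOCATION => true,     // follow HTTP 3xx redirects
        CURLOPT_RETURNTRANSFER => true,
    ]);
    $body = curl_exec($ch);
    return [
        'final_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL), // last URL reached
        'body'      => $body,
    ];
}
```

The host part of final_url would give the domain name for task 1, and body the content for task 2. If the target page contains yet another JavaScript redirect, the same extraction step would need to be repeated on its body.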