I'm trying to scrape newark.com. I wrote the code below and tested it locally, where it works perfectly:
<?php
$link = 'https://www.newark.com/';
$proxy = ['server' => '172.93.142.42:3128'];
$user_agents = ['Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36', 'Mozilla/5.0 (Linux; Android 8.0.0; H3113 Build/50.1.A.10.40; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/68.0.3440.91 Mobile Safari/537.36 [FB_IAB/FB4A;FBAV/185.0.0.39.72;]', 'Mozilla/5.0 (iPhone; CPU iPhone OS 11_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E302 [FBAN/FBIOS;FBAV/166.0.0.53.95;FBBV/101310068;FBDV/iPhone7,2;FBMD/iPhone;FBSN/iOS;FBSV/11.3.1;FBSS/2;FBCR/vodafoneP;FBID/phone;FBLC/en_GB;FBOP/5;FBRV/102694127]', 'Mozilla/5.0 (Linux; Android 7.0; Studio Mega Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.111 Mobile Safari/537.36 OPR/46.3.2246.127744', 'Mozilla/5.0 (iPhone; CPU iPhone OS 11_2_6 like Mac OS X) AppleWebKit/604.5.6 (KHTML, like Gecko) Mobile/15D100 [FBAN/FBIOS;FBAV/168.0.0.57.90;FBBV/103647182;FBDV/iPhone9,3;FBMD/iPhone;FBSN/iOS;FBSV/11.2.6;FBSS/2;FBCR/MEO;FBID/phone;FBLC/pt_PT;FBOP/5;FBRV/104934021]'];
$user_agent = $user_agents[array_rand($user_agents)];
//$user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36';
$curl_handler = curl_init();
curl_setopt_array($curl_handler, array(
    CURLOPT_URL => $link,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_USERAGENT => $user_agent,
    CURLOPT_PROXY => $proxy['server'],
));
$result = curl_exec($curl_handler);
curl_close($curl_handler);
$result = mb_convert_encoding($result, 'UTF-8');
header('Content-type: text/html; charset=utf-8');
echo($result);
However, when I run this code on my US servers, it does not work: the script runs for a long time and then outputs nothing at all.
When I change the URL to www.google.com, the same script works on those servers. I've added proxies to my code, but that didn't help with the URL I need.
I guess the problem is specific to that URL. Any help?
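To narrow this down, here is a minimal diagnostic sketch (not a fix). It adds explicit timeouts so the request cannot hang indefinitely, and it reports cURL's own error code and message instead of echoing an empty result. The function name `fetch_with_diagnostics` and the timeout values are assumptions chosen for illustration; they are not part of the script above.

```php
<?php
// Hypothetical helper: fetch a URL with explicit timeouts and return
// cURL's diagnostics alongside the body, so a silent failure becomes visible.
function fetch_with_diagnostics(string $url, ?string $proxy = null, int $timeout = 20): array
{
    $ch = curl_init();

    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_RETURNTRANSFER => true,
        // Assumed limits: give up connecting after at most 10 s,
        // and abort the whole transfer after $timeout seconds.
        CURLOPT_CONNECTTIMEOUT => min(10, $timeout),
        CURLOPT_TIMEOUT        => $timeout,
    ];
    if ($proxy !== null) {
        $options[CURLOPT_PROXY] = $proxy;
    }
    curl_setopt_array($ch, $options);

    // curl_exec() returns false on failure when CURLOPT_RETURNTRANSFER is set.
    $body = curl_exec($ch);

    return [
        'ok'         => $body !== false,
        'errno'      => curl_errno($ch),   // 0 means no cURL-level error
        'error'      => curl_error($ch),   // human-readable error message
        'http_code'  => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'total_time' => curl_getinfo($ch, CURLINFO_TOTAL_TIME),
        'body'       => $body === false ? '' : $body,
    ];
}
```

Calling `fetch_with_diagnostics('https://www.newark.com/', '172.93.142.42:3128')` on the server and dumping everything except `body` with `var_dump()` should show whether the request times out, is refused, or returns an HTTP error status. That would at least indicate whether the failure happens at the network level or in the site's response.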