I created a function that checks whether the <title> tag of an external page contains specific words (among the other words in the title). If the check is positive, it should echo the whole page <title>.
<?php
// Fetch the raw HTML of a remote page with cURL.
function file_get_contents_curl($url)
{
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_HEADER, 0);          // no response headers in the output
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the body instead of printing it
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  // follow redirects
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

$html = file_get_contents_curl("http://www.lastfm.it/user/lorenzone92/now");

// Parse the HTML and read the first <title> node.
$doc = new DOMDocument();
@$doc->loadHTML($html);
$nodes = $doc->getElementsByTagName('title');
$title = $nodes->item(0)->nodeValue;

// Echo the full title only if it contains the words we are looking for.
if (strpos($title, 'in ascolto') !== false) {
    echo $title . '<br>';
}
?>
It is working fine. My concern is about memory consumption and server load. The problem is that I cannot cache $html because it's a live page. Any ideas? Do I need to grab the whole page just to access the <title>? Are there other methods besides cURL and file_get_contents that would reduce server load? Or am I just overconcerned? :)
Note: Don't worry about the PHP version (no limits; I'm on my own VPS, which has PHP 5.5.7 installed :D).
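In case it helps, one possible way to cut both transfer size and memory use without caching is to abort the cURL transfer as soon as the closing </title> tag has been received, so only the first few kilobytes of the page are downloaded. This is only a sketch of that idea, not a definitive answer: the fetch_until_title helper name is made up, and the URL is the same one used above.

<?php
// Sketch: abort the transfer once </title> has been received, so only the
// top of the page is downloaded instead of the whole document.
function fetch_until_title($url)  // hypothetical helper name
{
    $buffer = '';
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
    curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $chunk) use (&$buffer) {
        $buffer .= $chunk;
        // Returning a value different from strlen($chunk) tells cURL to abort.
        if (strpos($buffer, '</title>') !== false) {
            return -1;
        }
        return strlen($chunk);
    });
    curl_exec($ch);   // returns false after the deliberate abort; $buffer already holds the data
    curl_close($ch);
    return $buffer;
}

$html = fetch_until_title("http://www.lastfm.it/user/lorenzone92/now");
// DOMDocument tolerates the truncated HTML, so the title can be read as before.
$doc = new DOMDocument();
@$doc->loadHTML($html);
$title = $doc->getElementsByTagName('title')->item(0)->nodeValue;
if (strpos($title, 'in ascolto') !== false) {
    echo $title . '<br>';
}
?>

Note that curl_exec() reports an error after the intentional abort, so treat that as expected rather than as a failure.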
I do not know if it's helpful, but this other question (which seems related to yours) has a lot of answers; here is the link:
Get title of website via link