How to query a DOMNode using XPath in PHP?


I'm trying to get Bing search results with XPath. Here is my code:

$html = file_get_contents("http://www.bing.com/search?q=bacon&first=11");
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHtml($html);
$x = new DOMXpath($doc);
$output = array();
// just grab the urls for now
foreach ($x->query("//li[@class='b_algo']") as $node)
{
    //$output[] = $node->getAttribute("href");
    $tmpDom = new DOMDocument();
    $tmpDom->loadHTML($node);
    $tmpDP = new DOMXPath($tmpDom);
    echo $tmpDP->query("//div[@class='b_title']//h2//a//href");
}
return $output;

This foreach iterates over all results. All I want is to extract the link and text from each $node, but because $node is an object (a DOMNode) rather than an HTML string, I can't load it into a new DOMDocument. How can I query it?


1 Answer

Answered by Jens Erat

First of all, your XPath expression tries to match non-existent href child elements; query @href to select the attribute instead.

You don't need to create any new DOMDocuments; just pass $node as the context node (the second argument to query()) and use a relative path:

foreach ($x->query("//li[@class='b_algo']") as $node)
{
    // The leading "." makes the path relative to $node instead of the document root
    var_dump( $x->query("./div[@class='b_title']//h2//a/@href", $node)->item(0) );
}

If you're just interested in the URLs, you could also query them directly:

foreach ($x->query("//li[@class='b_algo']/div[@class='b_title']/h2/a/@href") as $node)
{
    // Each $node is a DOMAttr here; $node->value holds the URL string
    var_dump($node);
}
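
Since the question asks for both the link and the text, here is a minimal sketch that applies the same context-node approach to collect the two together. The inline HTML is a made-up stand-in that only mimics the class names used above; the real Bing markup may differ and can change at any time, and the 'url' and 'text' array keys are just illustrative names.

```php
<?php
// Hypothetical sample markup mirroring the structure assumed in the answer.
$html = <<<HTML
<html><body><ul>
  <li class="b_algo"><div class="b_title"><h2><a href="http://example.com/a">First result</a></h2></div></li>
  <li class="b_algo"><div class="b_title"><h2><a href="http://example.com/b">Second result</a></h2></div></li>
</ul></body></html>
HTML;

libxml_use_internal_errors(true); // suppress warnings about imperfect HTML
$doc = new DOMDocument();
$doc->loadHTML($html);
$x = new DOMXPath($doc);

$output = array();
foreach ($x->query("//li[@class='b_algo']") as $node) {
    // Relative path (leading "."), evaluated with $node as the context node.
    $link = $x->query("./div[@class='b_title']//h2/a", $node)->item(0);
    if ($link === null) {
        continue; // this result has no matching link
    }
    $output[] = array(
        'url'  => $link->getAttribute('href'),
        'text' => trim($link->textContent),
    );
}
print_r($output);
```

Selecting the a element itself, rather than its @href attribute, lets you read both the attribute (getAttribute) and the link text (textContent) from a single node.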