Parse DOMNodeList Into Data I can Easily Format in PHP


I have the following DOMNodeList Object being returned from Google and I need to parse through it.

I am parsing it into an array of DOMElement Objects, one for each warning with:

$new_product = _GSC_AtomParser::parse($resp->body);
$elements = $new_product->getWarnings();

$warnings = array();
foreach ($elements as $element):
    $warnings[] = $element;
endforeach;

Then I need to parse these DOMElement Objects to get the warnings:

[0] => DOMElement Object
(
    [tagName] => sc:warning
    [schemaTypeInfo] => 
    [nodeName] => sc:warning
    [nodeValue] => validation/missing_recommendedShoppinggoogle_product_categoryWe recommend including this attribute.
    [nodeType] => 1
    [parentNode] => (object value omitted)
    [childNodes] => (object value omitted)
    [firstChild] => (object value omitted)
    [lastChild] => (object value omitted)
    [previousSibling] => 
    [nextSibling] => (object value omitted)
    [attributes] => (object value omitted)
    [ownerDocument] => (object value omitted)
    [namespaceURI] => http://schemas.google.com/structuredcontent/2009
    [prefix] => sc
    [localName] => warning
    [baseURI] => /home/digit106/dev/public_html/manager/
    [textContent] => validation/missing_recommendedShoppinggoogle_product_categoryWe recommend including this attribute.
)

I want to format this into an array like so:

[warnings] => Array
    (
        [0] => Array
            (
                [domain] => Shopping
                [code] => validation/missing_recommended
                [location] => google_product_category
                [internalReason] => We recommend including this attribute.
            )
    )

But all that data seems to be nested into either the nodeValue or textContent.

How the heck do I parse this out?

There are 2 answers

Bruno Braga

Did you try the PHP DOMNodeList class?

Regards

Rikki

Unfortunately I don't know of anything that will automatically parse a DOMNodeList into a PHP associative array, though I agree it would be pretty useful.

What you will have to do is traverse the tree yourself and put the values into an associative array. The trick is that "nodeValue" and "textContent" on an element hold the concatenated text of all the child nodes inside it, which is why the code, domain, location and reason appear run together. To separate them, iterate over the child nodes and take the text of each one individually.

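$warnings = array();
foreach ($elements as $element):
    $warning = array();
    foreach ($element->childNodes as $child):
        // Skip whitespace text nodes and comments; only child elements carry data.
        if ($child->nodeType !== XML_ELEMENT_NODE):
            continue;
        endif;
        // Key by tag name; the value is the text between the tags.
        $warning[$child->nodeName] = $child->textContent;
    endforeach;
    $warnings[] = $warning;
endforeach;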

This is entirely untested, and you will have to adjust it to suit the exact information you need to extract, but it should give you the right idea. $child->nodeName contains the name of the tag, which may not be exactly "domain" etc. as you list: for a namespaced element it includes the prefix, so use $child->localName if you want the name without it. $child->textContent holds the text between the tags (or the concatenated text of any nested child nodes).
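To make the idea concrete, here is a minimal self-contained sketch that can be run without the Google client library. The sample XML is hypothetical: the child tag names (sc:code, sc:domain, sc:location, sc:internalReason) and the surrounding entry/sc:warnings wrapper are assumptions chosen to match the array in the question, not taken from an actual response, and warningsToArray is an illustrative helper name. It keys on localName so the "sc:" prefix is dropped.

```php
<?php
// Hypothetical sample document: the child tag names are assumed for illustration.
$xml = <<<XML
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:sc="http://schemas.google.com/structuredcontent/2009">
  <sc:warnings>
    <sc:warning>
      <sc:code>validation/missing_recommended</sc:code>
      <sc:domain>Shopping</sc:domain>
      <sc:location>google_product_category</sc:location>
      <sc:internalReason>We recommend including this attribute.</sc:internalReason>
    </sc:warning>
  </sc:warnings>
</entry>
XML;

// Illustrative helper: turns a list of warning elements into plain arrays.
function warningsToArray($elements)
{
    $warnings = array();
    foreach ($elements as $element) {
        $warning = array();
        foreach ($element->childNodes as $child) {
            // Ignore whitespace text nodes and comments between the tags.
            if ($child->nodeType !== XML_ELEMENT_NODE) {
                continue;
            }
            // localName is the tag name without the namespace prefix.
            $warning[$child->localName] = trim($child->textContent);
        }
        $warnings[] = $warning;
    }
    return $warnings;
}

$doc = new DOMDocument();
$doc->loadXML($xml);

// Stand-in for $new_product->getWarnings(): a DOMNodeList of sc:warning elements.
$elements = $doc->getElementsByTagNameNS(
    'http://schemas.google.com/structuredcontent/2009',
    'warning'
);

$warnings = warningsToArray($elements);
print_r($warnings);
```

With this sample input each warning becomes an array keyed by code, domain, location and internalReason, in document order. If the real response uses different tag names, the keys will follow those names instead, so check them with print_r before relying on any particular key.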