PHP Parsing XML to Get All String Values At Multiple Depths

I have an XML document like the following example:

<article>
  <body>
    <title>I <underline><bold>LOVE</bold></underline> Coding</title>
  </body>
</article>

I'd like to get the full text of the <title> node.

$xml = simplexml_load_file("file.xml");
// xpath() returns an array of matches, so take the first element.
$title = $xml->xpath('//title')[0] ?? null;
echo (string)$title;

I cannot seem to go deeper and grab the underlined/bold XML parts.

I would like the result to be the string "I LOVE Coding".

How can I achieve this? I am only getting "I Coding".

I've also tried xpath('string(//title)') but got an empty result.

There are 3 answers

1
Sammitch On BEST ANSWER

SimpleXML is, quite frankly, not a good interface to work with. It's simple in that there's not much to it, but also simple in that a fair bit is missing, so it frequently ends up needing more work anyway.

DOMDocument is more full-featured and far better to work with, IMO.

$xml = <<<_E_
<article>
  <body>
    <title>I <underline><bold>LOVE</bold></underline> Coding</title>
  </body>
</article>
_E_;

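// Load the XML string into a DOM tree.
$d = new DOMDocument();
$d->loadXML($xml);

// Query for the <title> element anywhere in the document.
$x = new DOMXPath($d);
$r = $x->query('//title');

// textContent concatenates the text of the node and all its descendants.
var_dump($r->item(0)->textContent);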

Output:

string(13) "I LOVE Coding"
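
As a side note on the string(//title) attempt from the question: SimpleXML's xpath() hands back an array of elements rather than scalar values, which would explain the empty result there. DOMXPath::evaluate() can return a typed result instead. A minimal sketch, reusing the sample document from the question (the variable names are only illustrative):

```php
<?php
$xml = <<<XML
<article>
  <body>
    <title>I <underline><bold>LOVE</bold></underline> Coding</title>
  </body>
</article>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// evaluate() returns a typed result; string() yields the string-value of
// the first matching node, i.e. all of its descendant text concatenated.
$title = $xpath->evaluate('string(//title)');

echo $title, PHP_EOL;
```

This should print I LOVE Coding without needing to index into a node list first.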
1
Dirk J. Faber On

You need to strip the tags that are inside <title> like so:

$xmlFile = simplexml_load_file('file.xml');
$title = strip_tags($xmlFile->body->title->asXML());

echo $title;

Or if your XML isn't a file but a string:

$xmlString = '<article>
  <body>
    <title>I <underline><bold>LOVE</bold></underline> Coding</title>
  </body>
</article>';

$xml = simplexml_load_string($xmlString);
$title = strip_tags($xml->body->title->asXML());

echo $title;
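
One thing to watch with this approach: asXML() returns serialized markup, so characters such as & remain escaped as entities after strip_tags(). A minimal sketch of decoding them afterwards, assuming a hypothetical title that contains an ampersand (this sample string is not from the question):

```php
<?php
// Hypothetical sample: the title contains an escaped ampersand.
$xmlString = '<article>
  <body>
    <title>Tom &amp; <bold>Jerry</bold></title>
  </body>
</article>';

$xml = simplexml_load_string($xmlString);

// strip_tags() removes the markup but leaves "&amp;" untouched.
$stripped = strip_tags($xml->body->title->asXML());

// Decode the XML entities to get the plain text back.
$title = html_entity_decode($stripped, ENT_QUOTES | ENT_XML1, 'UTF-8');

echo $title, PHP_EOL;
```

If the text never contains escaped characters, the extra decoding step is unnecessary and the shorter version above is enough.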
0
Olivier On

You can call dom_import_simplexml() to retrieve the DOM element, then use the textContent property:

$xml = <<<XML
<article>
  <body>
    <title>I <underline><bold>LOVE</bold></underline> Coding</title>
  </body>
</article>
XML;

$doc = simplexml_load_string($xml);
$title = $doc->xpath('//title')[0];
$dom = dom_import_simplexml($title);
echo $dom->textContent;

Result:

I LOVE Coding
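
If the document has several <title> elements at different depths, the same idea can be applied in a loop. A rough sketch, assuming a hypothetical input with two titles (the extra top-level title is made up for illustration):

```php
<?php
// Hypothetical input with two <title> elements at different depths.
$xml = <<<XML
<article>
  <title>Top <bold>level</bold></title>
  <body>
    <title>I <underline><bold>LOVE</bold></underline> Coding</title>
  </body>
</article>
XML;

$doc = simplexml_load_string($xml);

$titles = [];
foreach ($doc->xpath('//title') as $node) {
    // textContent concatenates the text of the element and all descendants.
    $titles[] = dom_import_simplexml($node)->textContent;
}

print_r($titles);
```

The matches come back in document order, so $titles should hold "Top level" followed by "I LOVE Coding".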