SimpleXML with decoded entities


How can I make SimpleXML replace HTML/XML entities with their respective characters in PHP?

Assume this XML document is held in a string:

$data = '<?xml version="1.0" encoding="ISO-8859-1"?><example>Tom &amp; Jerry</example>';

I want SimpleXML to decode &amp; to &, but it does not seem to do so by default. I have tried these two approaches, and neither worked:

$xml = new SimpleXMLElement($data);
$xml = new SimpleXMLElement($data, LIBXML_NOENT);

What's the best way to get XML entities decoded? I would expect the XML parser to do it, and I would like to avoid running html_entity_decode before parsing (which does not work either). Could this be a problem with the encoding of the string? If so, how could I track it down and fix it?


1 answer

hendr1x On

I'm not sure this will work in every case, but maybe:

$xml = new SimpleXMLElement(html_entity_decode($data));

http://www.php.net/manual/en/function.html-entity-decode.php
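
A small sketch that may help narrow this down. As far as I can tell, SimpleXML already decodes entities when the text content is read (for example by casting the element to a string), and the entity only reappears when the element is serialized again with asXML(). The snippet also checks the assumption that decoding with html_entity_decode before parsing leaves a bare & in the markup, which is not well-formed XML. The variable names $text and $parsedAfterDecode are made up for illustration.

```php
<?php
// Same document as in the question.
$data = '<?xml version="1.0" encoding="ISO-8859-1"?><example>Tom &amp; Jerry</example>';

$xml = new SimpleXMLElement($data);

// Casting the element to string returns its text content, entities decoded.
$text = (string) $xml;
echo $text, "\n"; // Tom & Jerry

// asXML() serializes back to markup, so the ampersand is escaped again.
echo $xml->asXML(), "\n";

// Decoding before parsing turns "&amp;" into a bare "&", so the parser rejects it.
libxml_use_internal_errors(true); // collect libxml errors instead of emitting warnings
try {
    new SimpleXMLElement(html_entity_decode($data));
    $parsedAfterDecode = true;
} catch (Exception $e) {
    $parsedAfterDecode = false;
}
var_dump($parsedAfterDecode);
```

If the cast gives the expected text, the remaining &amp; is probably coming from how the value is printed (asXML() or similar) rather than from the parser or the encoding declaration.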