I'm writing a PHP script to generate some XML documents, and I'm having some issues with SimpleXML and quotes.
If I have code like this:
$xml = new SimpleXMLElement('<myxml />');
$xml->addChild('title','My Feed');
$xml->addChild('description','Entity data here - &amp; &quot; &apos;');
If I print_r the $xml object, I get this:
print_r($xml);
SimpleXMLElement Object
(
[title] => My Feed
[description] => Entity data here - & " '
)
So it seems that once the text is in the object, the entities are converted back into their respective characters. However, when I call asXML() on the object to get the XML, it shows me this:
echo $xml->asXML();
<?xml version="1.0"?>
<myxml>
<title>My Feed</title>
<description>Entity data here - &amp; " '</description>
</myxml>
It turns the & back into an entity, but it seems to leave the quotes as literal characters. Shouldn't it convert them all to entities?
" and ' are only special characters in XML if they are inside an attribute value. Within the text content of an element, there is no ambiguity as to the meaning of " or ', as the only special tokens being looked for are < to start an opening or closing tag and & to start an entity reference.
So while
<foo bar="hello "world"" />
is invalid XML,
<foo>hello "world"</foo>
is not, so no escaping is required.
(Just because it's not required doesn't mean it's not possible, so there may be a fuller answer as to why SimpleXML doesn't at least retain the entities you'd put there voluntarily.)
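To make the difference concrete, here is a minimal sketch that rebuilds the document from the question and parses the result back. The note attribute and the <d> element are made up for illustration and are not part of the original code. It only shows that escaped and unescaped quotes in element text are equivalent to a parser; it does not explain SimpleXML's internals.

```php
<?php
// Same document as in the question, plus one hypothetical attribute
// ("note" is an assumption added for this example only).
$xml  = new SimpleXMLElement('<myxml />');
$desc = $xml->addChild('description', 'Entity data here - &amp; &quot; &apos;');
$desc->addAttribute('note', 'say "hi"');

$out = $xml->asXML();
echo $out;
// In the element text, only & comes back as an entity; the quotes stay literal.
// In the attribute value the quote cannot stay literal, because the value is
// itself delimited by quotes, so the serializer is expected to escape it there.

// Parsing the output gives back exactly the characters that were stored.
$parsed = simplexml_load_string($out);
echo (string) $parsed->description, "\n";         // Entity data here - & " '
echo (string) $parsed->description['note'], "\n"; // say "hi"

// Escaped and unescaped quotes in element text parse to the same string.
$a = simplexml_load_string('<d>hello &quot;world&quot;</d>');
$b = simplexml_load_string('<d>hello "world"</d>');
var_dump((string) $a === (string) $b);
```

Since both forms parse to the same text, any consumer that reads the file with a real XML parser should see no difference between them.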