Appending UTF-8 strings to DOMNode in PHP


In my project I want to deliver the CSS inline with the HTML using PHP. My CSS and HTML are collected from various UTF-8 files. The HTML encoding is preserved in the output, but the CSS encoding isn't: non-ASCII characters in the CSS come out escaped as &#nnnnn;. The odd thing is that I can var_dump the CSS string and see no &#nnnnn; escapes, but I cannot append it as a DOMText or DOMCDATASection without the text being escaped.

I have tried various combinations of htmlentities(), html_entity_decode(), mb_convert_encoding(), mb_detect_encoding(), utf8_encode(), utf8_decode(), createTextNode() and createCDATASection(), but cannot accomplish this simple task. I don't care how the output is encoded (although UTF-8 would be best), so long as the characters in the CSS display as characters.

My original code was something like:

<?php
    $document = new DOMDocument();
    $document->loadHTMLFile("text.html");
    $document->formatOutput = true;
    $xPath = new DOMXPath($document);
    $head = $xPath->query("//html/head")->item(0);
    $styleElement = $document->createElement("style");
    $styleElement->setAttribute("type", "text/css");
    $styles = $document->createTextNode(file_get_contents("style.css"));
    $styleElement->appendChild($styles);
    $head->appendChild($styleElement);
    echo $document->saveHTML();
?>

HTML:

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8">
    </head>
    <body>
        <p>Put a ✧ before this.</p>
    </body>
</html>

CSS:

@charset "utf-8";

p::before {
    content: '✧';
}

Output:

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8">
        <style type="text/css">&#65279;p::before {
                content: '&#10023;';
            }
        </style>
    </head>
    <body>
        <p>Put a ✧ before this.</p>
    </body>
</html>

(Where the browser renders: Put a ✧ before this.)
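
A side note on the output above: the escapes are decimal code points. 10023 is U+2727 (the ✧ itself) and 65279 is U+FEFF, the byte order mark, which suggests that style.css begins with a UTF-8 BOM that ends up in the text node. Below is a minimal sketch, assuming that is the case, of stripping the BOM before the string is passed to createTextNode(). The helper name stripUtf8Bom is made up for illustration and is not part of PHP. This would only remove the stray &#65279;; it does not by itself explain or fix the escaping of ✧.

```php
<?php
// Hypothetical helper (not a PHP built-in): remove a leading UTF-8
// byte order mark (bytes EF BB BF) from a string, if one is present.
function stripUtf8Bom(string $text): string
{
    $bom = "\xEF\xBB\xBF";
    if (strncmp($text, $bom, 3) === 0) {
        // Drop the three BOM bytes and keep the rest unchanged.
        return substr($text, 3);
    }
    return $text;
}

// Example input: a CSS string that starts with a BOM, as style.css is assumed to.
$css = stripUtf8Bom("\xEF\xBB\xBFp::before { content: '✧'; }");
```

In the original script this would wrap the file read, i.e. stripUtf8Bom(file_get_contents("style.css")).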

Desired output:

<!DOCTYPE html>
<html>
    <head>
        <meta charset="utf-8">
        <style type="text/css">
            p::before {
                content: '✧';
            }
        </style>
    </head>
    <body>
        <p>Put a ✧ before this.</p>
    </body>
</html>

(Where the browser renders: ✧Put a ✧ before this.)

(I know there are easier methods of accomplishing the above example with echos etc., but using DOM and XPath makes sense in the broader context.)
