Invalid XML entity references from Jekyll

I'm using Jekyll on GitHub Pages to run my blog.

It seems that Jekyll semi-randomly escapes an XML special character incorrectly, emitting &tt; where &lt; or &gt; belongs.

As an example, in the current version of the RSS feed, this source XML

</p>
<p>

in a single place becomes

&lt;/p&gt;
&lt;p&tt;

but it should have been

&lt;/p&gt;
&lt;p&gt;

&tt; is an invalid XML entity reference, so some XML parsers choke on that and refuse to go on.
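To locate every such reference in a generated feed, a small scan along the following lines may help. It is only a sketch: the helper name `undefined_entities` is hypothetical, and it assumes the feed declares no extra entities in a DTD, so that only the five names predefined by XML are valid.

```ruby
# The five entity names predefined by XML; any other named reference is
# undefined unless the document declares it in a DTD (assumed not the case).
PREDEFINED_ENTITIES = %w[lt gt amp quot apos].freeze

# Hypothetical helper: returns [line_number, entity_name] pairs for every
# named entity reference in `xml` that is not one of the predefined five.
# Numeric references such as &#38; are not matched and so are never reported.
def undefined_entities(xml)
  hits = []
  xml.each_line.with_index(1) do |line, number|
    line.scan(/&([A-Za-z_][A-Za-z0-9_.-]*);/).flatten.each do |name|
      hits << [number, name] unless PREDEFINED_ENTITIES.include?(name)
    end
  end
  hits
end

# The escaped fragment from the example above.
sample = "&lt;/p&gt;\n&lt;p&tt;\n"
undefined_entities(sample).each do |number, name|
  puts "line #{number}: &#{name};"
end
# prints "line 2: &tt;"
```

Running the same function over the contents of the generated RSS and Atom files would list each bad reference with its line number, which makes it easier to compare the two feeds and successive builds.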

At first I suspected an invisible, invalid character at that place in the source, but as far as I can tell, this isn't the case. What's more, this behaviour doesn't seem to be consistent:

The RSS feed currently has 7 such errors, of which the above is the first. However, the current Atom feed has only 5 such errors, and they are not in the same places. It's not only <p> tags that are affected, but other tags as well (e.g. a <ul> tag should always be escaped as &lt;ul&gt;, but in one place it is instead escaped as &tt;ul&gt;).

Furthermore, when I run

jekyll serve -w

on my local machine, I still see the same type of error, but not in the same places.

The HTML is XML escaped like this:

{{ post.content | xml_escape }}

Why does this happen, and what can I do about it?

1 answer

Answered by parkr

The only thing xml_escape does is call CGI::escapeHTML, which replaces certain characters with their entity counterparts (for example, < becomes &lt; and > becomes &gt;). If the bug is present in Jekyll, it's only because it's present in your version of Ruby's CGI module.
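One way to test that theory is to call CGI.escapeHTML directly, outside Jekyll, on the fragment from the question. The following is a minimal sketch, assuming it is run with the same Ruby installation that runs Jekyll; the repetition count of 1,000 is arbitrary, chosen only because the corruption is described as intermittent.

```ruby
# Loads CGI.escapeHTML; on older Rubies, use `require "cgi"` instead.
require "cgi/escape"

# The unescaped fragment from the question.
source = "</p>\n<p>"

escaped = CGI.escapeHTML(source)
puts escaped
# prints:
# &lt;/p&gt;
# &lt;p&gt;

# Repeat the call to look for the intermittent corruption described in the
# question; a correctly working CGI module never produces "&tt;".
corrupted = 1_000.times.count { CGI.escapeHTML(source).include?("&tt;") }
puts "corrupted results: #{corrupted}"
```

If the count stays at zero here while the feeds built by the same Ruby still contain &tt;, the escaping call itself is probably not where the corruption is introduced.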