sanitize-html changing ampersand in URL

I am using sanitize-html to sanitise HTML input before saving it to a database.

The following:

<link href="https://fonts.googleapis.com/css2?family=Questrial&display=swap" rel="stylesheet">

Is returned as:

<link href="https://fonts.googleapis.com/css2?family=Questrial&amp;display=swap" rel="stylesheet" />

The & in the href value has been changed to &amp;, which is ignored by some browsers and therefore changes the destination of the URL:

https://fonts.googleapis.com/css2?family=Questrial&amp;display=swap

!=

https://fonts.googleapis.com/css2?family=Questrial&display=swap

Per the warnings here, I don't want to change the decodeEntities setting, and I would still want & to be encoded as &amp; in text content.

Is there a setting in sanitize-html that lets me (safely) preserve URLs? Or an alternative approach which doesn't involve re-parsing the HTML after it has been sanitised?

Edit

This is a documented behaviour of the library. Per this page, the browser will use the correct value when a link is clicked. I assume this is also the case for href attributes that aren't clicked (e.g. in <link> tags), but I don't know for sure.
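
To make the comparison concrete, here is a minimal sketch in plain JavaScript. It only illustrates the difference between reading the serialised href literally and reading it after entity decoding. `decodeAmpersands` is a hypothetical helper, not part of sanitize-html: it is a simplified stand-in for the character-reference decoding that an HTML parser applies to attribute values, and it handles only `&amp;`.

```javascript
// Hypothetical helper (not a sanitize-html API): mimics only the
// `&amp;` -> `&` step of HTML attribute-value parsing.
function decodeAmpersands(attributeValue) {
  return attributeValue.replace(/&amp;/g, "&");
}

// The href as it appears in the sanitised markup, and the original href.
const serialized = "https://fonts.googleapis.com/css2?family=Questrial&amp;display=swap";
const original = "https://fonts.googleapis.com/css2?family=Questrial&display=swap";

// Read literally (no entity decoding), the query string contains the
// keys "family" and "amp;display" instead of "family" and "display".
const literalKeys = [...new URL(serialized).searchParams.keys()];

// After decoding, the value is identical to the original URL.
const decoded = decodeAmpersands(serialized);
const decodedKeys = [...new URL(decoded).searchParams.keys()];

console.log(literalKeys);
console.log(decodedKeys);
console.log(decoded === original);
```

So the two strings only differ as URLs if the markup is never run through an HTML parser. If the sanitised output is served as HTML, the parser decodes the attribute value before the URL is resolved, whichever element carries the attribute.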
