OK, so I need to replace all <, & and > plus all non-ascii characters with their html-entity counterparts. I've tried Underscore.string.escapeHTML but that didn't seem to touch the non-ascii chars.
For example I need this:
<div>föö bär</div>
converted into this:
&lt;div&gt;f&#246;&#246; b&#228;r&lt;/div&gt;
Obviously auml and ouml are not enough. I need a valid ascii string no matter what buttons the user chooses to push or, heaven forbid, even if they write with some moonspeak keyboard.
I found what you are looking for here
For your needs, use the htmlEncode function.
They define a number of other useful functions within the object:
HTML2Numerical: Converts HTML entities to their numerical equivalents.
NumericalToHTML: Converts numerical entities to their HTML equivalents.
numEncode: Numerically encodes unicode characters.
htmlDecode: Decodes HTML encoded text to its original state.
htmlEncode: Encodes HTML to either numerical or HTML entities. This is determined by the EncodeType property.
XSSEncode: Encodes the basic characters used in XSS attacks to malform HTML.
correctEncoding: Corrects any double encoded ampersands.
stripUnicode: Removes all unicode characters.
hasEncoded: Returns true if a string contains html encoded entities within it.
Source: www.strictly-software.com
Beware of the license agreement - GPL, The MIT License (MIT)
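If the license is a problem, the conversion asked for in the question can also be done without a library. The following is only a minimal sketch, not the Encoder library's implementation: toAsciiHtml is a hypothetical name, and it assumes that named entities are wanted for <, > and & and decimal numeric entities for everything outside ASCII.

```javascript
// Hypothetical helper, not part of the library mentioned above.
// Assumption: <, > and & become named entities; every character
// above U+007F becomes a decimal numeric entity.
function toAsciiHtml(input) {
  const named = { "<": "&lt;", ">": "&gt;", "&": "&amp;" };
  let out = "";
  // for...of walks the string by code point, so characters outside the
  // Basic Multilingual Plane (e.g. emoji) are encoded as a single entity.
  for (const ch of input) {
    const codePoint = ch.codePointAt(0);
    if (named[ch] !== undefined) {
      out += named[ch];
    } else if (codePoint > 127) {
      out += "&#" + codePoint + ";";
    } else {
      out += ch;
    }
  }
  return out;
}

console.log(toAsciiHtml("<div>f\u00f6\u00f6 b\u00e4r</div>"));
// prints: &lt;div&gt;f&#246;&#246; b&#228;r&lt;/div&gt;
```

Numeric entities are used for the non-ascii part because they exist for every code point, whereas named ones like auml and ouml only cover a fixed set. Quotes are left untouched here, so the result is not safe to place inside an HTML attribute value as-is.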