I'm building an app where users can post comments. I would like to allow emojis in the comment text, so I have followed the steps in this article and also here to make sure the encoding throughout my app is utf8mb4 (which MySQL needs in order to store 4-byte characters such as emojis).
Some steps I have taken are as follows:
I've set up my database with a utf8mb4 character set and collation, and have specified charset utf8mb4 in the DSN as follows:
"mysql:charset=utf8mb4"
After a comment is submitted, it is stored in the database; I then retrieve it and insert it into the HTML.
This works perfectly, and the emojis display correctly in my app.
However, I have also been taking measures to prevent XSS in my app, so I use the OWASP ESAPI JavaScript library to encode any unsafe data before inserting it into the HTML. Here is my "safe" code for displaying the comment text in the app (after it has come from the database):
var safe_comment_text_for_html = $ESAPI.encoder().encodeForHTML(comment_text);
var individual_comment_html = '<p class="text_of_comment">' + safe_comment_text_for_html + '</p>';
$('#comment_area').html(individual_comment_html);
Unfortunately, this causes the emojis to display as question marks.
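To make the problem concrete, here is a minimal sketch of the behavior I am after. It rests on an assumption I have not confirmed against the ESAPI source: that the encoder processes the string one UTF-16 code unit at a time, so an emoji (which JavaScript stores as a surrogate pair) gets split into two halves. The helper `escapeHtmlMinimal` is hypothetical and not part of ESAPI; it escapes only the characters that are significant in HTML and leaves everything else, including emojis, untouched.

```javascript
// Hypothetical helper (NOT part of ESAPI): escapes only the five characters
// that are significant in HTML text and quoted attribute values.
function escapeHtmlMinimal(text) {
  var replacements = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#x27;'
  };
  return String(text).replace(/[&<>"']/g, function (ch) {
    return replacements[ch];
  });
}

// An emoji outside the Basic Multilingual Plane occupies two UTF-16 code
// units in a JavaScript string, so code that walks the string by index
// sees two surrogate halves rather than one character.
var emoji = '\uD83D\uDE00';            // U+1F600, written as a surrogate pair
var codeUnits = emoji.length;          // number of UTF-16 code units
var codePoint = emoji.codePointAt(0);  // the full code point (ES2015+)

// Markup is neutralized while the emoji passes through unchanged.
var escaped = escapeHtmlMinimal('<b>hi</b> ' + emoji);
```

This only covers insertion into element content or a quoted attribute value; it is not meant as a replacement for context-specific encoding elsewhere (URLs, script blocks, CSS).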
Is there another OWASP encoding method I should use specifically for utf8mb4? If not, what would you advise so that the emojis display properly while the data is still made safe before it is inserted into the HTML? Thanks.