I have a simple HTML+JS page with a <textarea>.
The user is supposed to paste some text into it, and the page uses the pasted text as a GET parameter for a remote service URL that expects UTF-8.
My current code is like this:
<body>
<textarea id="editor"></textarea>
</body>
<script>
// Read the pasted text, clean it up, and percent-encode it for the query string.
var content = document.getElementById('editor').innerHTML;
content = stripTags(content);
content = decodeHtml(content);
content = encodeURIComponent(content);
var url = remoteServiceBaseUrl + content;
window.open(url, '_blank');

// Remove anything that looks like an HTML tag.
function stripTags(input) {
  return input.replace(/<(.|\n)*?>/g, '');
}

// Decode HTML entities by letting a detached textarea parse them.
function decodeHtml(html) {
  var txt = document.createElement("textarea");
  txt.innerHTML = html;
  return txt.value;
}
</script>
But it has trouble with, for example, the %0C (form-feed) character found in some Windows-1252 texts: when calling the URL, I get the error URIError: malformed URI sequence.
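For context, here is a minimal sketch of the two situations in which, as far as I know, the built-in URI functions raise URIError. It is not taken from my page: throwsURIError is a hypothetical helper written only for this illustration, and the sample inputs are made up.

```javascript
// Hypothetical helper (not part of the page above): reports whether a
// built-in URI function throws URIError for the given input.
function throwsURIError(fn, input) {
  try {
    fn(input);
    return false;
  } catch (e) {
    return e.name === 'URIError';
  }
}

// encodeURIComponent always emits UTF-8 percent-escapes for a JS string.
var encodedFormFeed = encodeURIComponent('\f');    // form feed, U+000C
var encodedEAcute = encodeURIComponent('\u00e9');  // "é", two bytes in UTF-8

// A lone surrogate is not well-formed UTF-16, so encoding it throws.
var loneSurrogateFails = throwsURIError(encodeURIComponent, '\uD800');

// "%E9" is "é" as a single Windows-1252 byte. That byte alone is not
// valid UTF-8, so decoding it throws.
var legacyByteFails = throwsURIError(decodeURIComponent, '%E9');

console.log(encodedFormFeed, encodedEAcute, loneSurrogateFails, legacyByteFails);
```

So the error seems to point either at a broken surrogate pair in the string being encoded, or at percent-escapes produced from a non-UTF-8 encoding being decoded somewhere along the way.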
So the question is: how do I convert text to UTF-8 with JavaScript, regardless of the source encoding?