encodeURIComponent using ISO-8859-1 encoding for a javascript string


I have been trying to get this to work but haven't had any luck so far. I am not entirely clear on what is going on, but I will try to explain as much as I can. My server-side JSP pages all use ISO-8859-1 encoding, which I do not want to change. All the requests and responses are in XML form. The POST request currently uses the JavaScript encodeURIComponent function, and everything worked well until a string contained special characters, for example: hello°world©®™test. When this string is POSTed from IE (with encodeURIComponent applied to the data part) and the page is reloaded, which should show the same string, it is instead rendered garbled, along the lines of: helloÂ°worldÂ©Â®â„¢test

I assume this happens because encodeURIComponent encodes the string as UTF-8 rather than ISO-8859-1, and when the page renders, the UTF-8 bytes are interpreted as ISO-8859-1 characters, which garbles the string.

Is there any way to solve this without converting the web pages to the UTF-8 charset?

The POST request has Content-Type set to "application/x-www-form-urlencoded"

Thanks in advance.


There is 1 answer

Daniel Martin (best answer)

First off, I would strongly encourage you just as a general matter of principle to abandon your allegiance to ISO-8859-1 and switch to UTF-8; however, that won't solve your immediate problem, so let's leave that battle for another day.

encodeURIComponent always uses UTF-8. This cannot be changed; though you could manually hack the percent encoding encodeURIComponent produces, I don't think that would be a productive use of anyone's time.
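As a quick illustration of that point, here is a minimal sketch that runs the sample string from the question through encodeURIComponent. The variable names are just for illustration. Each of °, © and ® comes out as two bytes and ™ as three, which is the UTF-8 form, not the single-byte ISO-8859-1 form.

```javascript
// The sample string from the question.
const input = "hello°world©®™test";

// encodeURIComponent percent-encodes the UTF-8 bytes of each non-ASCII character.
const encoded = encodeURIComponent(input);

console.log(encoded); // hello%C2%B0world%C2%A9%C2%AE%E2%84%A2test
```

Note that ™ (U+2122) has no ISO-8859-1 code at all, which is one more reason hand-rewriting the percent escapes is not an attractive route.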

From your description, I would actually place the problem further back: your server thinks that the string has those Â characters in it and so is sending back to your browser the necessary code to display those characters. Simply changing the encoding that your server is outputting would just result in your server sending the UTF-8 codes for Â, and not actually help.

So the issue is: how do we tell the server that the incoming data is percent-encoded UTF-8 and not, as the server apparently believes, percent-encoded 8859-1?
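To make the mismatch concrete, here is a rough sketch that simulates the two interpretations in JavaScript. The helper decodeAsLatin1 is a hypothetical function written for this illustration, not something the server actually runs. It turns each percent-encoded byte directly into the character with the same code, which is how ISO-8859-1 maps bytes to characters.

```javascript
// Hypothetical helper: treat every %XX escape as one ISO-8859-1 character.
// In ISO-8859-1 a byte value and its Unicode code point are the same number.
function decodeAsLatin1(percentEncoded) {
  return percentEncoded.replace(/%([0-9A-Fa-f]{2})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

const wire = encodeURIComponent("hello°world"); // "hello%C2%B0world"

console.log(decodeAsLatin1(wire));     // helloÂ°world  (what the server believes it received)
console.log(decodeURIComponent(wire)); // hello°world   (what was actually sent, read as UTF-8)
```

The stray Â is simply the first UTF-8 byte (0xC2) being read as a character in its own right.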

You don't specify in your post whether the string you're sending is being sent as part of the URL (that is, you're POSTing to some URL like http://myserver/mypage.jsp?theString=hello%C2%B0world%C2%A9%C2%AE%E2%84%A2test) or as part of the POST body. Normally with a POST you send data as part of the POST body. If that's the case, try adding

<% request.setCharacterEncoding("UTF-8"); %>

to the top of your jsp - that tells the server to interpret incoming requests as being in UTF-8, even if outgoing stuff is still 8859-1. If you have any <form> elements pointing at this page, you should add an accept-charset attribute to the form that says "UTF-8".
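For reference, the client side does not need to change for this fix. A minimal sketch of how a POST body like yours is typically assembled is below; buildFormBody is a hypothetical helper, and the parameter name theString is borrowed from the example URL above rather than from your actual code.

```javascript
// Hypothetical helper: build an application/x-www-form-urlencoded body
// from a plain object. Names and values are percent-encoded as UTF-8.
function buildFormBody(fields) {
  return Object.keys(fields)
    .map((name) => encodeURIComponent(name) + "=" + encodeURIComponent(fields[name]))
    .join("&");
}

const body = buildFormBody({ theString: "hello°world" });

console.log(body); // theString=hello%C2%B0world
```

This is the text the server reads from the request body, so the setCharacterEncoding call has to run before the first request.getParameter call for it to have any effect.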

If by chance what you're passing is in the URL itself, then you need to set the URIEncoding on whatever servlet container you're using; if it's Tomcat, see this question's answer.