I have run into a problem and have not found a proper solution: I need to decode a URL fragment that is percent-encoded in windows-1251 (cp1251).
I know there are the methods decodeURI() and decodeURIComponent(), but as far as I understand they only work for UTF-8. The one solution I found uses the deprecated functions escape() and unescape().
For example, take this sequence:
%EF%F0%EE%E3%F0%E0%EC%EC%E8%F0%EE%E2%E0%ED%E8%E5 (программирование)
Both decodeURI() and decodeURIComponent() throw an exception (URIError) on this input.
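A minimal reproduction of the failure, as a sketch (the helper name `tryDecode` is made up for illustration). The bytes are not valid UTF-8, so the built-in decoder rejects them:

```javascript
// The cp1251-encoded sample from above.
const encoded = '%EF%F0%EE%E3%F0%E0%EC%EC%E8%F0%EE%E2%E0%ED%E8%E5';

// Hypothetical helper: returns the decoded string, or the error name
// if the built-in UTF-8 decoder rejects the input.
function tryDecode(str) {
  try {
    return decodeURIComponent(str);
  } catch (err) {
    return err.name;
  }
}

const result = tryDecode(encoded);
console.log(result); // "URIError"
```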
I will be grateful for any help.
As far as I can see, the browser has no built-in support for percent-encoding with legacy charsets. You'll have to:
1. find the percent-encoded escapes in the input string;
2. decode the escaped octets into a JS String.

Below is one way to do it. For #1 I assume that only three-character upper-case escapes (such as %EF) need decoding and that the rest of the string is already ASCII, so I just use
inputStr.replace(/%([0-9A-F]{2})/g, replacerFunction)
for this.

For the actual decoding you need a mapping from win-1251 octets to JS characters. In the example below I build the mapping with the TextDecoder.decode() API, just for fun (and in case someone finds this answer while trying to convert between different charsets in JS). Note that it isn't universally supported at the time of writing: only Gecko and Blink support it.
There's also https://github.com/mathiasbynens/windows-1251 , which I initially wanted to use for this answer, but it turned out to be easier to just build the decoding map by hand.
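The two steps can be sketched roughly as follows. This is only an illustration under a few assumptions: the names `WIN1251_TABLE` and `decodeWin1251Uri` are made up here, the runtime must ship a TextDecoder that knows the 'windows-1251' label, and, as stated above, only upper-case escapes are decoded while everything else is passed through untouched.

```javascript
// Step 2 preparation: build a 256-character lookup string where the
// character at index N is what windows-1251 assigns to byte N.
// Every byte maps to a single UTF-16 code unit, so plain indexing works.
const allBytes = Uint8Array.from({ length: 256 }, (_, i) => i);
const WIN1251_TABLE = new TextDecoder('windows-1251').decode(allBytes);

// Step 1: find each %XX escape; step 2: swap it for the mapped character.
function decodeWin1251Uri(inputStr) {
  return inputStr.replace(/%([0-9A-F]{2})/g, (match, hex) => {
    return WIN1251_TABLE[parseInt(hex, 16)];
  });
}

console.log(decodeWin1251Uri('%EF%F0%EE%E3%F0%E0%EC%EC%E8%F0%EE%E2%E0%ED%E8%E5'));
// программирование
```

Building the table once up front keeps the replacer callback down to a single lookup per escape. Swapping the label passed to TextDecoder should work for other single-byte legacy charsets as well, provided the runtime supports them.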