I'm struggling to correctly encode/decode a JSON string for sending via query string in a GET request.
<html>
<head>
<script type="text/javascript">
function executeRequest(applyUriEncode) {
    var json = '{"foo":"⚡&❤很久很久以前"}';
    var xmlhttp = new XMLHttpRequest();
    xmlhttp.open('GET', 'https://example.com/test.php?json=' + (applyUriEncode ? encodeURIComponent(json) : json), false);
    xmlhttp.onreadystatechange = function() {
        if (xmlhttp.readyState == 4 && xmlhttp.status == 200) {
            console.log("applyUriEncode: " + (applyUriEncode ? "true\n" : "false\n"));
            console.log(xmlhttp.responseText + "\n");
        }
    };
    xmlhttp.send();
}
</script>
</head>
<body>
<button onClick="executeRequest(true);">Submit encoded</button>
<button onClick="executeRequest(false);">Submit unencoded</button>
</body>
</html>
<?php // test.php
echo $_GET['json'];
Output when clicking Submit encoded and Submit unencoded:
applyUriEncode: true
{"foo":"💀ðŸ•⚡💎&ðŸŽâ¤å¾ˆä¹…很久以å‰"}
applyUriEncode: false
{"foo":"⚡
Desired output is
{"foo":"⚡&❤很久很久以前"}
I need to encode the JSON because otherwise, special characters such as & will break the string. However, the result of encodeURIComponent does not seem decoded correctly by PHP. I tried urldecode on the server side, but that didn't change a thing (output remains the same).
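To illustrate why the unencoded request is cut off at the &, here is a minimal sketch that parses both variants with the standard URLSearchParams class. It only approximates what a server-side query-string parser does; it is not PHP's actual implementation, and the variable names are chosen for illustration.

```javascript
// Sample payload from the question.
const json = '{"foo":"⚡&❤很久很久以前"}';

// Unencoded: the "&" inside the JSON is treated as a parameter separator,
// so the "json" parameter ends right before it.
const rawValue = new URLSearchParams('json=' + json).get('json');

// Encoded: encodeURIComponent turns "&" into "%26" and every non-ASCII
// character into its percent-encoded UTF-8 bytes.
const encoded = encodeURIComponent(json);
const roundTrip = new URLSearchParams('json=' + encoded).get('json');

console.log(rawValue);                // the truncated value
console.log(encoded.includes('%26')); // true: the ampersand was escaped
console.log(roundTrip === json);      // true: the encoded form survives parsing
```

The truncated value matches the "Submit unencoded" output above, which suggests that the encoding step itself is needed and is not what corrupts the characters.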
I feel this is a fundamental question, and it should have an answer somewhere here on StackOverflow, but I couldn't find it. I found tons of questions with similar problems, but none led me to a solution for this specific problem.
Edit:
Inspired by the apparently AI-generated reply posted by @Adarsh Pattnaik, I played around with ChatGPT a bit myself. After a few attempts, it suggested adding <meta charset="UTF-8"> to the HTML. This did indeed yield the correct output.
However, I don't understand why. The HTML file itself was always encoded as UTF-8. Request and response headers sent/received to/from test.php are (and always were) of content-type text/html; charset=UTF-8 as seen on the network tab of Chrome. Content-type of the HTML is text/html (without charset=UTF-8) and this didn't change when adding the meta-directive.
So what difference does <meta charset="UTF-8"> make that it now yields the correct result?
The issue you're encountering is related to character encoding. When you use encodeURIComponent in JavaScript, it correctly percent-encodes the JSON string, including the Unicode characters. However, when PHP receives the query string, it does not automatically decode the percent-encoded Unicode characters back to their original form.
To fix this, you need to ensure that PHP is interpreting the incoming data as UTF-8 and then use urldecode to decode the percent-encoded string. Here's how you can modify your PHP code to achieve the desired output:
This code snippet assumes that your PHP environment is configured to use UTF-8 as the default character encoding. If it's not, you might need to explicitly set the character encoding to UTF-8 using mb_internal_encoding('UTF-8') at the beginning of your script.
Additionally, it's important to note that when you're sending JSON data in a query string, you should always use encodeURIComponent to encode the JSON string. This is because the query string has certain reserved characters (like &, =, +, ?, etc.) that can break the structure of the URL if not encoded. The encodeURIComponent function ensures that these characters are safely encoded so that they do not interfere with the URL's format.
On the client side, your JavaScript code is correct in using encodeURIComponent when setting the applyUriEncode flag to true. Always use the encoded version for sending data in a query string to avoid issues with special characters.
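On the <meta charset="UTF-8"> question raised in the edit: PHP's manual states that $_GET values are already URL-decoded, which fits the observation that urldecode changed nothing. One plausible explanation, stated here as an assumption and not confirmed by the thread, is that the corruption happens in the browser. The HTML response carries no charset, so the browser may decode the page (including the inline script's string literal) with a legacy single-byte encoding, and encodeURIComponent then faithfully encodes the already-garbled characters. The sketch below simulates that with Latin-1 for simplicity (Windows-1252 differs only for bytes 0x80 to 0x9F); the helper name misreadUtf8AsLatin1 is hypothetical.

```javascript
// Hypothetical helper: reinterpret the UTF-8 bytes of a string as one
// character per byte, roughly what happens when a UTF-8 page is decoded
// with a legacy single-byte charset.
function misreadUtf8AsLatin1(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, (b) => String.fromCharCode(b)).join('');
}

const original = '以';                          // one character from the sample JSON
const misread = misreadUtf8AsLatin1(original);  // three garbled characters

// What a correctly decoded page would send versus a mis-decoded one.
const sentCorrect = encodeURIComponent(original);
const sentGarbled = encodeURIComponent(misread);

// A server that decodes the query string correctly still gets the garbled text back.
const received = decodeURIComponent(sentGarbled);

console.log(sentCorrect);          // %E4%BB%A5
console.log(sentGarbled);          // %C3%A4%C2%BB%C2%A5
console.log(received === misread); // true
```

If this assumption holds, declaring the charset (via the meta tag or a charset in the HTML's Content-Type header) removes the guesswork, so the string literal is decoded as intended before it is ever encoded.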