What charset to use for JSON with Base64-encoded binary data?

Which character encoding (UTF-8, UTF-16, or UTF-32) is the most space-efficient for JSON that carries Base64-encoded binary data?

{ "data": "jA0EAwMCxamDRMfOGV5gyZPnyX1BB" }

Answer by Jordan Running (best answer)

Base64 output consists entirely of ASCII characters, so if the bulk of your JSON is Base64-encoded data, the most space-efficient encoding will be UTF-8. UTF-8 encodes ASCII characters (code points U+0000–U+007F) as one byte each, whereas UTF-16 and UTF-32 encode them as two and four bytes, respectively.
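To make the size difference concrete, here is a minimal Python sketch that serializes the same JSON text in each encoding and compares the byte counts. The 48-byte random payload and the variable names are assumptions chosen for illustration. The "-le" codec names are used so that no byte order mark is added to the UTF-16 and UTF-32 output.

```python
import base64
import json
import os

# Hypothetical payload: 48 random bytes standing in for real binary data.
raw = os.urandom(48)

# Base64 output is pure ASCII, so it can be decoded to str as ASCII.
b64 = base64.b64encode(raw).decode("ascii")
doc = json.dumps({"data": b64})

# Encode the same JSON text three ways and record the size in bytes.
# The "-le" variants are used so that no byte order mark is prepended.
sizes = {enc: len(doc.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}

for enc, size in sizes.items():
    print(f"{enc}: {size} bytes")
```

Because every character in this document is ASCII, the UTF-16 output is exactly twice and the UTF-32 output exactly four times the size of the UTF-8 output.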

Furthermore, it's a good idea to use UTF-8 regardless, because it's the default encoding for JSON and not all tools support other encodings. From RFC 7159:

8.1 Character Encoding

JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32. The default encoding is UTF-8, and JSON texts that are encoded in UTF-8 are interoperable in the sense that they will be read successfully by the maximum number of implementations; there are many implementations that cannot successfully read texts in other encodings (such as UTF-16 and UTF-32).
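In practice, following this advice just means encoding the serialized JSON text as UTF-8 before writing or sending it. The round trip below is a rough sketch in Python. The helper names `encode_payload` and `decode_payload` are hypothetical, not part of any library.

```python
import base64
import json


def encode_payload(raw: bytes) -> bytes:
    """Wrap raw bytes in a JSON object and serialize it as UTF-8 (hypothetical helper)."""
    text = json.dumps({"data": base64.b64encode(raw).decode("ascii")})
    return text.encode("utf-8")


def decode_payload(wire: bytes) -> bytes:
    """Parse UTF-8 JSON bytes and recover the original binary data (hypothetical helper)."""
    obj = json.loads(wire.decode("utf-8"))
    return base64.b64decode(obj["data"])


original = bytes(range(256))  # every possible byte value once
wire = encode_payload(original)
restored = decode_payload(wire)
```

Since the Base64 text and the JSON punctuation are all ASCII, `wire` holds one byte per character of the JSON text.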