Base64 encoding gives different values in C# and Java


I have been trying to get the same encoding in my C# code that I get from my Java code.

In Java this is what I am doing:

    JSONObject claimSet = new JSONObject();
    claimSet.put("partnerUrl", "https://test.com/testapply/abc/signup");

    String claimSetJson = claimSet.toString();
    byte[] claimSetJsonBytes = claimSetJson.getBytes(StandardCharsets.UTF_8);
    
    String claimSetBase64 = Base64.getEncoder().encodeToString(claimSetJsonBytes);

and the claimSetBase64 value is eyJwYXJ0bmVyVXJsIjoiaHR0cHM6XC9cL3Rlc3QuY29tXC90ZXN0YXBwbHlcL2FiY1wvc2lnbnVwIn0=

The equivalent code I have written in C# is:

    var claimSets = new Dictionary<string, object>()
    {
        { "partnerUrl", "https://test.com/testapply/abc/signup" },
    };
    string claimSetsJson = JsonSerializer.Serialize(claimSets);
    byte[] claimSetsjsonBytes = Encoding.UTF8.GetBytes(claimSetsJson);
    var claimSetsBase64 = Convert.ToBase64String(claimSetsjsonBytes);

and now the claimSetsBase64 value is eyJwYXJ0bmVyVXJsIjoiaHR0cHM6Ly90ZXN0LmNvbS90ZXN0YXBwbHkvYWJjL3NpZ251cCJ9

When I decode the two encoded strings, both appear to give me exactly the same value, so I expected the encoded strings to be the same too. What am I missing here? I have seen a similar question asked here, but according to its accepted answer the fix is to use UTF-8, and I am already using UTF-8 in both Java and C#.


There are 2 answers

2
Oleg Spiridonov On

If you decode your Base64 strings, you will see different original strings.

In the Java code, the string you encode is: {"partnerUrl":"https:\/\/test.com\/testapply\/abc\/signup"}.

In C#: {"partnerUrl":"https://test.com/testapply/abc/signup"}

So the string from the Java code contains a backslash \ as an escape character in front of every forward slash, and the two inputs to Base64 are not the same bytes.
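A quick way to confirm this is to decode both values and print them side by side. The following is a minimal sketch using only `java.util.Base64`; the class name `DecodeBoth` and its constants are made up for illustration, and the two Base64 values are copied from the question.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class DecodeBoth {
    // Base64 values copied verbatim from the question.
    static final String FROM_JAVA =
        "eyJwYXJ0bmVyVXJsIjoiaHR0cHM6XC9cL3Rlc3QuY29tXC90ZXN0YXBwbHlcL2FiY1wvc2lnbnVwIn0=";
    static final String FROM_CSHARP =
        "eyJwYXJ0bmVyVXJsIjoiaHR0cHM6Ly90ZXN0LmNvbS90ZXN0YXBwbHkvYWJjL3NpZ251cCJ9";

    // Decode a Base64 string back into the UTF-8 text it was built from.
    static String decode(String base64) {
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The first line shows \/ for every slash, the second shows plain /.
        System.out.println(decode(FROM_JAVA));
        System.out.println(decode(FROM_CSHARP));
    }
}
```

Printing the raw decoded text matters here: if you feed either string through a JSON parser first, the `\/` escapes are resolved and the two values look identical.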

0
Reilas On

To elaborate on the comments from the main question.

As mentioned by Oleg Spiridonov, if you decode the data you'll find that the Java implementation is escaping the solidus character.

{"partnerUrl":"https:\/\/test.com\/testapply\/abc\/signup"}
{"partnerUrl":"https://test.com/testapply/abc/signup"}

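Nothing in the RFC for JSON requires escaping the solidus. The only characters that must be escaped are the quotation mark, the reverse solidus, and the control characters U+0000 through U+001F. The string grammar does allow \/ as an optional escape, so both outputs are valid JSON for the same value.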
RFC 8259: The JavaScript Object Notation (JSON) Data Interchange Format.

If you look at the source for JSONObject, specifically the JSONStringify#string method, you'll find they escape all of these mentioned values, including the solidus, on line 316.

Additionally, there is an erratum regarding the solidus and its representation within a string.
RFC Errata Report.

Finally, if you look at the examples within RFC 8259, you'll find they are not escaping the solidus.

My conclusion is that the escaping of the solidus traces back to that erratum, and that the JSONObject class implemented it unknowingly.

My suggestion is to implement a method similar to the JSONStringify#string method in C#.

That keeps the change small and easy to remove if the Java JSONObject is ever changed to stop escaping the solidus.
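As a rough illustration of what such a method has to do, here is a sketch in Java that takes the unescaped JSON (what the C# code produces) and adds the solidus escape before Base64-encoding. The class name `SolidusEscape` and its methods are hypothetical, not part of any library. It assumes the input JSON contains no `\/` escapes already; a blanket replacement is safe under that assumption because in valid JSON a `/` can only appear inside a string value.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class SolidusEscape {
    // Escape every forward slash as \/ to mimic the Java-side output.
    // Assumption: the input does not already contain escaped slashes.
    static String escapeSolidus(String json) {
        return json.replace("/", "\\/");
    }

    // Encode text as Base64 over its UTF-8 bytes, as both snippets in the question do.
    static String toBase64(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String unescaped = "{\"partnerUrl\":\"https://test.com/testapply/abc/signup\"}";
        // Prints the same Base64 value the Java code in the question produced.
        System.out.println(toBase64(escapeSolidus(unescaped)));
    }
}
```

The same idea should carry over to C# by calling `string.Replace("/", "\\/")` on the serialized JSON before `Encoding.UTF8.GetBytes`.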