ByteBuffer to String and vice versa gives a different result


I have created two helper functions (one for ByteBuffer to String, and one for the reverse):

public static Charset charset = Charset.forName("UTF-8");

public static String bb_to_str(ByteBuffer buffer, Charset charset){
    System.out.println("Printing start");
    byte[] bytes;
    if(buffer.hasArray()) {
        bytes = buffer.array();
    } else {
        bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
    }
    return new String(bytes, charset);
}

public static ByteBuffer str_to_bb(String msg, Charset charset){
    return ByteBuffer.wrap(msg.getBytes(charset));
}

I have a data key that I am encrypting using AWS KMS, which gives me a ByteBuffer.

// Encrypt the data key using AWS KMS
ByteBuffer plaintext = ByteBuffer.wrap("ankit".getBytes(charset));
EncryptRequest req = new EncryptRequest().withKeyId(keyId);
req.setPlaintext(plaintext);    
ByteBuffer ciphertext = kmsClient.encrypt(req).getCiphertextBlob();

// Convert the byte buffer to String 
String cip = bb_to_str(ciphertext, charset);

Now the issue is that this does not work:

DecryptRequest req1 = new DecryptRequest().withCiphertextBlob(str_to_bb(cip, charset)).withKeyId(keyId);

but this works:

DecryptRequest req1 = new DecryptRequest().withCiphertextBlob(ciphertext).withKeyId(keyId);

What is wrong with my code?

1 answer

Joachim Sauer (best answer)

The error is the attempt to convert an arbitrary byte array into a String in bb_to_str(ciphertext, charset).

ciphertext does not in any reasonable way represent a readable string, and definitely doesn't use the charset that you specify (whichever one it is).

String is meant to represent Unicode text. Trying to use it to represent anything else will run into any number of problems (mostly related to encodings).
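To make the problem concrete, here is a minimal sketch of what happens when bytes that are not valid UTF-8 take a detour through a String. The class name LossyRoundTrip, the helper viaUtf8String and the sample bytes are made up for illustration; real KMS ciphertext will differ, but it is just as unlikely to be valid UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the question's code.
class LossyRoundTrip {

    // Decodes bytes as UTF-8 text and immediately encodes the text back to bytes.
    static byte[] viaUtf8String(byte[] bytes) {
        String text = new String(bytes, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for ciphertext: none of these bytes is valid UTF-8 on its own,
        // so the decoder substitutes the replacement character U+FFFD for them.
        byte[] binary = {(byte) 0xFF, (byte) 0xFE, (byte) 0x80};
        // Real text survives the same round trip unchanged.
        byte[] text = "ankit".getBytes(StandardCharsets.UTF_8);

        System.out.println(Arrays.equals(binary, viaUtf8String(binary))); // false
        System.out.println(Arrays.equals(text, viaUtf8String(text)));     // true
    }
}
```

The original bytes cannot be recovered once they have been replaced, which is consistent with the decrypt call failing on the reconstructed buffer.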

In some programming languages the string type is a binary string (i.e. doesn't strictly represent Unicode text), but those are usually the same languages that cause massive encoding confusions.

If you want to represent an arbitrary byte[] as a String for some reason, then you need to pick an encoding designed for binary data. Common choices are Base64 and hex strings. Base64 is more compact; hex is conceptually simpler, but takes up more space for the same amount of input data.
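As a rough sketch of that approach, the helpers below use java.util.Base64 from the standard library (Java 8 and later). The class name BinaryAsText and the methods remainingBytes, toBase64, fromBase64 and toHex are hypothetical replacements for bb_to_str and str_to_bb, and the sample bytes merely stand in for a KMS ciphertext blob. Copying through duplicate() is my own choice here, so that only the bytes between position and limit are read and the caller's buffer is left untouched.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class, not part of the question's code.
class BinaryAsText {

    // Copies the readable bytes (position..limit) without moving the caller's position.
    static byte[] remainingBytes(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate();
        byte[] bytes = new byte[view.remaining()];
        view.get(bytes);
        return bytes;
    }

    // Binary -> text: every byte value is representable, so nothing is lost.
    static String toBase64(ByteBuffer buffer) {
        return Base64.getEncoder().encodeToString(remainingBytes(buffer));
    }

    // Text -> binary: the exact inverse of toBase64.
    static ByteBuffer fromBase64(String text) {
        return ByteBuffer.wrap(Base64.getDecoder().decode(text));
    }

    // Hex alternative: two characters per byte, so longer than Base64.
    static String toHex(ByteBuffer buffer) {
        StringBuilder sb = new StringBuilder();
        for (byte b : remainingBytes(buffer)) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for an encrypted blob (assumption: real ciphertext is longer).
        byte[] raw = {(byte) 0xFF, (byte) 0xFE, (byte) 0x80, 0x00, 0x41};
        ByteBuffer ciphertext = ByteBuffer.wrap(raw);

        String cip = toBase64(ciphertext);
        ByteBuffer restored = fromBase64(cip);

        System.out.println(Arrays.equals(raw, remainingBytes(restored))); // true
        System.out.println(cip.length() + " Base64 chars vs " + toHex(ciphertext).length() + " hex chars");
    }
}
```

With helpers like these, the string cip can be stored or transmitted as ordinary text, and fromBase64(cip) yields a buffer with the same bytes as the original ciphertext to pass to withCiphertextBlob.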