Dynamic String Size

185 views Asked by At

I am trying to decode a base64 encoded string. I wrote the following piece of code for that:

String bytesEncoded = "rO0ABXNyADZ6YS5jby5zYi5wYXltZW50cy50by5pbnN0cnVjdGlvbi5CYXRjaEZpbGVVcGxvYWRD"
+ "UlVEVE8dnJ9z1jdsQwIAB0wAEGFic29sdXRlUGF0aE5hbWV0ABJMamF2YS9sYW5nL1N0cmluZztM"
+ "AAliYXRjaFR5cGVxAH4AAUwAFmN1c3RvbWVyUGF5bWVudFR5cGVLZXl0ABBMamF2YS9sYW5nL0xv"
+ "bmc7TAAhZW5jb2RlZEJ5dGVTdHJlYW1QYXltZW50QmF0Y2hGaWxlcQB+AAFMAAhmaWxlTmFtZXEA"
+ "fgABTAAIZmlsZVR5cGVxAH4AAUwABmlzc3Vlc3QAJEx6YS9jby9zYi9jb3JlL2NvbW1vbi90by9J"
+ "c3N1ZUxvZ1RPO3hyAB96YS5jby5zYi5jb3JlLnRvLkFic3RyYWN0Q1JVRFRP9ka5cD8D+JUCAAJK"
+ "AA12ZXJzaW9uTnVtYmVyTAAGYWN0aW9ucQB+AAF4cP//////////dAAGdXBsb2FkdACYQzpcbkJv"
+ "bF9Mb2FkVGVzdGluZ1xQZXJmb3JtYW5jZVRlc3RpbmdcbkJPTFxDVkFcUmVsZWFzZTE1XE1pc2Mg"
+ "RG9jc1xGaWxlcyBmb3IgVXBsb2FkXG5Cb2wgUHJvcCBSMTMtIERvbWVzdGljIGFkaG9jIDA4MDgy"
+ "MDE0IFRhbnphbmlhQmVuMiBWYWxpZCBQZXJmMi54bWx0AAdQYXltZW50c3IADmphdmEubGFuZy5M"
+ "b25nO4vkkMyPI98CAAFKAAV2YWx1ZXhyABBqYXZhLmxhbmcuTnVtYmVyhqyVHQuU4IsCAAB4cAAA"
+ "AAAAJKWidANsSDRzSUFBQUFBQUFBQU1WVlhXL1RNQlI5UitJL1JIMHZUanNZcmVSbFN0TU5LcUZS"
+ "MnNJUWIzZk9YV3Nwc1lmamxKVmZqeE5ucVowR05DUWtwS3J4T2ZmNmZweHJKL1R5TWMrQ1BhcUNT"
+ "M0V4R0wwS0J3RUtKbE11dGhlRHo1dnI0V1FRRkJwRUNwa1VlREVRY25BWnZYeEJaNkRaTHFKSldX"
+ "aVpvNnJoQ3UrakZXWUlCUVliRUQ5QmNBakNTVGdaaDZQWHdSTFZmVENtNUdRTFhZaENxNUpwVThJ"
+ "eG9rTldUZzRjaG1FNE9zYnBPTklPTnM0bS9URDVGdHR0WGZjVmZpKzV3dlRxRVZsWnNYUFFHRlVW"
+ "RDhPSitWSFM3MEdYU2pJc0NpUFV4d2RiK2tQTDNFQ08wUnJWbmpQOGdIdk1LT2tZSGVjdmtKVVkz"
+ "VWlWZytkbmVVcE9FeVU3VUZzc1pnZ0tWYlIrSHhzMVBJcHVGSWdDT3BJNlpLT01JMlBINkViWUhC"
+ "NHdtbFBTcFdnaVM2RlJMVUhwUTcySm5EQlhqMndIWW91cldqUGl3eWVVU0tHVkNXeGo5TEtmU3Fu"
+ "aEtEN3A0QmE2cG9qR2pGVUZtVzdxZW9sOTJBRVErMGptaTVzeXY2dEVJODY2MmRuQWFYaHUxQXFu"
+ "MC9Fb1BLZkVOOUxGSFZSNW0wZWI4K21rWVJybk5lT2N2cGF6T3QralNrcWx6TTA3UkxQYlpTTzF5"
+ "M3JxUDMrK3c3OGJjdktNSVF1K1JjV2haOVpMT09Rb3pMeWdxQWZSd2Y5MkdLT3o2ZmpOMlg4YWh0"
+ "dTZpWjFXQ2RPbWw3a1J1dENjeGVsT3NxZkdsdEs4UkxNNFRaVzV5YVlodS9qQVJkVzdoOVphSWVw"
+ "R0ZSZllFSWxNNjNlQ0F6YnloMmo4ajh1NlFuVllsM2R6dnVlRnZSbDlaTU1kclczMHRscml0eHR2"
+ "c2RMcW1nc1FqRU5XeWNoMWFjK2lOODVqaEhiVkpHbmE4TkJza1VScjh6ZTdmZmVWa2dyUm1WR2U3"
+ "ZFpTNmRyRkRQN3MvSzJ4K1RRbC9iV1FwdEpGVlh5T0tRZDErR1B4ZFU2YjJldWpvMGV2NENlaS9h"
+ "YW1ubUw4cTAyOHp5R3hIOXBmc2pQKzlLZ0hBQUE9dABDbkJvbCBQcm9wIFIxMy0gRG9tZXN0aWMg"
+ "YWRob2MgMDgwODIwMTQgVGFuemFuaWFCZW4yIFZhbGlkIFBlcmYyLnhtbHBw";

byte[] decodedBytes = Base64.decodeBase64(bytesEncoded);
System.out.println("decodedBytes " + new String(decodedBytes));

The output that i get for this shows as below:

System.out: decodedBytes ’

Now my issue is, i want to know if this is due to variable length. If yes is there a way to increase the length?

I am new to java and did this by searching from internet. Please ignore if i am being very naive.

I tried to just convert a part of the encoded string (just the last line. shown below) and it showed me the string properly.

String bytesEncoded = "YWRob2MgMDgwODIwMTQgVGFuemFuaWFCZW4yIFZhbGlkIFBlcmYyLnhtbHBw";

byte[] decodedBytes = Base64.decodeBase64(bytesEncoded);
System.out.println("decodedBytes " + new String(decodedBytes));

Output for the same below:

System.out: decodedBytes adhoc 08082014 TanzaniaBen2 Valid Perf2.xmlpp
4

There are 4 answers

0
icza On

The decoded bytes simply don't denote a string in the default character encoding (which may vary from platform to platform).

It could be it doesn't even denote a string at all (which seems likely in your case). But if so, you should always specify the encoding explicitly like this:

String s = new String(bytesDecoded, charsetName);
// or
String s = new String(bytesDecoded, charset);

The constructor of String can take any length of a byte array, that is not the problem.

Here's an example that charset matters:

String s = "Hi éáű!";
Systme.out.println(s); // Prints "Hi éáű!" obviously

byte[] b = s.getBytes(StandardCharsets.UTF_8);

// Next line also prints "Hi éáű!", charsets match:
System.out.println(new String(b, StandardCharsets.UTF_8));

// Next line prints "Hi éáű!", decoded with a different charset!
System.out.println(new String(b, StandardCharsets.ISO_8859_1));

Also note that the String constructor does not interpret BOM sequences (e.g. if your decoded byte array starts with a byte order mark sequence, it will not be processed properly but interpreted as bytes of the string).

0
Rick Riemer On

Since the decoded bytes actually contain a binary file (with small text parts), you probably do not mean to treat the decoded bytes as encoded text characters, which is what String(byte[]) will do for you.

Instead you may want to print only the printable characters (by printing out the bytes char-by-char, and testing each char to see if it's displayable), like described in this topic or by displaying it as a string of hexadecimal characters, like described in this topic

3
jguillo On

The byte block you are trying to convert is definitely not a string of text, but binary data of some kind.

It contains multiple 0-valued bytes, which usually denote a string terminator and that's the reason why your original code only shows two characters.

2
Benjamin Marwell On

Use apache commons Codec. It has a Base64 class.

use it like so:

import org.apache.commons.codec.binary.Base64;
[..]
byte[] decoded = Base64.decodeBase64(base64string);

Maven artifact:

<dependency>
    <groupId>commons-codec</groupId>
    <artifactId>commons-codec</artifactId>
    <version>1.9</version>
</dependency>