How to decode a 9 digit integer to some random 4 digit word


How can I encode a 7-digit integer as a 4-character string in Java?

I have a base-36 encoder that produces 6 characters.

For example, 230150206 is converted to 3T0X1A.

The code for it is as follows:

public static void main(String[] args) {
    String f = "230150206";
    int d = Integer.parseInt(f);
    StringBuilder b36num = new StringBuilder();
    // Peel off the least significant base-36 digit until nothing is left.
    do {
        b36num.insert(0, base36(d % 36));
        d = d / 36;
    } while (d > 0);
    System.out.println(b36num.toString());
}

/**
 * Takes a number between 0 and 35 and returns the character representing
 * it: 0 is '0', 1 is '1', 10 is 'A', 11 is 'B', ... 35 is 'Z'.
 * @param x the number to convert to a base-36 digit
 * @return the character representing the number in base 36
 */
private static char base36(int x) {
    if (x < 10) {
        return (char) ('0' + x);
    }
    return (char) ('A' + x - 10);
}

Can someone share another way to achieve this?

I could take a substring of the resulting string, but I am looking for another way to do it.

There are 2 answers

Patricia Shanahan

Here is a method, shown in a simple test program, that accepts any String as the set of digits for the result. As the initial print shows, 62 digits give 62^4 = 14,776,336 four-character combinations, which is enough to cover every 7-digit decimal number, so I recommend the decimal digits plus the lowercase and uppercase letters for the 7-digit case.

To cover 9 decimal digits in four encoded digits you would need at least 178 characters (177^4 is just under 10^9, 178^4 is just over), which is not possible using only the 7-bit ASCII characters, since there are only 128 of them. You would have to decide which additional characters to use as digits.
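The alphabet sizes quoted above can be checked with a few lines of integer arithmetic. The following is only a sketch: the class `AlphabetSize` and its methods are hypothetical names introduced for illustration, and `count` is the number of distinct values to cover (10^7 for all 7-digit numbers including leading zeros, 10^9 for 9 digits).

```java
// Hypothetical helper, not part of the answer's code: finds the smallest
// alphabet size whose fixed-length strings can represent `count` values.
final class AlphabetSize {
    static int minAlphabetSize(long count, int length) {
        int base = 1;
        // Grow the alphabet until base^length covers every value.
        while (pow(base, length) < count) {
            base++;
        }
        return base;
    }

    // Integer power that saturates at Long.MAX_VALUE instead of overflowing.
    static long pow(int base, int exp) {
        long result = 1;
        for (int i = 0; i < exp; i++) {
            if (result > Long.MAX_VALUE / base) {
                return Long.MAX_VALUE;
            }
            result *= base;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(minAlphabetSize(10_000_000L, 4));    // prints 57
        System.out.println(minAlphabetSize(1_000_000_000L, 4)); // prints 178
    }
}
```

The 7-digit case needs at least 57 symbols, so the 62-character alphabet has some room to spare, while the 9-digit case confirms the 178 figure.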

public class Test {
  public static void main(String[] args) {
    String characters = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    System.out.println(Math.pow(characters.length(), 4));
    testit(230150206, "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ");
    testit(230150206, characters);
  }

  private static void testit(int num, String characters){
    System.out.println(num + " "+compact(num, characters));
  }

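  public static String compact(int num, String characters){
    if (num < 0) {
      throw new IllegalArgumentException("num must not be negative");
    }
    StringBuilder compacted = new StringBuilder();
    // do-while so that 0 yields its digit instead of an empty string
    do {
      compacted.insert(0, characters.charAt(num % characters.length()));
      num /= characters.length();
    } while (num != 0);
    return compacted.toString();
  }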
}

Output:

1.4776336E7
230150206 3T0X1A
230150206 fzGA6
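Since the encoded string is usually only useful if it can be turned back into the number, here is a rough sketch of the reverse direction. The class `Expand` and the method `expand` are hypothetical names, not part of the test program above; the sketch assumes the same alphabet string is passed to both directions.

```java
// Hypothetical inverse of compact(): rebuilds the number from its digits.
final class Expand {
    static int expand(String encoded, String characters) {
        long value = 0;
        for (char c : encoded.toCharArray()) {
            int digit = characters.indexOf(c);
            if (digit < 0) {
                throw new IllegalArgumentException("Not a digit in this alphabet: " + c);
            }
            // Shift the accumulated value by one position and add the new digit.
            value = value * characters.length() + digit;
            if (value > Integer.MAX_VALUE) {
                throw new ArithmeticException("Value does not fit in an int");
            }
        }
        return (int) value;
    }

    public static void main(String[] args) {
        String base36 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
        String base62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
        System.out.println(expand("3T0X1A", base36)); // prints 230150206
        System.out.println(expand("fzGA6", base62));  // prints 230150206
    }
}
```

Note that `indexOf` makes each digit lookup linear in the alphabet size, which is fine for short strings like these.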
Raniz

All 7-digit numbers in base 10 fit inside 24 bits (log2(9999999) < 24), that is, 3 bytes. Ascii85 encodes every 4 bytes as 5 characters (25% overhead) and a final partial group of n bytes as n + 1 characters, so the 3 bytes fit in 4 characters.
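The size arithmetic can be sketched in code as follows. The class `SizeCheck` and its method names are made up for this illustration; only `Integer.numberOfLeadingZeros` is a standard library call.

```java
// Hypothetical helpers illustrating the size calculation above.
final class SizeCheck {
    // Bits needed to represent a non-negative int (at least 1 bit).
    static int bitsNeeded(int n) {
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(n));
    }

    // Whole bytes needed to hold that many bits.
    static int bytesNeeded(int n) {
        return (bitsNeeded(n) + 7) / 8;
    }

    // Ascii85 output length without delimiters: 5 characters per full
    // 4-byte group, and n + 1 characters for a trailing group of n bytes.
    static int ascii85Length(int byteCount) {
        int fullGroups = byteCount / 4;
        int rest = byteCount % 4;
        return fullGroups * 5 + (rest == 0 ? 0 : rest + 1);
    }

    public static void main(String[] args) {
        System.out.println(bitsNeeded(9999999));                 // prints 24
        System.out.println(bytesNeeded(9999999));                // prints 3
        System.out.println(ascii85Length(bytesNeeded(9999999))); // prints 4
    }
}
```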

Based on this answer by Mat Banik, you can do this:

public class Ascii85Test {

    static int[] numbers = {
            9999999,
            490,
            7910940,
    };

    public static void main(String[] args) {
        for(int number : numbers) {
            // Convert the number into 3 bytes
            byte[] numberBytes = new byte[3];
            numberBytes[0] = (byte) ((number >> 16) & 0xFF);
            numberBytes[1] = (byte) ((number >> 8) & 0xFF);
            numberBytes[2] = (byte) (number & 0xFF);

            // Ascii85 encode the bytes. The encoded string will be
            // "<4 characters>~\n", so we only keep the first 4 characters
            String encoded = Ascii85Coder.encodeBytesToAscii85(numberBytes).substring(0, 4);

            // Decode them again, add the trailing ~ that we trimmed
            byte[] decodedBytes = Ascii85Coder.decodeAscii85StringToBytes(encoded + "~");

            // Convert the 3 bytes into a number
            int decodedNumber = ((decodedBytes[0] << 16) & 0xFF0000)
                    | ((decodedBytes[1] << 8) & 0xFF00)
                    | (decodedBytes[2] & 0xFF);

            System.out.printf("%s -> %s -> %s%n", number, encoded, decodedNumber);
        }
    }
}

Output:

9999999 -> R$N4 -> 9999999
490 -> !!2? -> 490
7910940 -> Gd\R -> 7910940
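The program above depends on an external `Ascii85Coder` class. For readers who do not have it, the same 3-byte case can be sketched without any helper library. This is an illustrative version under stated assumptions, not the code from the linked answer: the class `Ascii85ThreeBytes` is a hypothetical name, and it only handles values from 0 to 0xFFFFFF.

```java
// Hypothetical self-contained sketch: 24-bit value <-> 4 Ascii85 characters.
final class Ascii85ThreeBytes {
    static String encode(int number) {
        if (number < 0 || number > 0xFFFFFF) {
            throw new IllegalArgumentException("Need a value that fits in 24 bits");
        }
        // 3 data bytes followed by one zero padding byte form a 4-byte group.
        long group = ((long) number) << 8;
        char[] digits = new char[5];
        for (int i = 4; i >= 0; i--) {
            digits[i] = (char) ('!' + group % 85);
            group /= 85;
        }
        // One byte of padding was added, so one character is dropped.
        return new String(digits, 0, 4);
    }

    static int decode(String encoded) {
        if (encoded.length() != 4) {
            throw new IllegalArgumentException("Expected exactly 4 characters");
        }
        long group = 0;
        // Pad with 'u', the highest Ascii85 digit, before decoding.
        for (char c : (encoded + "u").toCharArray()) {
            if (c < '!' || c > 'u') {
                throw new IllegalArgumentException("Not an Ascii85 character: " + c);
            }
            group = group * 85 + (c - '!');
        }
        if (group > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("Not a valid Ascii85 group");
        }
        return (int) (group >> 8); // discard the padding byte
    }

    public static void main(String[] args) {
        for (int number : new int[] {9999999, 490, 7910940}) {
            String encoded = encode(number);
            System.out.printf("%s -> %s -> %s%n", number, encoded, decode(encoded));
        }
    }
}
```

Run as is, this should print the same three lines as the output above.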

An int in Java has at most 10 digits (11 if you count the minus sign) and takes up 4 bytes. Since Ascii85 encodes each 4-byte group as 5 characters, any int can be encoded in 5 characters.
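A minimal sketch of that 5-character case, again without an external library, might look like this. The class `Ascii85Int` is a hypothetical name, and the sketch deliberately skips the "z" shortcut that some Ascii85 variants use for an all-zero group, so that the output is always exactly 5 characters.

```java
// Hypothetical sketch: any 32-bit int <-> exactly 5 Ascii85 characters.
final class Ascii85Int {
    static String encode(int number) {
        // Treat the int as an unsigned 32-bit group so negatives work too.
        long group = Integer.toUnsignedLong(number);
        char[] digits = new char[5];
        for (int i = 4; i >= 0; i--) {
            digits[i] = (char) ('!' + group % 85);
            group /= 85;
        }
        return new String(digits);
    }

    static int decode(String encoded) {
        if (encoded.length() != 5) {
            throw new IllegalArgumentException("Expected exactly 5 characters");
        }
        long group = 0;
        for (char c : encoded.toCharArray()) {
            if (c < '!' || c > 'u') {
                throw new IllegalArgumentException("Not an Ascii85 character: " + c);
            }
            group = group * 85 + (c - '!');
        }
        if (group > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("Group exceeds 32 bits");
        }
        return (int) group; // the narrowing cast restores negative values
    }

    public static void main(String[] args) {
        for (int number : new int[] {0, 230150206, Integer.MAX_VALUE, Integer.MIN_VALUE, -1}) {
            String encoded = encode(number);
            System.out.printf("%s -> %s -> %s%n", number, encoded, decode(encoded));
        }
    }
}
```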