Decoding a bitString using Huffman code table Java

1.4k views Asked by At

The goal is to convert a bitString into a plain text using Huffman code table,

r=000
h=001
o=01
w=100
d=1010
e=1011
l=11

I stored the Huffman code table in two different String[] arrays:

String[] ch = {"r", "h", "o", "w", "d", "e", "l"};
String[] b = {"000", "001", "01", "100", "1010", "1011", "11"};

According to the Huffman code table, the following bitString is equivalent to the string "helloworld".

String bits = "001101111110110001000111010";

Now I want to loop through each set of bits in order to match up its corresponding character:

StringBuilder sb = new StringBuilder();

for(int i = 0; i < bits.length(); i++) {
    if(bits.substring(0, b[i].length()).equals(b[i])) {
        sb.append(ch[i]);
        bits = bits.substring(b[i].length());
    }
}

The problem here is that each time a match is found, I cannot find the way to "reset" the loop and go back to b[0] so I could check b[i] back from the beginning.

1

There are 1 answers

0
Lothar On

You need to read in the source data "bit" by "bit" and check every time if it now is a valid huffman code. Instead of arrays I suggest using a Map (alternatively you can set up a tree-structure and walk though it step by step), otherwise the performance becomes really slow because you have to go through the whole array for every single bit most of the time.

Here is an example using your huffman code table:

import java.util.HashMap;


public class HuffmanDecoder {
    private static HashMap<String, String> huffMap = new HashMap<>();

    static {
        huffMap.put("000", "r");
        huffMap.put("001", "h");
        huffMap.put("01", "o");
        huffMap.put("100", "w");
        huffMap.put("1010", "d");
        huffMap.put("1011", "e");
        huffMap.put("11", "l");
    }

    public final static void main(String[] args) {
        char[] bits = "001101111110110001000111010".toCharArray();

        StringBuffer result = new StringBuffer();
        StringBuffer temp = new StringBuffer();
        for (int i = 0; i < bits.length; i++) {
            temp.append(bits[i]);
            String val = huffMap.get(temp.toString());
            if (val == null) {
                continue;
            }
            result.append(val);
            temp.setLength(0);
        }

        System.out.println(result);
    }
}

As soon as a valid code is found, the corresponding decoded value is added to the result-buffer and the temporary buffer is resetted. What's missing is some error handling when receiving data that doesn't lead to a valid code value but the implementation of that is dependend of the actual code, e.g. for SFF-data (fax images), the decoder continues reading until it reaches the end-of-line-marker.