Image compression with Huffman tree


So I recently built a Huffman tree and it worked well for encoding text. Now I wanted to see if I could compress images as well. Intuitively, images with little variation (like abstract images) should compress better. My question is whether I'm thinking about this correctly. Let's say I have a JPG image that is 786x463 pixels. Each pixel has 3 color values (RGB), and each value can take 256 different levels, so 8 bits per channel. With three channels, each pixel is effectively 24 bits in total. So (786 x 463) * 24 = 8,734,032 bits, which is about 1 MB.
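Just to make the raw-size arithmetic above concrete, here is a minimal sketch (class name `RawSize` is made up for illustration):

```java
public class RawSize {
    public static void main(String[] args) {
        int width = 786, height = 463;
        int bitsPerPixel = 3 * 8; // RGB, 8 bits per channel
        long totalBits = (long) width * height * bitsPerPixel;
        long totalBytes = totalBits / 8;
        System.out.println(totalBits);  // 8734032
        System.out.println(totalBytes); // 1091754 bytes, i.e. roughly 1 MB
    }
}
```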

Now I can loop over all the pixels, put the three color values of each into a list, then count the occurrences of each value, remove the duplicates, and put them in my tree.

private void saveColors() {
    BufferedImage img = readImage(url);
    int width = img.getWidth();
    int height = img.getHeight();
    int size = (width * height) * 3; // three samples (R, G, B) per pixel
    pixels = new ArrayList<Integer>(size);

    WritableRaster inraster = img.getRaster();
    for (int i = 0; i < width; i++) {
        for (int j = 0; j < height; j++) {
            pixels.add(inraster.getSample(i, j, 0)); // red
            pixels.add(inraster.getSample(i, j, 1)); // green
            pixels.add(inraster.getSample(i, j, 2)); // blue
        }
    }
    countColors();
}

/**
 * Counts occurrences and removes duplicates
 */
private void countColors() {
    // Distinct sample values (the HashSet drops the duplicates)
    List<Integer> duplicateRemoved =
            new ArrayList<Integer>(new HashSet<Integer>(pixels));

    pixelFreq = new int[duplicateRemoved.size()];
    pixelArr = new int[duplicateRemoved.size()];

    for (int i = 0; i < pixelArr.length; i++) {
        int p = duplicateRemoved.get(i);
        pixelArr[i] = p;
        // Note: this rescans the whole pixel list for every distinct value
        pixelFreq[i] = Collections.frequency(pixels, p);
    }
    // Build tree...
}
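Since `Collections.frequency` rescans the whole sample list once per distinct value, counting is O(n * k). A single pass with a `HashMap` gives the same counts in O(n). A minimal sketch (`countFrequencies` is a hypothetical helper, not part of the original class):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FrequencyCount {
    // Counts occurrences of each sample value in one pass.
    static Map<Integer, Integer> countFrequencies(List<Integer> samples) {
        Map<Integer, Integer> freq = new HashMap<>();
        for (int s : samples) {
            freq.merge(s, 1, Integer::sum); // insert 1 or increment existing count
        }
        return freq;
    }

    public static void main(String[] args) {
        List<Integer> samples = List.of(255, 255, 0, 255, 128);
        Map<Integer, Integer> freq = countFrequencies(samples);
        System.out.println(freq.get(255)); // 3
        System.out.println(freq.get(0));   // 1
    }
}
```

The map's key set also replaces the duplicate-removal step, so the `HashSet` round-trip is no longer needed.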

So I tried to encode this picture: (image omitted: just black and white)

And the encoding results gave me this:

color   Freq    Code   bits
'0' -- 4410 -- '0010'--17640
'1' -- 1185 -- '001001'--7110
'2' -- 858 -- '00100'--4290
'3' -- 591 -- '00100101'--4728
'4' -- 315 -- '0010010101'--3150
'5' -- 180 -- '0010010101010'--2340
'6' -- 75 -- '0010001010010'--975
'7' -- 30 -- '00100010100100'--420
'8' -- 9 -- '0010001010010000'--144
'246' -- 3 -- '001000101001000'--45
'247' -- 15 -- '001000101001000'--225
'248' -- 84 -- '0010001010'--840
'249' -- 138 -- '001001010101'--1656
'250' -- 273 -- '0010001010'--2730
'251' -- 480 -- '0010001'--3360
'252' -- 1227 -- '0001'--4908
'253' -- 1749 -- '00010'--8745
'254' -- 2475 -- '00'--4950
'255' -- 1077657 -- '0'--1077657
Original -- Compressed -- Reduction -- Percent
8734032 -- 1145913 -- 7588119 -- 86.879906%

Which seems reasonable: white occurs the most and gets the smallest code, while black gets a slightly larger one. As you can see, the total number of bits ended up being 1,145,913. So we calculate the difference: 8,734,032 - 1,145,913 = 7,588,119 bits, which is about 0.9 MB saved out of roughly 1 MB, an ~87% reduction (matching the percentage in the table).
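The summary row can be re-derived from the totals in the table; a minimal sketch of that check (class name `Reduction` is made up for illustration):

```java
public class Reduction {
    public static void main(String[] args) {
        long originalBits = 8_734_032L;   // 786 * 463 * 24
        long compressedBits = 1_145_913L; // sum of freq * code length over the table
        long saved = originalBits - compressedBits;
        double percent = 100.0 * saved / originalBits;
        System.out.println(saved);              // 7588119
        System.out.printf("%.6f%%%n", percent); // matches the ~86.88% in the table
    }
}
```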

So my question is, am I thinking about this right? I mean counting all the color values for each pixel?


There are 0 answers