Faster SumSquareDifference in Haskell

Question

Faster SumSquareDifference in Haskell

133 views Asked by CifarettoWiggum At 07 October 2020 at 22:46

I am implementing a fractal image compression algorithm of binary images in Haskell. For this purpose i have to find to a given range block (a sub-image) the closest image in a so called domain pool, a list of lists of images. I am comparing images by calculating the sum square difference of both their pixel values.

I use the Haskell Image Processing (HIP) library for reading and writing images.

compress :: Image VS X Bit -> Int -> [(Int, Int)]
compress img blockSize = zip dIndices tIndices
    where rImg = img
          dImg = downsample2 rImg
          rBlocks = (toBlocks rImg blockSize) :: [Image VS X Bit]
          dBlocks = (toBlocks dImg blockSize) :: [Image VS X Bit]
          dPool = (createDPool dBlocks) :: [[Image VS X Bit]]
          distanceLists = map (\x -> (map.map) (distance x) dPool) rBlocks
          dIndices = map (fst . getMinIndices) distanceLists
          tIndices = map (snd . getMinIndices) distanceLists


distance :: Image VS X Bit -> Image VS X Bit-> Int
distance x y = sumSquareDifference (toBinList x) (toBinList y)
    where toBinList = map (toNum . extractBitOfPixel) . concat . toLists

toLists :: MArray arr cs e => Image arr cs e -> [[Pixel cs e]]
toLists img = [[index img (i, j) | j <- [0..cols img -1]] | i <- [0.. rows img -1]]

extractBitOfPixel :: Pixel X Bit -> Bit
extractBitOfPixel (PixelX b) = b

sumSquareDifference :: [Int] -> [Int] -> Int
sumSquareDifference a b = sum $ zipWith (\x y -> (x-y)^2) a b

The performance of this code is really bad. Compressing a 256x256 image with a block size of 2 takes around 5 minutes despite compiling with -O2. Profiling shows me that most of the runtime is spent in the function distance, especially in sumSquareDifference, but also in toLists and toBinList:

       binaryCompressionSimple +RTS -p -RTS

    total time  =     1430.89 secs   (1430893 ticks @ 1000 us, 1 processor)
    total alloc = 609,573,757,744 bytes  (excludes profiling overheads)

COST CENTRE               MODULE    SRC                                        %time %alloc

sumSquareDifference       Main      binaryCompressionSimple.hs:87:1-63          30.9   28.3
toLists                   Main      binaryCompressionSimple.hs:66:1-90          20.3   47.0
distance.toBinList        Main      binaryCompressionSimple.hs:74:11-79         10.9   15.1
main                      Main      binaryCompressionSimple.hs:(14,1)-(24,21)    7.3    0.0
compress                  Main      binaryCompressionSimple.hs:(28,1)-(36,60)    6.9    0.0
distance                  Main      binaryCompressionSimple.hs:(71,1)-(74,79)    5.7    0.9
compress.distanceLists.\  Main      binaryCompressionSimple.hs:34:38-65          5.2    4.4
compress.distanceLists    Main      binaryCompressionSimple.hs:34:11-74          2.8    0.0
main.\                    Main      binaryCompressionSimple.hs:20:72-128         2.7    0.0
getMinIndices.getMinIndex Main      binaryCompressionSimple.hs:116:11-53         2.7    1.8
sumSquareDifference.\     Main      binaryCompressionSimple.hs:87:52-58          2.7    2.5

Is there a way to improve performance?

A block size of 2 means comparing 16384 range blocks each with 131072 images of the domain pool, so sumSquareDifference will be called (16384*131072=)2147483648 times and calculate each time the sum square difference of two lists with length=4. I realize this is a large number but shouldn't the code be faster anyway (lazy evaluating of lists)? Is this a Haskell problem or an algorithm problem?

Edit:

I was able to at least improve the performance by a third by using:

distance :: Image VS X Bit -> Image VS X Bit-> Int
distance x y
     | x == y = 0
     | otherwise = sumSquareDifference (toBinList x) (toBinList y)
    where toBinList = map (toNum . extractBitOfPixel) . concat . inlinedToLists

Edit 2:

I was able to increase the performance enormously by creating dPool with the function genDistanceList, which stops the calculation as soon as two identical images are found:

genDistanceList :: [[Image VS X Bit]] -> Image VS X Bit -> [[Int]]
genDistanceList dPool rBlock = nestedTakeWhileInclusive (/= 0) $ (map.map) (distance rBlock) dPool

Original Q&A

There are 1 answers

**Daniel Wagner** · Answer 1 · 2020-10-08T03:39:04+00:00

The absolute first thing to try is skipping the conversion to lists:

{-# INLINE numIndex #-}
numIndex :: Image VS X Bit -> (Int, Int) -> Int
numIndex img pos = toNum . extractBitOfPixel $ index img pos

distance :: Image VS X Bit -> Image VS X Bit -> Int
distance a b = sum
    [ (numIndex a pos - numIndex b pos)^2
    | i <- [0 .. cols a-1]
    , j <- [0 .. rows a-1]
    , let pos = (i, j)
    ]

Since you haven't provided us with a minimal reproducible example, it's impossible to tell what effect, if any, that would have. If you want better advice, provide better data.

EDIT

Looking through the haddocks for hip, I suspect the following will be even better still:

distance :: Image VS X Bit -> Image VS X Bit -> Int
distance a b = id
    . getX
    . fold (+)
    $ zipWith bitDistance a b

bitDistance :: Pixel X Bit -> Pixel X Bit -> Pixel X Int
bitDistance (PixelX a) (PixelX b) = PixelX (fromIntegral (a-b))
-- use (a-b)^2 when you switch to grayscale, but for Bit the squaring isn't needed

Here, the fold and zipWith are the ones provided by hip, not base.

TechQA.

Faster SumSquareDifference in Haskell

There are 1 answers

Related Questions in HASKELL

Related Questions in FRACTALS

Related Questions in HIP

Popular Questions

Popular Tags

Trending Questions