I am trying to merge two list in parallel. I have two sorted lists [(i, j, val)]
. Lists are sorted on j
and for same j
, sorted on i
. If the two lists contain the same (i, j)
then their
values are added and combined into one, e.g. if the first list contains (i, j, val_1)
and the second list contains (i, j, val_2)
then combining two will result (i, j, val_1 + val_2)
.
Merging is highly sequential and after searching, I found this paper. The idea from this paper is to use binary search to get the rank of the elements in the final list. Let's say we are at i
th position in the first list so we have (i - 1)
elements smaller than the
current element in first list and perform binary search for this element's position in the second list (say this position is j
). So the position of our current element in final list will be i + j - 1
(i - 1 + j - 1 + 1
). I wrote a Haskell code using dph-par for this but I am kind of stuck with update. I have two list
l_1 = [ (1, 1, 1), (2, 1, 1), (4, 1, 1), (1, 4, 1), (2, 4, 1), (4, 4, 1) ]
l_2 = [ (1, 1, 1), (3, 1, 1), (4, 1, 1), (1, 4, 1), (3, 4, 1), (4, 4, 1) ]
and after updating these two lists , we should have
l_3 = [ (1, 1, 2), (2, 1, 1), (3, 1, 1), (4, 1, 2), (1, 4, 2), (2, 4, 2), (3, 4, 1), (4, 4, 2) ]
Bsearch.hs
{-# LANGUAGE ParallelArrays #-}
{-# OPTIONS_GHC -fvectorise #-}
module Bsearch ( interfaceSparse ) where
import qualified Data.Array.Parallel as P
import Data.Array.Parallel.PArray
import qualified Data.Array.Parallel.Prelude as Pre
import qualified Data.Array.Parallel.Prelude.Int as I
import qualified Data.Array.Parallel.Prelude.Double as D
bSearch :: ( I.Int , I.Int , D.Double ) -> [: ( I.Int , I.Int ,D.Double ) :] -> I.Int
bSearch elem@( i , j , val ) xs = ret where
ret = helpBsearch 0 len where
len = P.lengthP xs
helpBsearch :: I.Int -> I.Int -> I.Int
helpBsearch lo hi
| lo I.>= hi = lo
| cond = helpBsearch ( mid I.+ 1 ) hi
| otherwise = helpBsearch lo mid
where mid = I.div ( lo I.+ hi ) 2
( i' , j' , val' ) = xs P.!: mid
cond = case () of
_| j' I.< j Pre.|| ( j I.== j' Pre.&& i' I.<i ) -> True
| otherwise -> False
bSearchFun :: [: ( I.Int , I.Int , D.Double ) :] -> [: ( I.Int ,I.Int , D.Double ) :] -> [:I.Int :]
bSearchFun xs ys = P.mapP ( \( x , y ) -> x I.+ y ) ( P.indexedP ( P.mapP ( \x -> bSearch x ys ) xs ) )
bSearchMain :: [: ( I.Int , I.Int , D.Double ) :] -> [: ( I.Int , I.Int , D.Double ) :] -> [: ( I.Int , ( I.Int , I.Int , D.Double ) ) :]
bSearchMain xs ys = l_1 where --here change l_2 for second list
lst = [: bSearchFun xs ys , bSearchFun ys xs :]
first = lst P.!: 0
second = lst P.!: 1
l_1 = P.zipP first xs
l_2 = P.zipP second ys
interfaceSparse :: PArray ( Int , Int , Double ) -> PArray ( Int ,Int , Double ) -> PArray ( Int , ( Int , Int , Double ) )
{-# NOINLINE interfaceSparse #-}
interfaceSparse xs ys = P.toPArrayP ( bSearchMain ( P.fromPArrayPxs ) ( P.fromPArrayP ys ) )
Main.hs
module Main where
import Bsearch
import qualified Data.Array.Parallel.PArray as P
import Data.List
main = do
let
l_1 = P.fromList $ ( [ ( 1 , 1 , 1 ) , ( 2 , 1 , 1) , ( 4 , 1 , 1 ) , ( 1 , 4 , 1 ) ,( 2 , 4 , 1 ) , ( 4 ,4 , 1 ) ] :: [ ( Int ,Int , Double ) ] )
l_2 = P.fromList $ ( [ ( 1 , 1 , 1 ) , ( 3 , 1 , 1 ) , ( 4 , 1 , 1) , ( 1 , 4 , 1 ) , ( 3 , 4 , 1 ) , ( 4 , 4 , 1 ) ] :: [ ( Int , Int , Double )] )
e = interfaceSparse l_1 l_2
print e
[ntro@localhost parBsearch]$ ghc -c -Odph -fdph-par -fforce-recomp Bsearch.hs
[ntro@localhost parBsearch]$ ghc -c -Odph -fdph-par -fforce-recomp Main.hs
[ntro@localhost parBsearch]$ ghc -o Bsearch -threaded -rtsopts -fdph-par Main.o Bsearch.o
[ntro@localhost parBsearch]$ ./Bsearch --first list
fromList<PArray> [(0,(1,1,1.0)),(2,(2,1,1.0)),(4,(4,1,1.0)),(6,(1,4,1.0)),(8,(2,4,1.0)),(10 (4,4,1.0))]
[ntro@localhost parBsearch]$ ./Bsearch -- second list
fromList<PArray> [(0,(1,1,1.0)),(3,(3,1,1.0)),(4,(4,1,1.0)),(6,(1,4,1.0)),(9,(3,4,1.0)),(10,(4,4,1.0))]
Could some one please help me with update. I am not sure but this algorithm involves lot of data movement so kindly suggest me something better for this purpose.
I am unfamiliar with the haskell language, but when I merge sorted lists, I use bitonic sort. This is a great algorithm, and highly parallel in design. The only restriction is that you merge lists of size 2^n. I get around this restriction by padding the short lists with values higher than the known values in the lists, so they accumulate together and can be ignored. I have such huge lists to sort that the power of 2 limitation is easy to accommodate.
http://en.wikipedia.org/wiki/Bitonic_sorter http://en.wikipedia.org/wiki/Odd-even_mergesort