I'm working on Okasaki's Purely Functional Data Structures and trying to build F# implementations of things. I'm also going through the exercises listed in the book (some are pretty challenging). Well I'm stuck on exercise 3.4 which calls for modifying the merge function of the WeightBiasedLeftistHeap such that it executes in a single pass as opposed to the original 2 pass implementation.
I haven't been able to figure out how to do this yet and was hoping for some suggestions. There was another post here on SO where a guy does it in SML by pretty much inlining the makeT function. I started out going this route (in the commented section 3.4 First Try. But abandoned that approach because I thought that this really wasn't executing in a single pass (it still goes 'till reaching a leaf then unwinds and rebuilds the tree). Am I wrong in interpreting that as still being a two pass merge?
Here is a link to my complete implementation of WeightBiasedLeftistHeap.
Here are my failed attempts to do this in F#:
type Heap<'a> =
| E
| T of int * 'a * Heap<'a> * Heap<'a>
module WeightBiasedLeftistHeap =
exception EmptyException
let weight h =
match h with
| E -> 0
| T(w, _,_,_) -> w
let makeT x a b =
let weightA = weight a
let weightB = weight b
if weightA >= weightB then
T(weightA + weightB + 1, x, a, b)
else
T(weightA + weightB + 1, x, b, a)
// excercise 3.4 first try
// let rec merge3_4 l r =
// match l,r with
// | l,E -> l
// | E,r -> r
// | T(_, lx, la, lb) as lh, (T(_, rx, ra, rb) as rh) ->
// if lx <= rx then
// let right = merge3_4 lb rh
// let weightA = weight la
// let weightB = weight right
//
// if weightA >= weightB then
// T(weightA + weightB + 1, lx, la, right)
// else
// T(weightA + weightB + 1, lx, right, la)
// else
// let right = merge3_4 lh rb
// let weightA = weight ra
// let weightB = weight right
//
// if weightA >= weightB then
// T(weightA + weightB + 1, rx, ra, right)
// else
// T(weightA + weightB + 1, rx, right, ra)
// excercise 3.4 second try (fail!)
// this doesn't work, I couldn't figure out how to do this in a single pass
let merge3_4 l r =
let rec merge' l r value leftChild =
match l,r with
| l,E -> makeT value leftChild l
| E,r -> makeT value leftChild r
| T(_, lx, la, lb) as lh, (T(_, rx, ra, rb) as rh) ->
if lx <= rx then
merge' lb rh lx la //(fun h -> makeT(lx, la, h))
else
merge' lh rb rx ra //(fun h -> makeT(rx, ra, h))
match l, r with
| l, E -> l
| E, r -> r
| T(_, lx, la, lb) as lh, (T(_, rx, ra, rb) as rh) ->
let lf = fun h -> makeT(lx, la, h)
if lx <= rx then
merge' lb rh lx la // (fun h -> makeT(lx, la, h))
else
merge' lh rb rx ra // (fun h -> makeT(rx, ra, h))
let rec merge l r =
match l,r with
| l,E -> l
| E,r -> r
| T(_, lx, la, lb) as lh, (T(_, rx, ra, rb) as rh) ->
if lx <= rx then
makeT lx la (merge lb rh)
else
makeT rx ra (merge lh rb)
let insert3_4 x h =
merge3_4 (T(1,x,E,E)) h
The first question is: what constitutes a "one-pass" algorithm? Something that could naturally be implemented as a single top-down loop would qualify. In contrast, recursion--compiled naively--normally has two passes, one on the way down and one on the way back up. Tail recursion can easily be compiled into a loop, and usually is in functional languages. Tail recursion modulo cons is a similar, albeit less common, optimization. But, even if your compiler doesn't support tail recursion modulo cons, you can easily convert such an implementation into a loop by hand.
Tail recursion modulo cons is similar to ordinary tail recursion, except that the tail call is wrapped in a constructor, which can be allocated and partially filled in before the recursive call. In this case, you would want the return expressions to be something like
T (1+size(a)+size(b)+size(c),x,a,merge(b,c))
. The key insight required here (as mentioned in the edit on the other SO thread) is that you don't need to perform the merge to know how big the result it is going to be, and therefore which side of the new tree it should go on. This is because the size ofmerge(b,c)
will always besize(b)+size(c)
, which can be calculated outside the merge.Notice that the original
rank
function for ordinary leftist heaps does not share this property, and so cannot be optimized in this fashion.Essentially, then, you inline the two calls to makeT and also convert the calls of the form
size(merge(b,c))
tosize(b)+size(c)
.Once you make this change, the resulting function is significantly lazier than the original, because it can return the root of the result before evaluating the recursive merge.
Similarly, in a concurrent environment involving locks and mutation, the new implementation could support significantly more concurrency by acquiring and releasing locks for each node along the way, rather than locking the entire tree. (Of course, this would only make sense for very lightweight locks.)