STArray and stack overflow

602 views Asked by At

I am struggling to understand why the following attempts to find a minimum element in STArray lead to stack space overflow when compiled with ghc (7.4.1, regardless of -O level), but work fine in ghci:

import Control.Monad
import Control.Monad.ST
import Control.Applicative
import Data.Array.ST

n = 1000 :: Int

minElem = runST $ do
  arr <- newArray ((1,1),(n,n)) 0 :: ST s (STArray s (Int,Int) Int)
  let ixs = [(i,j) | i <- [1..n], j <- [1..n]]
  forM_ ixs $ \(i,j) -> writeArray arr (i,j) (i*j `mod` 7927)
--  readArray arr (34,56)  -- this works OK
--  findMin1 arr           -- stackoverflows when compiled
  findMin2 arr           -- stackoverflows when compiled

findMin1 arr = do
  es <- getElems arr
  return $ minimum es

findMin2 arr = do
  e11 <- readArray arr (1,1) 
  foldM (\m ij -> min m <$> readArray arr ij) e11 ixs
      where ixs = [(i,j) | i <- [1..n], j <- [1..n]]

main = print minElem

Note: switching to STUArray or ST.Lazy doesn't seem to have any effect.

The main question though: What would be the proper way to implement such "fold-like" operation over big STArray while inside ST?

3

There are 3 answers

2
Daniel Fischer On BEST ANSWER

The big problem in findMin1 is getElems:

getElems :: (MArray a e m, Ix i) => a i e -> m [e]
getElems marr = do
  (_l, _u) <- getBounds marr      -- Hmm, why is that there?
  n <- getNumElements marr
  sequence [unsafeRead marr i | i <- [0 .. n - 1]]

Using sequence on a long list is a common cause for stack overflows in monads whose (>>=) isn't lazy enough, since

sequence ms = foldr k (return []) ms
        where
          k m m' = do { x <- m; xs <- m'; return (x:xs) }

then it has to build a thunk of size proportional to the length of the list. getElems would work with the Control.Monad.ST.Lazy, but then the filling of the array with

forM_ ixs $ \(i,j) -> writeArray arr (i,j) (i*j `mod` 7927)

creates a huge thunk that overflows the stack. For the strict ST variant, you need to replace getElems with something that builds the list without sequence or - much better - compute the minimum without creating a list of elements at all. For the lazy ST variant, you need to ensure that the filling of the array doesn't build a huge thunk e.g. by forcing the result of the writeArray calls

let fill i j
      | i > n     = return ()
      | j > n     = fill (i+1) 1
      | otherwise = do
          () <- writeArray arr (i,j) $ (i*j `mod` 7927)
          fill i (j+1)
() <- fill 1 1

The problem in findMin2 is that

foldM (\m ij -> min m <$> readArray arr ij) e11 ixs

is lazy in m, so it builds a huge thunk to compute the minimum. You can easily fix that by using seq (or a bang-pattern) to make it strict in m.

The main question though: What would be the proper way to implement such "fold-like" operation over big STArray while inside ST?

Usually, you'll use the strict ST variant (and for types like Int, you should almost certainly use STUArrays instead of STArrays). Then the most important rule is that your functions be strict enough. The structure of findMin2 is okay, the implementation is just too lazy.

If performance matters, you will often have to avoid the generic higher order functions like foldM and write your own loops to avoid allocating lists and control strictness exactly as the problem at hand requires.

1
none On

The problem is that minimum is a non-strict fold, so it is causing a thunk buildup. Use (foldl' min).

Now we add a bunch of verbiage to ignore because stackoverflow has turned worthless and no longer allows posting a straightforward answer.

5
ertes On

That's probably a result of getElems being a bad idea. In this case an array is a bad idea altogether:

minimum (zipWith (\x y -> (x, y, mod (x*y) 7927)) [1..1000] [1..1000])

This one gives you the answer right away: (1, 1, 1).

If you want to use an array anyway I recommend converting the array to an Array or UArray first and then using elems or assocs on that one. This has no additional cost, if you do it using runSTArray or runSTUArray.