Combine ST and List monads in Haskell

511 views Asked by At

Using the StateT monad transformer, I can create the type StateT s [] a, which is isomorphic to s -> [(a, s)]. Now I would prefer to use the STT monad transformer instead, as I would like to have multiple mutable variables of different types, and would like to be able to instantiate them at will, depending on the results of earlier computations.

However, the linked documentation for STT mentions explicitly:

This monad transformer should not be used with monads that can contain multiple answers, like the list monad. The reason is that the state token will be duplicated across the different answers and this causes Bad Things to happen (such as loss of referential transparency). Safe monads include the monads State, Reader, Writer, Maybe and combinations of their corresponding monad transformers.

So what are my options?

To be entirely clear:

  • What I'm after, is non-determinism. I want to be able to fork my computation, giving each branch its own copy of the entire state.
  • I don't mind much for parallelism, as performance is not my greatest concern.
  • What I'm not after is concurrency: different branches of computation should not share mutable variables; rather, they should all work on their own copy of the original mutable variable.

EDIT: (Edit to edit: the following counterexample is invalid, as ListT should not be applied to the non-commutative monads ST and State.) I've come to realize that an STT monad transformer that behaves along the lines of StateT is inherently unsafe. With it, we could build a type STT sloc (ListT (ST sglob)) a. Here, sglob is the name of the global state, while sloc is the name of the local state.* Now we can use the global state to exchange local state references between threads, thus potentially obtaining references to uninitialized variables.

*For comparison, the corresponding StateT construction is StateT sloc (ListT (State sglob)) a, which is isomorphic to sloc -> sglob -> ([(a, sloc)], sglob).

2

There are 2 answers

7
leftaroundabout On BEST ANSWER

You won't get around StateT, because for this nondeterminism stuff the compiler needs to know always which “variables” need to be branched out. This is not possible when the variables could be anywhere lurking as STRefs.

To still get “multiple variables of different types”, you'll need to pack them in a suitable record and use that as the single actual state variable. Seems awkward to deal with such a state object? Well, it's not that bad when using lenses to access the “individual variables”.

{-# LANGUAGE TemplateHaskell #-}

import Control.Lens
import Data.Monoid

import Control.Monad.Trans.State
import Control.Monad.ListT
import Control.Monad.Trans.Class
import Control.Monad.IO.Class

data Stobjs = Stobjs {
    _x :: Int
  , _y :: String
  }

makeLenses ''Stobjs

main = runListT . (`runStateT`Stobjs 10 "") $ do
   δx <- lift $ return 1 <> return 2 <> return 3
   xnow <- x <+= δx
   y .= show xnow
   if xnow > 11 then liftIO . putStrLn =<< use y
                else lift mempty

(outputs 12).

“Able to instantiate them at will” is a bit more tricky, because adding variables is only possible through changing the state object, which means you'll not really be in the same monad anymore. Lens has the notion of zooming which could be used – splitting up the state object into “scopes” and using the computations where only some of the variables are defined as zoomed in onto that scope.

To make this really convenient you'd need records that can be extended at will. I really liked Nikita Volkovs record library approach, this doesn't seem to have been advanced any further lately. Vinyl also goes in that direction, but I haven't looked into it much.

In the future, we will have the OverloadedRecordFields extension that will help with this kind of stuff.

3
leftaroundabout On

This answer not recommended, see the other one.


To expand on your idea to wrap up a StateT with a weakly-typed map of variables, this would look something like the following:

{-# LANGUAGE GADTs #-}

import Unsafe.Coerce
import Data.IntMap

data WeakTyped where
   WeakTyped :: a -> WeakTyped

newtype STT' m a = STT' { weakTypState :: StateT (IntMap WeakTyped) m a }
  deriving (Functor, Applicative, Monad)

newtype STT'Ref a = STT'Ref { mapIndex :: Int }

newSTTRef :: Monad m => a -> STT' m (STT'Ref a)
newSTTRef x = STT' $ do
   i <- (+1) . maximum . keys <$> get
   modify $ insert i x
   return $ STT'Ref i

readSTTRef :: Monad m => STT'Ref a -> STT' m a
readSTTRef (STT'Ref i) = STT' $ do
   unsafeCoerce . (!i) <$> get

I'm not convinced that this would actually be smart though. These STT'Refs aren't properly handled by the Haskell runtime, in particular state variables won't be garbage collected. Thus, if you run an action that uses newSTTRef in a loop, it will actually grow the IntMap in each iteration without ever freeing up the variables that have already gone “out of scope” (i.e. don't have any references pointing to them).

It might be possible to add an actual garbage collector to all of this, but it would make it very much more complicated.