Pure conduit for an impure but effectless computation


I am trying to do something like this:

processListIO :: [A] -> IO [B]
processListIO xs = bracket ini fin $ \s -> mapM (upd s) xs
  where
    ini :: IO (Ptr S)
    upd :: Ptr S -> A -> IO B
    fin :: Ptr S -> IO ()

Basically, this is a computation that iterates over a list, maps each element to something else, and uses an internal private state in the process. The specific ini, upd, and fin I have in mind come from a C library, but they are guaranteed to “behave well”: they just allocate fresh state, perform a computation whose only side effect is modifying that state, and then deallocate the state. I believe this means I can safely put unsafePerformIO in front and get a pure function:

processList :: [A] -> [B]
processList = unsafePerformIO . processListIO
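
For concreteness, here is a minimal sketch of that pattern using only base. The real ini, upd, and fin come from a C library that is not shown, so this sketch swaps in hypothetical stand-ins: an IORef Int plays the role of Ptr S, and the "computation" is a running sum. It only illustrates the shape of the wrapper under those assumptions.

```haskell
import Control.Exception (bracket)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Hypothetical stand-ins for the C functions: the private state is a
-- running sum held in an IORef instead of a Ptr S.
ini :: IO (IORef Int)
ini = newIORef 0

upd :: IORef Int -> Int -> IO Int
upd s x = do
  modifyIORef' s (+ x)  -- the only side effect: update the private state
  readIORef s

fin :: IORef Int -> IO ()
fin s = writeIORef s 0  -- stands in for deallocation

processListIO :: [Int] -> IO [Int]
processListIO xs = bracket ini fin $ \s -> mapM (upd s) xs

-- Every call allocates fresh state, and mapM in IO runs all the updates
-- before bracket releases the state, so nothing escapes.
processList :: [Int] -> [Int]
processList = unsafePerformIO . processListIO
```

With these stand-ins, processList behaves like scanl1 (+), and applying it twice to the same list gives the same result because no state is shared between calls.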

Now I would like to do the same, but with conduit (or, actually, any other streaming library). However, since the computation is essentially effectless, I would like my conduit to be pure:

processStream :: ConduitT A B Identity ()

or even better:

processStream :: forall m. Monad m => ConduitT A B m ()

(I suspect the latter might make no sense, because it seems that the trick works so well with simple lists only because the elements are pure.)

Ideally, I want to completely hide from the user the fact that the computation needs a state and makes foreign calls, and just pretend that it is merely something like a scanl (or mapAccum, as conduit calls it).

Is this possible? How do I do this with conduit (or some other streaming library)?

Answer by Li-yao Xia

However, since the computation is essentially effectless I would like my conduit to be pure:

processStream :: ConduitT A B Identity ()

You shouldn't do this because it breaks referential transparency. The conduit you want to construct would consume an A and produce a B right after, and repeat. From that conduit you can construct a function A -> B by restricting it to the first iteration, except that would not be pure, since calling that function again will make a new call to the stateful C library.

The purity of your first function processList comes from the fact that in a single application of that function you know all of the calls you are going to make to the C library. If you try to make it a conduit, you lose that control.

Anyway, unlike with pure versus impure functions, a pure conduit ConduitT A B Identity offers no benefit over an impure one, ConduitT A B M, which makes the required resources explicit.

Instead, I would suggest wrapping your C functions in an abstract monad to prevent users from capturing the library state:

-- Safe interface
newtype M a          -- abstract, internally defined as (Ptr S -> IO a)
consume :: A -> M B
runM :: M a -> a     -- safe if the only way to construct (M a) values is using 'consume' and the 'Monad' instance.
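
One way this interface could be filled in is sketched below, using only base. As an assumption for illustration, an IORef Int holding a running sum stands in for the C library's Ptr S, and the bodies of consume and runM are hypothetical. In a real library, M would be exported without its constructor so that consume and the Monad instance are the only ways to build an M value.

```haskell
import Control.Exception (bracket)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Abstract in the real library: a computation that reads the hidden state.
newtype M a = M (IORef Int -> IO a)

instance Functor M where
  fmap f (M g) = M (fmap f . g)

instance Applicative M where
  pure x = M (\_ -> pure x)
  M f <*> M g = M (\s -> f s <*> g s)

instance Monad M where
  M g >>= k = M (\s -> g s >>= \a -> let M h = k a in h s)

-- Hypothetical stand-in for the foreign call: add x to the running sum.
consume :: Int -> M Int
consume x = M $ \s -> do
  modifyIORef' s (+ x)
  readIORef s

-- Allocates fresh state per run; the release action is a no-op here
-- because an IORef needs no explicit deallocation.
runM :: M a -> a
runM (M g) = unsafePerformIO (bracket (newIORef 0) (\_ -> pure ()) g)

-- The original list function, recovered from the safe interface.
processList :: [Int] -> [Int]
processList = runM . mapM consume
```

Because the state reference never appears in any exported type, a user cannot hold on to it across two calls to runM. The same consume can then be lifted into a stream over M, for example with conduit's mapMC, and the whole pipeline passed to runM once.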

Then you can easily wrap consume in a ConduitT A B M () (use mapM from Data.Conduit.List, or mapMC from Data.Conduit.Combinators). As far as I can tell, there is no good reason to want that M to be Identity. Streaming libraries are designed to be usable in a way that is independent of the underlying monad (M or Identity).