I am confused about why the Repa function `computeP` packs its result in a monad. It has the following type signature:
```haskell
computeP :: (Load r1 sh e, Target r2 e, Source r2 e, Monad m)
         => Array r1 sh e -> m (Array r2 sh e)
```
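For context, here is a minimal sketch of how I would typically call it, with `IO` as the monad (`doubleP` is just a made-up example):

```haskell
import Data.Array.Repa as R

-- Made-up example: double every element of an unboxed vector,
-- running the parallel computation in IO.
doubleP :: Array U DIM1 Double -> IO (Array U DIM1 Double)
doubleP v = computeP (R.map (* 2) v)
```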
In this tutorial it says:

> The reason for this is that monads give a well defined notion of sequence and thus `computeP` enforces completion of parallel evaluation in a particular point of monadic computations.
Likewise, this answer on Stack Overflow states that:

> The reason why parallel computation in Repa must be monadic has to do partially with laziness, but mostly with Repa's inability to deal with nested parallelism. Sequential property of a Monad solves it for the most part[.]
Questions
- What does having this 'sequential property' mean exactly?
- How does a monad enforce this?
For the example of `computeP`: there is no constraint on which monad is used, so I could use the identity monad. Is it okay then to use the following function instead, which just unpacks the monad, or will this give unexpected results because it lacks this sequential property? If it is okay, is there even a need to use a monad at all?

```haskell
import Data.Functor.Identity
import Data.Array.Repa.Eval
import Data.Array.Repa

myComputeP :: (Load r1 sh e, Target r2 e, Source r2 e) => Array r1 sh e -> Array r2 sh e
myComputeP = runIdentity . computeP
```
Any help would be great.
This monad constraint is a heuristic trick. It helps disciplined users to avoid nested parallelism, but does nothing against malicious or clueless users.
Nested parallelism is the situation where, while computing some array in parallel, you end up having to compute another array in parallel. Repa does not support it (the reason is not important), so it tries to avoid it.
The type of `computeP` helps ensure that parallel computations are done in sequence with each other, but it is far from airtight; it is merely a "best effort" abstraction.

Actually, `computeP` is only expected to work with monads whose bind `(>>=)` is strict in its first argument, so that in `u >>= k`, the function `k` will be applied only after `u` has been evaluated. Then if you use `computeP` with such a monad, as in `computeP v >>= \w -> k w`, it is guaranteed that the vector `w` is evaluated before it gets passed to `k`, which may safely perform other `computeP` operations.

- Examples of strict monads: `IO`, strict `State`, `Maybe`, `[]`.
- Examples of lazy monads: `Identity`, lazy `State`, `Reader`. (Lazy monads can be made strict, but not the other way around. In particular, you can define a strict identity monad if you want to do only Repa computations.)
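As an aside, here is a minimal sketch of what such a strict identity monad could look like; the name `StrictIdentity` is made up for illustration, not something Repa ships:

```haskell
-- A strict identity monad (hypothetical): unlike Data.Functor.Identity,
-- its (>>=) evaluates the result of the previous action before
-- applying the continuation.
newtype StrictIdentity a = StrictIdentity { runStrictIdentity :: a }

instance Functor StrictIdentity where
  fmap f (StrictIdentity x) = StrictIdentity (f x)

instance Applicative StrictIdentity where
  pure = StrictIdentity
  StrictIdentity f <*> StrictIdentity x = StrictIdentity (f x)

instance Monad StrictIdentity where
  StrictIdentity x >>= k = x `seq` k x  -- force x before continuing
```

With this, chaining `computeP` operations in `StrictIdentity` (and unwrapping once at the end with `runStrictIdentity`) keeps the ordering guarantee that the lazy `Identity` loses.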
To prevent nested parallelism, the type of `computeP` intentionally makes it cumbersome to use within operations that are likely to be done in parallel, such as `map :: (a -> b) -> Array _ _ a -> Array _ _ b` and `fromFunction :: sh -> (sh -> a) -> Array _ _ a`, which take non-monadic functions. One can still explicitly unwrap `computeP`, for example with `runIdentity` as you noticed: you can shoot yourself in the foot if you want, but it's on you to load the gun, point it down, and pull the trigger.
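To make the foot-gun concrete, here is a deliberately bad sketch (my own, with a hypothetical `footGun` name) of what unwrapping makes possible: every element of the outer parallel pass launches an inner parallel pass.

```haskell
import Data.Array.Repa as R
import Data.Functor.Identity (runIdentity)

-- Deliberately bad: nested parallelism of the kind the monadic
-- type of computeP was trying to rule out.
footGun :: Array U DIM1 Int -> Array U DIM1 Int
footGun v = runIdentity . computeP $              -- outer parallel pass
  fromFunction (extent v) $ \ix ->
    let w = runIdentity (computeP (R.map (+ 1) v)) :: Array U DIM1 Int
    in  w ! ix                                    -- inner parallel pass, per element!
```

(If memory serves, repa 3 detects this at run time and emits a performance warning while degrading to sequential evaluation; in any case, the types no longer protect you.)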
Hopefully, that answers what is going on in Repa. What follows is a theoretical digression to answer this other question:

> What does having this 'sequential property' mean exactly? How does a monad enforce this?

Those quotations are quite handwavy. As I see it, the relation between "sequentiality" and "monads" is two-fold.
First, for many monads in Haskell, the definition of `(>>=)` naturally dictates an order of evaluation, typically because it immediately pattern-matches on the first argument. As explained earlier, that's what Repa relies on to enforce that `computeP` computations happen in sequence (and that's why it would break if you specialize it to `Identity`; it is not a strict monad). In the grand scheme of things, this is a rather minor detail of lazy evaluation, rather than anything proper to monads in general.

Second, one core idea of pure functional programming is first-class computations, with explicit definitions of effects and composition. In this case, the effect is the parallel evaluation of vectors, and the composition we care about is sequential composition. Monads provide a general model, or interface, for sequential composition. That's another part of the reason why monads help solve the problem of avoiding nested parallelism in Repa.
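To make the first point concrete, here is a tiny self-contained comparison (my own example) of a lazy bind versus a strict one:

```haskell
import Data.Functor.Identity (Identity (..))

-- Identity's (>>=) is lazy in its first argument: the undefined
-- action is never forced, so this evaluates to "ok".
lazyDemo :: String
lazyDemo = runIdentity (undefined >>= \_ -> Identity "ok")

-- Maybe's (>>=) pattern-matches on its first argument (Nothing or
-- Just ...), so it is strict in it: forcing this throws.
strictDemo :: Maybe String
strictDemo = undefined >>= \_ -> Just "ok"
```

The first demo is exactly the situation `runIdentity . computeP` creates: nothing at the bind forces the parallel computation to finish.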
The point is not that monads have an inherently sequential aspect, but that sequential composition is inherently monadic: if you try to list the general properties you would expect from anything worthy of the name "sequential composition", you're likely to end up with exactly the properties that together are called "monad"; that's one of the points of Moggi's seminal paper, "Notions of computation and monads".
"Monads" is not a magical concept, rather it's enormously general so that lots of things happen to be monads. After all, the main requirement is that there's an associative operation; that's a pretty reasonable assumption to make about sequential composition. (If, when you hear "associativity", you think "monoid" or "category", note that all of these concepts are unified under the umbrella of "monoid objects", so as far as "associativity" is concerned, they're all the same idea, just in different categories.)