How does the Haskell `do` notation know which value to take when it isn't defined by a return?

408 views Asked by At

I have this monadic object.

data Parser a = Parser (String -> Maybe (a, String))

instance Functor Parser where
  -- fmap :: (a -> b) -> Parser a -> Parser b
  fmap f (Parser pa) =  Parser $ \input -> case pa input of
                                             Nothing -> Nothing
                                             Just (a, rest) -> Just (f a, rest)

instance Applicative Parser where
  pure = return
  (<*>) = ap

instance Monad Parser where
  --return :: a -> Parser a
  return a =  Parser $ \input -> Just (a, input)

  --(>>=) :: Parser a -> (a -> Parser b) -> Parser b
  (Parser pa) >>= f  = Parser $ \input -> case pa input of
                                            Nothing -> Nothing
                                            Just (a,rest) -> parse (f a) rest

And I have this definition of an item which I am told "reads in a character" but I don't really see any reading going on.

item :: Parser Char
item = Parser $ \ input -> case input of ""    -> Nothing
                                         (h:t) -> Just (h, t)

But ok, fine, maybe I should just relax about how literal to take the word "read" and jibe with it. Moving on, I have

failParse :: Parser a
failParse = Parser $ \ input -> Nothing

sat :: (Char -> Bool) -> Parser Char
sat p = do c <- item
           if p c
           then return c
           else failParse

And this is where I get pretty confused. What is getting stored in the variable c? Since item is a Parser with parameter Char, my first guess is that c is storing such an object. But after a second of thought I know that's not now the do notation works, you don't get the monad, you get the contents of the monad. Great, but then that tells me c is then the function

\ input -> case input of ""    -> Nothing
                         (h:t) -> Just (h, t)

But clearly that's wrong since the next line of the definition of sat treats c like a character. Not only is that not what I expect, but it's about three levels of structure down from what I expected! It's not the function, it's not the Maybe object, and it's not the tuple, but it's the left coordinate of the Just tuple buried inside the function! How is that little character working all that way outside? What is instructing the <- to extract this part of the monad?

2

There are 2 answers

4
assembly.jc On BEST ANSWER

As comment mentioned, <- just be do notation syntax sugar and equivalent to:

item >>= (\c->if p c 
              then return c 
              else failParse)

Okay, let see what is c? consider the definition of (>>=)

(>>=) :: Parser a -> (a -> Parser b) -> Parser b

or more readable way:

Parser a >>= (a -> Parser b)

And Now, matches it with above expression item >>= (\c->if p c then return c else failParse) give:

Parer a = item

and

(a->Parser b) = (\c->if p c then return c else failParse) 

and item has type:

item :: Parser Char

so, we can now replace a in (>>=) by Char, gives

Parser Char >>= (Char -> Parser b)

and now \c->if p c then return c else failParse also have type: (Char -> Parser b)

and so c is a Char, and the whole expression can be extended to:

sat p =
item >>= (\c->...) = 
Parser pa >= (\c->...) = Parser $ \input -> case pa input of
                                            Nothing -> Nothing
                                            Just (a,rest) -> parse (f a) rest
                         where f c =  if p c
                                      then return c
                                      else failParse
                               pa input = case input of ""   -> Nothing
                                                       (h:t) -> Just (h, t)
0
Will Ness On

TL;DR: In general, by Monad laws,

do { item }

is the same as

do { c <- item
   ; return c
   }

so it is defined by a return, in a sense. Details follow.


It does take one character out from the input string which is being "read", so in this sense it "reads" that character:

item :: Parser                                   Char
item = Parser $ \ input ->          -- input :: [Char]   
             case input of { ""    -> Nothing
                           ; (h:t) -> Just (h, t)  -- (h:t) :: [Char]
                           }             -- h :: Char    t  :: [Char]

and I bet there's a definition

parse (Parser pa) input = pa input

defined there somewhere; so

parse item input = case input of { ""    -> Nothing 
                                 ; (h:t) -> Just (h, t) }

Next, what does (>>=) mean? It means that

parse (Parser pa >>= f) input = case (parse (Parser pa) input) of
                     Nothing             -> Nothing
                     Just (a, leftovers) -> parse (f a) leftovers

i.e.

parse (item      >>= f) input
      = case (parse  item  input) of
                     Nothing             -> Nothing
                     Just (a, leftovers) -> parse (f a) leftovers
      = case (case input of { "" -> Nothing 
                            ; (h:t) -> Just (h, t) 
                            }) of
                     Nothing             -> Nothing
                     Just (a, leftovers) -> parse (f a) leftovers
      = case input of 
           ""    ->                         Nothing 
           (h:t) -> case Just (h, t) of {
                         Just (a, leftovers) -> parse (f a) leftovers }
      = case input of 
           ""    -> Nothing 
           (h:t) -> parse (f h) t

Now,

-- sat p: a "satisfies `p`" parser 
sat :: (Char -> Bool) -> Parser Char
sat p = do { c <- item           -- sat p :: Parser Char
           ; if p c              -- item  :: Parser Char,  c :: Char
                then return c    -- return c  :: Parser Char
                else failParse   -- failParse :: Parser Char
           }
      = item >>= (\ c ->
                    if p c then return c else failParse)

(by unraveling the do syntax), and so

parse (sat p) input 
 = parse (item >>= (\ c ->
                    if p c then return c else failParse)) input
   -- parse (item >>= f) input
   --  = case input of { "" -> Nothing ; (h:t) -> parse (f h) t }
 = case input of 
       ""    -> Nothing 
       (h:t) -> parse ((\ c -> if p c then (return c)
                                       else failParse) h) t
 = case input of 
       ""    -> Nothing 
       (c:t) -> parse (if p c then (return c) 
                               else failParse) t
 = case input of 
       ""    -> Nothing 
       (c:t) -> if p c then parse (return c) t
                               else parse failParse t
 = case input of 
       ""    -> Nothing 
       (c:t) -> if p c then Just (c, t)
                               else Nothing

Now the meaning of sat p should be clear: for c produced by item (which is the first character in the input, if input is non-empty), if p c holds, c is accepted and the parse succeeds, otherwise the parse fails:

sat p = for c from item:                 -- do { c <- item 
           if p c                        --    ; if p c
                then return c            --        then return c 
                else failParse           --        else failParse }