Purely Applicative Parser using Alternative

871 views Asked by At

In a previous post, a user offered an implementation of a purely applicative parser for Haskell (code originally from here). Below is the partial implementation of that parser:

{-# LANGUAGE Rank2Types #-}

import Control.Applicative (Alternative(..))
import Data.Foldable (asum, traverse_)

The type:

newtype Parser a = Parser {run :: forall f. Alternative f => (Char -> f ()) -> f a}

The instances:

instance Functor Parser where
    fmap f (Parser cont) = Parser $ \char -> f <$> cont char

instance Applicative Parser where
    pure a = Parser $ \char -> pure a
    (Parser contf) <*> (Parser cont) = Parser $ \char -> (contf char) <*> (cont char)

instance Alternative Parser where
    empty = Parser $ \char -> empty
    (Parser cont) <|> (Parser cont') = Parser $ \char -> (cont char) <|> (cont' char)
    some (Parser cont) = Parser $ \char -> some $ cont char
    many (Parser cont) = Parser $ \char -> many $ cont char

Some example parsers:

item = Parser $ \char -> asum $ map (\c -> c <$ char c) ['A'..'z']
digit = Parser $ \char -> asum $ map (\c -> c <$ char (head $ show c)) [0..9]
string s = Parser $ \char -> traverse_ char s

Unfortunately, I'm having a hard time trying to understand how I might use this parser implementation. In particular, I do not understand what Char -> f () should/could be and how I could use this to do simple parsing, e.g. to extra a digit out of an input string. I'd like a concrete example if possible. Could someone please shed some light?

2

There are 2 answers

0
Christopher King On

The Char -> f () part represents matching on a single character. Namely, if you do char 'c', it will match on 'c' and fail on everything else.

To use it, you can convert it to, say Parsec:

convert :: Parser a -> Parsec a
convert p = run p anyChar

p is essentially of the type forall f. Alternative f => (Char -> f ()) -> f a, which specializes to (Char -> Parsec ()) -> Parsec a. We pass in anyChar, and it will produce a Parsec a value by using anyChar and any Alternative operations.

Basically, a Parser a it is a function that, given away to match on a single character, and an Alternative instance, it will produce an Alternative value.

0
Cactus On

In forall f. Alternative f => (Char -> f ()) -> f a, the Char -> f () is something that you are provided with. Your mission, should you choose to accept it, is to then turn that into an f a using only these two bits:

  • The Char -> f () function (i.e. a way to parse a single character: if the next character matches the argument, the parsing succeeds; otherwise it doesn't.)
  • The Alternative instance of f

So how would you parse a single digit into an Int? It would have to be of the form

digit :: Parser Int
digit = Parser $ \parseChar -> _

In _, we have to create an f Int using the kit parseChar :: Char -> f () and Alternative f. We know how to parse a single '0' character: parseChar '0' succeds iff the next character is '0'. We can turn it into a value of Int via f's Functor instance, arriving at

digit0 :: Parser Int
digit0 = Parser $ \parseChar -> fmap (const 0) (parseChar '0')

But f is not just Functor, it is also Alternative, so we can write digit in long-form as

digit :: Parser Int
digit = Parser $ \parseChar -> fmap (const 0) (parseChar '0') <|>
                               fmap (const 1) (parseChar '1') <|>  
                               fmap (const 2) (parseChar '2') <|>  
                               fmap (const 3) (parseChar '3') <|>  
                               fmap (const 4) (parseChar '4') <|>  
                               fmap (const 5) (parseChar '5') <|>  
                               fmap (const 6) (parseChar '6') <|>  
                               fmap (const 7) (parseChar '7') <|>  
                               fmap (const 8) (parseChar '8') <|>  
                               fmap (const 9) (parseChar '9')

And from here, it is merely a matter of pedestrian Haskell programming to cut down on the cruft, arriving at something like

digit :: Parser Int
digit = Parser $ \parseChar -> asum [fmap (const d) (parseChar c) | d <- [0..9], let [c] = show d]

which we can further simplify by noting that fmap (const x) f can be written as x <$ f, giving

digit :: Parser Int
digit = Parser $ \parseChar -> asum [d <$ parseChar c | d <- [0..9], let [c] = show d]