Implementing try (look-ahead) and untilStop with parser combinators

736 views Asked by At

I am following this tutorial for implementing Parser Combinators (a la parsec) in Haskell. I implemented everything of the NanoParsec mentioned throught this post.

For some hours now, I am struggeling to implement

-- try p. If p fails continue without consuming anything
try :: Parser a -> Parser a
try p = ...

-- Parser for everything, until the character-sequence stop appears. 
-- If stop does not appear at all: fail
untilStop :: String -> Parser String
untilStop stop = ...

My best attempt to implement untilStop looks like somewhat like this and does not quite work

untilStop :: String -> Parser String
untilStop (c : cs) = do
  s <- some $ satisfy (/= d)
  string (c : cs) <|> do
    i <- item
    untilStop (d : ds)
  -- maybe use msum from MonadPlus to combine?

I could not figure out how to combine s, i and the recursive call without failing everthing because of string are not getting everything together.

I think once I have try, untilStop should be straightforward. Can someone point me in the right direction or implement it (try) for me?

Right now I am still learning about Monads, Applicative and related stuff so trying to understand the sourcecode of parsec was impossible for me.

1

There are 1 answers

2
Euge On BEST ANSWER

As I said in a comment, I think you don't need to have a Parsec-like try.

For the untilStop, check this:

untilStop :: String -> Parser String
untilStop [] = everything
untilStop (c:cs) = item >>= fun
  where fun i = do { s <- untilStop cs;
                     if i == c && s == "" then return "" else failure } <|> do
                     s <- untilStop (c : cs)
                     return (i : s)

First, if the stop string is empty, you parse everything. Where everything is:

everything :: Parser String
everything = Parser (\inp -> [(inp,"")])

Otherwise, if it is of the form c:cs, then parse a character i and consider two cases:

  • The stop string is right in the front of the parsing stream (because c == i and parsing with the rest of the string cs gives an empty result), then return "". Or,

  • It is somewhere in the stream, so you look for it further.

Note that the <|> operator is used to backtrack. If untilStop cs fails to be what we want, we need to reparse, using untilStop (c:cs) instead.