Is there a Scala parsing solution that works on string tokens rather than characters?


I have a document I would like to parse one line at a time. My tokens are entire lines:

Pizza Is Great

12
14
17

red
blue
buckle my shoe

PS. I <3 

I could match the above with a grammar (pseudocode) something like:

text     → /.*/
int      → /[0-9]+/
blank    → /^\s*$/
Document → text + blank + int* + blank + text* + blank + text

What I want is to send each line independently into the parser as a token and try to match it, but every solution I have tried so far (scala-parser-combinators, FastParse, etc.) requires me to tediously define each token with the newline attached in order to break it apart correctly. Clearly I don't actually want my grammar to know about the newlines; they should be used to tokenize the input before it ever hits the parser.

Is there a Scala-compatible parsing solution that can work line-by-line in this way, so that the newlines disappear from my grammar definition entirely? (Could someone show me a simple example?)
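
To make the idea concrete, here is a minimal sketch of line-as-token parsing using only the Scala standard library. Everything in it (`LineGrammar`, `Parser`, `line`, `rep`, `Document` and its field names) is a hypothetical name chosen for illustration, not the API of any existing library. It also assumes `text` means "any non-blank line", because a literal `/.*/` would also match blank lines and make the greedy `text*` swallow the rest of the input.

```scala
object LineGrammar {
  // A parser consumes a prefix of the remaining lines and returns a value plus the rest.
  type Parser[A] = List[String] => Option[(A, List[String])]

  // Match exactly one line (one token) with a partial function.
  def line[A](f: PartialFunction[String, A]): Parser[A] = {
    case head :: tail if f.isDefinedAt(head) => Some((f(head), tail))
    case _                                   => None
  }

  // Zero or more greedy repetitions of p (p must consume at least one line).
  def rep[A](p: Parser[A]): Parser[List[A]] = in => {
    @scala.annotation.tailrec
    def loop(acc: List[A], rest: List[String]): (List[A], List[String]) =
      p(rest) match {
        case Some((a, next)) => loop(a :: acc, next)
        case None            => (acc.reverse, rest)
      }
    Some(loop(Nil, in))
  }

  // Token definitions: no newline appears anywhere in the grammar.
  val int: Parser[Int]     = line[Int] { case s if s.matches("[0-9]+") => s.toInt }
  val blank: Parser[Unit]  = line[Unit] { case s if s.trim.isEmpty => () }
  val text: Parser[String] = line[String] { case s if s.trim.nonEmpty => s } // assumption: non-blank

  // Hypothetical result type for the example document.
  final case class Document(title: String, numbers: List[Int], words: List[String], footer: String)

  // Document → text + blank + int* + blank + text* + blank + text
  val document: Parser[Document] = in =>
    for {
      (title, r1)  <- text(in)
      (_, r2)      <- blank(r1)
      (nums, r3)   <- rep(int)(r2)
      (_, r4)      <- blank(r3)
      (words, r5)  <- rep(text)(r4)
      (_, r6)      <- blank(r5)
      (footer, r7) <- text(r6)
    } yield (Document(title, nums, words, footer), r7)

  // Tokenize first (one token per line), then require the grammar to consume every line.
  def parse(input: String): Option[Document] =
    document(input.linesIterator.toList) match {
      case Some((doc, Nil)) => Some(doc)
      case _                => None
    }
}
```

Calling `LineGrammar.parse` on the sample document above should yield a `Document` holding the title, the three integers, the three words, and the closing line. The same shape should carry over to scala-parser-combinators, whose `Parsers` trait has an abstract `type Elem` and reads from a `Reader[Elem]`, so setting `type Elem = String` and supplying a reader over the list of lines is one way to get a real combinator library to treat each line as a token.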

There are 0 answers