Nearley parser grammar for parsing opening and closing tags

504 views Asked by Ryan King At 21 April 2021 at 05:18

Say I have a simple language to parse in nearley that consists only of strings, for example "this is a string".

string -> "\"" chars "\""

However, a string can contain code within curly braces. To keep things simple, let's say the code can only be another string: "this is a string with {"code"}"

code -> "{" string "}"

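How do I define the new string rule in nearley so that it includes the code definition? My attempt below keeps producing a huge number of results, because chars can match one or more characters and so the same run of characters can be split up in many different ways.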

string -> "\"" charCode "\""

charCode -> (chars | code) charCode
| (chars | code)

code -> "{" string "}"

chars -> char chars
| char
char -> [^{}]
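
To see why the grammar above returns so many results, it can help to count derivations by hand. The sketch below is plain JavaScript, not nearley, and `countCharCodeParses` is a hypothetical helper name. It assumes a run of n plain characters with no code parts, and counts how many ways the recursive charCode rule can cut that run into non-empty chars pieces. Every such cut is a separate parse, so the count doubles with each extra character.

```javascript
// Count the distinct ways the charCode rule can derive a run of n plain
// characters (no {code} parts). Each derivation corresponds to one way of
// splitting the run into one or more non-empty `chars` pieces.
function countCharCodeParses(n, memo = new Map()) {
  if (n === 0) return 0; // charCode cannot derive the empty string
  if (memo.has(n)) return memo.get(n);
  let total = 1; // the whole run taken as a single `chars`
  for (let first = 1; first < n; first++) {
    // The first `chars` piece takes `first` characters;
    // the remainder must be derived by another charCode.
    total += countCharCodeParses(n - first, memo);
  }
  memo.set(n, total);
  return total;
}

console.log(countCharCodeParses(4));  // 8
console.log(countCharCodeParses(10)); // 512
```

The count is 2^(n-1), which is why even a short string gives a long results array.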

Ideally I'd be able to turn something like "chars {"code"} chars chars {"code"} chars" into the array ["chars ", "code", " chars chars ", "code", " chars"].

Perhaps it's only possible to do this using regex and moo, as suggested in the answer to "[Nearley]: how to parse matching opening and closing tag"? (The opening and closing tags are less ambiguous in this example, and I'm not experiencing the same issues.)

There is 1 answer

**rici** · Accepted Answer · 2021-04-21T06:52:38+00:00

I'd use a regex-based lexer, certainly. But you could try to write an unambiguous grammar, based on the observation that you can never have two adjacent chars in a charCode:

string -> "\"" charCodeStart chars:? "\""
charCodeStart -> null
               | charCodeStart chars:? code

Another possibility, using EBNF:

string -> "\"" ( char:* code ):* char:* "\""

You'll probably have to play with that a bit to get it right. I don't use nearley much.
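
As a rough illustration of the regex-lexer route, here is a minimal sketch in plain JavaScript. It does not use nearley or moo: a regex splits the input into tokens, and a small hand-written parser then follows the shape of the EBNF rule above. The names `tokenize`, `parseString` and `parse` are hypothetical. It also makes two assumptions that are not in the question: the lexer keeps `"` out of character runs (the question's `[^{}]` does not), and a code part whose inner string is plain text collapses to that text, while anything more deeply nested is kept as a nested array.

```javascript
// Lexer: every token is a quote, a brace, or a maximal run of other characters.
// Because runs are maximal, two `chars` tokens can never be adjacent.
function tokenize(input) {
  return input.match(/[^"{}]+|["{}]/g) || [];
}

// string -> '"' ( chars? code )* chars? '"'
// Returns the parsed parts and the position just past the closing quote.
function parseString(tokens, pos = 0) {
  if (tokens[pos] !== '"') throw new SyntaxError(`expected '"' at token ${pos}`);
  pos++;
  const parts = [];
  while (pos < tokens.length && tokens[pos] !== '"') {
    if (tokens[pos] === "{") {
      // code -> "{" string "}"
      const inner = parseString(tokens, pos + 1);
      if (tokens[inner.pos] !== "}") {
        throw new SyntaxError(`expected '}' at token ${inner.pos}`);
      }
      // Assumption: a plain inner string collapses to its text.
      const plain = inner.parts.length === 1 && typeof inner.parts[0] === "string";
      parts.push(plain ? inner.parts[0] : inner.parts);
      pos = inner.pos + 1;
    } else if (tokens[pos] === "}") {
      throw new SyntaxError(`unexpected '}' at token ${pos}`);
    } else {
      parts.push(tokens[pos]); // a run of ordinary characters
      pos++;
    }
  }
  if (tokens[pos] !== '"') throw new SyntaxError("unterminated string");
  return { parts, pos: pos + 1 };
}

function parse(input) {
  const tokens = tokenize(input);
  const { parts, pos } = parseString(tokens);
  if (pos !== tokens.length) throw new SyntaxError("unexpected trailing input");
  return parts;
}

console.log(parse('"chars {"code"} chars chars {"code"} chars"'));
// [ 'chars ', 'code', ' chars chars ', 'code', ' chars' ]
```

The point of lexing first is that the question of where one run of characters ends is settled before parsing starts, so the grammar only ever sees one token per run and the ambiguity goes away.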
