GOLD Parser comment grammar


I'm having some trouble with comment blocks in my grammar. The syntax is fine, but Step 3 (building the DFA scanner) complains about the way I'm going about it.

The language I'm trying to parse looks like this:

{statement}{statement} etc.

Each statement can contain a couple of different types of comments:

{% This is a comment.
It can contain multiple lines
and continues until the statement end}

{statement  REM This is a comment.  
It can contain multiple lines  
and continues until the statement end}

This is a simplified grammar that displays the problem I'm running into:

"Start Symbol" = <Program>

{String Chars} = {Printable} + {HT} - ["\]
StringLiteral = '"' ( {String Chars} | '\' {Printable} )* '"'

Comment Start = '{%'
Comment End = '}'
Comment Block @= { Ending = Closed }  ! Eat the } and produce an empty statement
!Comment @= { Type = Noise }  !Implied by GOLD

Remark Start = 'REM'
Remark End = '}'
Remark Block @= { Ending = Open }  ! Don't eat the }, the statement expects it
Remark @= { Type = Noise }

<Program> ::= <Statements>
<Statements> ::= '{' <Statement> '}' <Statements> |  <>
<Statement> ::= StringLiteral

Step 3 complains that the } in <Statements> conflicts with the } used as the End of the lexical groups.

Anyone know how to accomplish what I need?

[Edit]
I got the REM portion working with the following:

{Remark Chars} = {Printable} + {WhiteSpace} - [}]
Remark = 'REM' {Remark Chars}* '}'
<Statements> ::= <Statements> '{' <Statement> '}'
              |  <Statements> '{' <Statement> <Remark Stmt>
              |  <>
<Remark Stmt> ::= Remark

This is actually ideal, since Remarks are not necessarily noise to me.

I'm still having issues with the comment lexical group. I'll look at solving it the same way.
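For illustration, here is an untested sketch of how the {% comment might be handled the same way, as an ordinary terminal instead of a lexical group. The names {Block Comment Chars} and BlockComment are hypothetical and not part of the original grammar, and the Comment Start / Comment End / Comment Block definitions would presumably have to be removed so that } is no longer claimed by a group. Since the terminal consumes the closing }, it stands in for a whole (empty) statement.

```
! Hypothetical names, modeled on the working Remark terminal above.
! {WhiteSpace} is added so the comment can span multiple lines.
{Block Comment Chars} = {Printable} + {WhiteSpace} - [}]
BlockComment = '{%' {Block Comment Chars}* '}'

<Statements> ::= <Statements> '{' <Statement> '}'
              |  <Statements> '{' <Statement> <Remark Stmt>
              |  <Statements> BlockComment    ! whole comment acts as an empty statement
              |  <>
```

This relies on the scanner preferring the longer match '{%' over the single '{' terminal, which should be checked in the GOLD Builder before depending on it.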

Answer by Jesper Sandgaard Sørensen:

I don't think capturing the REM comment with a lexical group is possible.

I think you need to define a new terminal like this:

{Remark Chars} = {Printable} + {WhiteSpace} - [}]
Remark = 'REM' {Remark Chars}*

This means, however, that you need to handle this new terminal in your productions.

E.g., from:

<CurlyStatement> ::= '{' <Statement> '}'

To:

<CurlyStatement> ::= '{' <Statement> '}'
                   | '{' <Statement> Remark '}'

I haven't checked the syntax in the examples above, but I hope you get the idea.