How do I define a default rule in EBNF/Tatsu?

I have a problem with my EBNF grammar and its Tatsu implementation. Here is an extract of the grammar for Tatsu:

define  ='#define' constantename [constante] ;
constante = CONSTANTE ;  
CONSTANTE = ( Any | ``true`` ) ;
Any = /.*/ ;  
constantename = (/[A-Z0-9_()]*/) ;

When I test with:

#define _TEST01_ "test01"
#define _TEST_
#define _TEST02_ "test02"

I get:

[
    "#define",
    "_TEST01_",
    "\"test01\""
],
[
    "#define",
    "_TEST_",
    "#define _TEST02_ \"test02\""
]

But I want this:

[
    "#define",
    "_TEST01_",
    "\"test01\""
],
[
    "#define",
    "_TEST_",
    "true"
],
[
    "#define",
    "_TEST02_",
    "\"test02\""
]

Where is my mistake?

Thanks a lot...

1 answer

Answered by sepp2k

The problem is that Tatsu skips white space, including newlines, between elements by default. So when you apply the rule '#define' constantename [constante] to the input:

#define _TEST_
#define _TEST02_ "test02"

It first matches #define with '#define', then skips the space, then matches _TEST_ with constantename, then skips the newline, and then matches #define _TEST02_ "test02" with Any (via constante).
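The matching sequence described here can be imitated in plain Python. This is only a sketch using the standard `re` module, not Tatsu itself: `parse_define`, `NAME` and `ANY` are hypothetical names, and the `skip` pattern stands in for whatever the parser skips between elements.

```python
import re

# Hypothetical stand-ins for the grammar's rules; this is not Tatsu.
NAME = re.compile(r"[A-Z0-9_()]*")  # constantename
ANY = re.compile(r".*")             # Any ('.' does not match a newline)


def parse_define(text, pos, skip):
    """Imitate '#define' constantename [constante], skipping `skip` between elements."""
    def skip_ws(p):
        return skip.match(text, p).end()

    pos = skip_ws(pos)
    if not text.startswith("#define", pos):
        return None
    pos = skip_ws(pos + len("#define"))
    name = NAME.match(text, pos)
    pos = skip_ws(name.end())       # with \s*, this step swallows the newline
    value = ANY.match(text, pos)
    return ["#define", name.group(), value.group()], value.end()


text = '#define _TEST_\n#define _TEST02_ "test02"\n'

# Skipping all whitespace, newlines included: the next line is consumed as the value.
print(parse_define(text, 0, re.compile(r"\s*"))[0])

# Skipping only tabs and spaces: the value stays empty for this line.
print(parse_define(text, 0, re.compile(r"[\t ]*"))[0])
```

With `\s*` the third element is the whole second line, which is the unwanted result from the question. With `[\t ]*` the match stops at the end of the first line.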

Note that that's exactly the behaviour you'd want (I assume) if the newline weren't there:

#define _TEST_ #define _TEST02_ "test02"

Here you'd want the output ["#define", "_TEST_", "#define _TEST02_ \"test02\""], right? At least the C preprocessor would handle it the same way in that case.

So what that tells us is that the newline is significant. Therefore you can't ignore it. You can tell Tatsu to only ignore tabs and spaces (not newlines) either by passing whitespace = '\t ' as an option when creating the parser, or by adding this line to the grammar:

@@whitespace :: /[\t ]+/

Now you'll need to mention newlines explicitly everywhere they should go, so your rule becomes:

define  ='#define' constantename [constante] '\n';

Now it's clear that the constant, if present, must appear before the line break, so for the line #define _TEST_ the parser sees that there is no constant.

Note that you'll also want a rule to match empty lines, so empty lines aren't syntax errors.
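As a rough illustration of the line-aware behaviour, here is a plain-Python sketch using `re` rather than Tatsu. `DEFINE`, `BLANK` and `parse_defines` are hypothetical names. The fallback to "true" is done in Python here only to mirror the output the question asks for; it is an assumption, not something the grammar rule above produces on its own.

```python
import re

# Hypothetical regex mirroring: '#define' constantename [constante] '\n'
# with only tabs and spaces skipped between elements. This is not Tatsu.
DEFINE = re.compile(r"[\t ]*#define[\t ]*([A-Z0-9_()]*)[\t ]*([^\n]*)\n")
# Separate pattern for empty lines, so they are not syntax errors.
BLANK = re.compile(r"[\t ]*\n")


def parse_defines(text):
    """Parse newline-terminated #define lines, skipping blank lines."""
    results, pos = [], 0
    while pos < len(text):
        blank = BLANK.match(text, pos)
        if blank:
            pos = blank.end()
            continue
        m = DEFINE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        name, value = m.groups()
        # Assumption: substitute "true" when no constant is present.
        results.append(["#define", name, value or "true"])
        pos = m.end()
    return results


source = '#define _TEST01_ "test01"\n#define _TEST_\n\n#define _TEST02_ "test02"\n'
for entry in parse_defines(source):
    print(entry)
```

Because the newline is part of the pattern, each line is matched on its own, and a final line without a trailing newline is rejected, just as a rule ending in '\n' would reject it.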