pegjs: How to handle character class preceded by more general class


I have identifiers that may contain dots, but not as the last character. For example, I would like to parse "date.ymd" as an identifier but "execute." as an identifier followed by a punctuation character. As a regexp this would be ([a-zA-Z_][a-zA-Z0-9_.]*[a-zA-Z0-9_])|([a-zA-Z_][a-zA-Z0-9_]?). How can I accomplish this?

I have been trying:

    program = identifier '.'
    identifier = piv:identifierValue {return {type: 'identifier', value: piv};}
    identifierValue
    = $(identifierHeadLetter (identifierTailLetter / '.')* identifierTailLetter*)
    identifierHeadLetter = letter / '_'
    identifierTailLetter "tail character" = letter / digit / '_'
    digit "digit" = [0-9]
    letter "letter" = [a-zA-Z]

but with "execute." as input I get: { "type": "identifier", "value": "execute." }

If in the rule for identifierValue I change "identifierTailLetter*" to "identifierTailLetter" I get: Line 1, column 9: Expected "." or tail character but end of input found.

I suspect my question is similar to "PegJS - match all characters including ) except if ) is the last character", but I cannot work out a similar solution for my case.

There is 1 answer

Answered by Frans Houweling

Answering my own question: it is as simple as changing the rule for identifierTailLetter to:

    identifierTailLetter "tail character"
    = letter / digit / '_' / ('.' &identifierTailLetter)
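Why this works: `&identifierTailLetter` is a positive lookahead, so a dot is only consumed when another tail character follows it, and a trailing dot is left in the input for the next rule. One caveat, as far as I can tell: the separate `/ '.'` alternative in identifierValue has to go as well (leaving `$(identifierHeadLetter identifierTailLetter*)`), otherwise it would still swallow the trailing dot. The sketch below is not PEG.js output, just a hand-written JavaScript imitation of the two rules under that assumption. The names `matchTail` and `matchIdentifier` are made up for illustration.

```javascript
// Hand-written imitation of the PEG rules, for illustration only.
// matchTail returns the position after a match, or -1 on failure.

const isHead = (ch) => /^[A-Za-z_]$/.test(ch);
const isWord = (ch) => /^[A-Za-z0-9_]$/.test(ch);

// identifierTailLetter = letter / digit / '_' / ('.' &identifierTailLetter)
function matchTail(input, pos) {
  const ch = input[pos];
  if (ch === undefined) return -1;   // end of input: nothing to match
  if (isWord(ch)) return pos + 1;    // letter / digit / '_'
  // '.' &identifierTailLetter: take the dot only if a tail character follows.
  // The lookahead inspects the input at pos + 1 but does not consume it.
  if (ch === "." && matchTail(input, pos + 1) !== -1) return pos + 1;
  return -1;
}

// identifierValue = $(identifierHeadLetter identifierTailLetter*)
// (assumed form: the plain '.' alternative has been dropped)
function matchIdentifier(input) {
  if (input.length === 0 || !isHead(input[0])) return null;
  let pos = 1;
  // Greedy repetition, like PEG's '*': keep going while a tail character matches.
  for (let next = matchTail(input, pos); next !== -1; next = matchTail(input, pos)) {
    pos = next;
  }
  return { value: input.slice(0, pos), rest: input.slice(pos) };
}

console.log(matchIdentifier("date.ymd")); // value "date.ymd", nothing left over
console.log(matchIdentifier("execute.")); // value "execute", "." left over
```

With "execute." the final dot stays in `rest`, which corresponds to the `'.'` in `program = identifier '.'` getting its chance to match. Because the lookahead refers to identifierTailLetter itself, consecutive dots inside an identifier such as "a..b" are still accepted, just as in the regexp from the question.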

Thanks and sorry