Is there a way to do context-sensitive parsing in TatSu?

Question

Asked by Robin Becker on 03 November 2019 at 09:05 (93 views)

context sensitive '%' ..... eol comments

I'm starting with the grammar for PDF described here:

https://github.com/caradoc-org/caradoc/blob/master/doc/grammar/grammar.pdf

which seems to lack the definition of eol comments.

PDF has end-of-line comments, which start with the '%' character, except inside string_literal (and inside another rule, stream).

string_literal = "(" string_content ")";

where string_content can include the '%' character and also eol, but not "()" etc. The PDF language also has some special cases that would otherwise look like comments, e.g.

'%PDF-1.5' eol;

or

"%%EOF" [eol];

Is there a way to handle this context sensitivity in a TatSu grammar?

There is 1 answer

**Apalala** · Answer 1 · 2019-11-04T00:31:39+00:00

I'll stay away from the term "context sensitive" in this answer, because the phrase has a specific meaning in language theory.

PEG is perfectly capable of parsing a sub-language (say, Python string formatting expressions) within another language.

In fact, the original PEG definition does not use a tokenizer, because PEG grammars can parse the token sub-language.

If you think of sub-grammars, then the context is provided by the rule that knows that a sub-grammar has to be invoked.
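To make the idea concrete for the '%' case in the question, here is a minimal sketch in plain Python. It is a hand-written, PEG-style scanner, not TatSu's API, and every name in it (`parse_string`, `tokenize`, the pattern constants) is hypothetical. It assumes a much simplified PDF-like input: the header and `%%EOF` markers are recognized anywhere rather than only at their proper positions, and streams are ignored. The point it tries to show is that the comment rule is only tried at the top level, so inside the string rule '%' is just content.

```python
import re

# Hypothetical names throughout: a hand-written PEG-style sketch, not TatSu's API.
HEADER = re.compile(r"%PDF-\d\.\d")   # special case that looks like a comment
EOF_MARKER = re.compile(r"%%EOF")     # another special case
COMMENT = re.compile(r"%[^\r\n]*")    # ordinary '%' comment up to end of line
SPACE = re.compile(r"\s+")
WORD = re.compile(r"[^\s()%]+")


def parse_string(text, pos):
    """string_literal = "(" string_content ")" with balanced nesting.

    Inside this rule '%' is ordinary content, because the comment
    rule is never tried here.
    """
    assert text[pos] == "("
    depth = 0
    i = pos
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            i += 2  # skip the backslash and the escaped character
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return ("string", text[pos + 1:i]), i + 1
        i += 1
    raise SyntaxError(f"unterminated string starting at offset {pos}")


def tokenize(text):
    """Top level: strings first, then an ordered choice in which the
    special markers are tried before the generic comment rule."""
    pos, out = 0, []
    while pos < len(text):
        m = SPACE.match(text, pos)
        if m:
            pos = m.end()
            continue
        if text[pos] == "(":
            node, pos = parse_string(text, pos)  # enter the sub-grammar
            out.append(node)
            continue
        for kind, pattern in (("header", HEADER), ("eof", EOF_MARKER),
                              ("comment", COMMENT), ("word", WORD)):
            m = pattern.match(text, pos)
            if m:
                if kind != "comment":  # comments are dropped
                    out.append((kind, m.group()))
                pos = m.end()
                break
        else:
            raise SyntaxError(f"unexpected character at offset {pos}")
    return out


sample = "%PDF-1.5\n(50% off (really)) % a comment\n42 obj\n%%EOF\n"
tokens = tokenize(sample)
```

For the sample above, the '%' inside the parentheses stays part of the string, the trailing `% a comment` is dropped, and the header and `%%EOF` lines survive as their own tokens because they are tried before the comment pattern. In a grammar, the same ordering would be expressed as an ordered choice, with the string rule written so that comment skipping does not apply inside it.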

TatSu has features that allow tokenization to happen before parsing (the Buffer class), for efficiency and convenience, but using those features is not mandatory.

The only cases that cannot be handled easily as a grammar within a grammar are those that involve preprocessing with macro capabilities, because those require an interpretation phase before the text for the inner grammar can be parsed.
