Parsing single line and multi line comments with Arpeggio

Question

Asked by Stefano Bragaglia on 09 February 2024 at 19:02

I'm trying to use Arpeggio to parse a file containing single-line and multi-line comments.

Arpeggio's documentation suggests looking at their "simple" example to see how to deal with comments (see documentation and linked code). The example does include the following definition:

def comment():          return [_(r"//.*"), _(r"/\*.*\*/")]

which is used by the parser as follows:

parser = ParserPython(simpleLanguage, comment, debug=debug)

Unfortunately, their example input doesn't contain any comments, so it's not possible to see how the rule works. If I add the following dummy comments to the example:

/*
This is a multi-line comment.
*/
// This is a single-line comment.
function fak(n) {
...

then the following exception is raised:

arpeggio.NoMatch: Expected '//.*' or keyword at position (1, 1) => '*/* This is'.

which seems to suggest that the start of the file matches neither the comment rule nor the keyword "function", which is the first token that the simpleLanguage production allows.

Does anyone know how we are supposed to deal with comments?

Please find below an MRE (minimal reproducible example) in case it helps with debugging the problem:

from __future__ import unicode_literals

import os

from arpeggio import *
from arpeggio import RegExMatch as _


def comment():  return [_(r"//.*"), _(r"/\*.*\*/")]
def document(): return Kwd("hello"), _(r"[a-z]+"), '!', EOF


def main(filename, debug=False):
    current_dir = os.path.dirname(__file__)
    content = open(os.path.join(current_dir, filename), "r").read()
    parser = ParserPython(document, comment, debug=debug)
    parse_tree = parser.parse(content)


if __name__ == "__main__":
    main('simple.ex', debug=True)

and the content of the file to parse:

/*
This is a multi-line comment.
*/
// This is a single line comment.
hello world!

Original Q&A

**Balduin Scheffbuch** · Accepted Answer · 2024-02-09T20:30:43+00:00

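Your version of the comment() rule does not handle line breaks: in Python regular expressions, '.' does not match a newline by default, so /\*.*\*/ can only match a block comment that opens and closes on the same line. It might work if you adjust the pattern to accept both whitespace and non-whitespace characters (and therefore newlines), matching lazily so that it stops at the first */: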

def comment():  return [_(r"//.*"), _(r"/\*[\s\S]*?\*/")]
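To see why the change matters, here is a minimal sketch that uses only Python's re module, not Arpeggio itself. It assumes that Arpeggio's RegExMatch passes the pattern to re without the DOTALL flag by default, which is consistent with the error in the question; the names SOURCE, ORIGINAL_BLOCK, FIXED_BLOCK and the two-comment string are made up for this illustration.

```python
import re

# Input mirroring the file from the question.
SOURCE = """/*
This is a multi-line comment.
*/
// This is a single line comment.
hello world!
"""

# Patterns copied from the question and from the answer.
ORIGINAL_BLOCK = re.compile(r"/\*.*\*/")
FIXED_BLOCK = re.compile(r"/\*[\s\S]*?\*/")
LINE_COMMENT = re.compile(r"//.*")

# '.' does not match a newline by default, so the original pattern
# cannot get from "/*" on line 1 to "*/" on line 3.
original_match = ORIGINAL_BLOCK.match(SOURCE)

# [\s\S] matches any character, newlines included.
fixed_match = FIXED_BLOCK.match(SOURCE)

# After the block comment and the whitespace that follows it,
# the single-line pattern matches as before.
rest = SOURCE[fixed_match.end():].lstrip()
line_match = LINE_COMMENT.match(rest)

# The lazy '*?' matters once there is more than one block comment:
# a greedy '*' would run on to the last "*/" and swallow the code between.
two_comments = "/* a */ x /* b */"
greedy = re.match(r"/\*[\s\S]*\*/", two_comments).group()
lazy = re.match(r"/\*[\s\S]*?\*/", two_comments).group()

if __name__ == "__main__":
    print("original pattern:", original_match)
    print("fixed pattern:   ", repr(fixed_match.group()))
    print("line comment:    ", repr(line_match.group()))
    print("greedy vs lazy:  ", repr(greedy), "vs", repr(lazy))
```

In the MRE from the question, the only change needed is the second alternative in comment(); the document rule and the ParserPython(document, comment, debug=debug) call stay as they are.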
