I have a data file format which includes
- /* comments */
- /* nested /* comments */ too */ and
- // c++ style single-line comments..
As usual, these comments can occur everywhere in the input file where normal white space is allowed.
Hence, rather than pollute the grammar proper with pervasive comment-handling, I have made a skipper parser which will handle white space and the various comments.
So far so good, and I am able to parse all my test cases.
In my use case, however, any of the parsed values (double, string, variable, list, ...) must carry the comments preceding it as an attribute, if one or more comments are present. That is, my AST node for double should be
struct Double {
    double value;
    std::string comment;
};
and so forth for all the values I have in the grammar.
Hence I wonder if it is possible somehow to "store" the collected comments in the skipper parser, and then have them available for building the AST nodes in the normal grammar?
The skipper which processes comments:
template<typename Iterator>
struct SkipperRules : qi::grammar<Iterator> {
    SkipperRules() : SkipperRules::base_type(skipper) {
        single_line_comment = lit("//") >> *(char_ - eol) >> (eol | eoi);
        block_comment = ((string("/*") >> *(block_comment | char_ - "*/")) >> string("*/"));
        skipper = space | single_line_comment | block_comment;
    }
    qi::rule<Iterator> skipper;
    qi::rule<Iterator, std::string()> block_comment;
    qi::rule<Iterator, std::string()> single_line_comment;
};
I can store the comments using a global variable and semantic actions in the skipper rule, but that seems wrong and probably won't play well in general with parser backtracking. What's a good way to store the comments so they are later retrievable in the main grammar?
Good thinking. See Boost Spirit: "Semantic actions are evil"?. Also, in your case it would unnecessarily complicate the correlation of source location with the comment.
You cannot. Skippers are implicitly qi::omit[] (like the separator in the Kleene-% list, by the way).
There you have it: your comments are not comments. You need them in your AST, so you need them in the grammar.
Ideas
I have several ideas here.
1. You could simply not use the skipper to soak up the comments, which, as you mention, is going to be cumbersome/noisy in the grammar.
2. You could temporarily override the skipper to just be qi::space at the point where the comments are required, either tersely or, given your AST, in a slightly more verbose form.
Notes:
- double_, string_ and int_ are declared with qi::space_type as the skipper (see Boost spirit skipper issues).
- The comment_ rule is assumed to expose a std::string() attribute. This is fine if used in the skipper context as well, because the actual attribute will be bound to qi::unused_type, which compiles down to no-ops for attribute propagation.
3. A fancy solution might be to store the soaked-up comment(s) in a "parser state" (e.g. a member variable) and then use on_success handlers to transfer that value into the rule attribute on demand (and optionally flush comments on certain rule completions).
Effectively this is a hybrid: yes, you use semantic actions to "side-channel" the comment values. However, it's less unwieldy because now you can deterministically "harvest" those values in the on_success handler. If you don't prematurely reset the comments, it should even work generically under backtracking.
A gripe with this is that it will be slightly less transparent to reason about the mechanics of "magic comments". However, it does sit well for two reasons:
I think option 2. is the "straight-forward" approach that you might not have realized. Option 3. is the fancy approach, in case you want to enjoy the greater genericity/flexibility. E.g. what will you do with
Or, what about
These would be easier to deal with correctly in the 'fancy' case.
So, if you want to avoid complexity, I suggest option 2. If you need the flexibility, I suggest option 3.