I'm writing a simple lexer/parser with boost::spirit.
This is the lexer:
    template <typename Lexer>
    struct word_count_tokens : lex::lexer<Lexer>
    {
        word_count_tokens()
        {
            this->self.add_pattern
                ("WORD", "[a-z]+")
                ("NAME_CONTENT", "[a-z]+")
            ;
            word = "{WORD}";
            name = ".name";
            name_content = "{NAME_CONTENT}";
            this->self.add
                (word)
                (name)
                (name_content)
                ('\n')
                (' ')
                ('"')
                (".", IDANY)
            ;
        }
        lex::token_def<std::string> word;
        lex::token_def<std::string> name;
        lex::token_def<std::string> name_content;
    };
I defined two identical patterns: WORD and NAME_CONTENT.
This is the grammar:
    template <typename Iterator>
    struct word_count_grammar : qi::grammar<Iterator>
    {
        template <typename TokenDef>
        word_count_grammar(TokenDef const& tok)
          : word_count_grammar::base_type(start)
        {
            using boost::phoenix::ref;
            using boost::phoenix::size;
            start = tok.name >> lit(' ') >> lit('"') >> tok.word >> lit('"');
        }
        qi::rule<Iterator> start;
    };
This code works with tok.word in the grammar, but if I replace tok.word with tok.name_content it does not work, even though tok.word and tok.name_content are defined with identical patterns.
What is the issue with this code?
PS: what I want to parse is something like: .name "this is my name"
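Update: Oh, by the way, the problem is that only one token can match a given piece of input, and token definitions are matched in order. With two identical patterns, the one added first (word) always wins, so name_content is never produced. You /can/ work around this by using lexer states, but I don't recommend that any more than I recommend using a lexer here in the first place.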
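To make the "matched in order" point concrete, here is a toy model written with std::regex. It is not Spirit.Lex; it only assumes the usual lexer convention that the longest match wins and ties go to the rule defined first. The names `Rule`, `winning_token` and `question_rules` are made up for this illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Toy model only, NOT Spirit.Lex: each rule is (token name, pattern).
using Rule = std::pair<std::string, std::regex>;

// Returns the name of the rule that wins at the start of `input`:
// the longest match, with ties going to the rule defined first.
std::string winning_token(std::vector<Rule> const& rules, std::string const& input)
{
    std::string winner = "<no match>";
    std::size_t best_len = 0;
    for (auto const& rule : rules) {
        std::smatch m;
        // match_continuous: the match must begin at the first character.
        if (std::regex_search(input, m, rule.second,
                              std::regex_constants::match_continuous)) {
            auto const len = static_cast<std::size_t>(m.length(0));
            if (len > best_len) {  // strict '>' keeps the earlier rule on a tie
                best_len = len;
                winner = rule.first;
            }
        }
    }
    return winner;
}

// Mirrors the order used in the question: word is added before name_content.
std::vector<Rule> question_rules()
{
    return {
        {"word", std::regex("[a-z]+")},
        {"name_content", std::regex("[a-z]+")},
    };
}
```

Calling `winning_token(question_rules(), "hello")` yields `word`; in this model `name_content` can only win if it is listed first, which is why a grammar waiting for tok.name_content never sees it.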
My suggestion would be to use Qi directly, without a lexer.
My recollection of Lexer token patterns is one of exceedingly confusing escape requirements.
I might try to figure it out later, out of curiosity only.