So far I am splitting my string on two delimiters, but I would like to extend this so that the number of delimiters is variable. Right now, I have this function:
void dac_sim::dac_ifs::dac_sim_subcmd_if::parse_cmd(std::string command, std::array<std::string, 2> delimiters)
{
    std::string str = command;
    std::vector< std::string > vec;
    auto it = str.begin(), end = str.end();
    bool res = boost::spirit::qi::parse(it, end,
        boost::spirit::qi::as_string[ *(boost::spirit::qi::char_ - delimiters[0] - delimiters[1]) ] % (boost::spirit::qi::lit(delimiters[0]) | boost::spirit::qi::lit(delimiters[1])),
        vec);
    std::cout << "Parsed:";
    for (auto const& s : vec)
        std::cout << " \"" << s << "\"";
    std::cout << std::endl;
}
But now I want something more generic, via template for the array size, like this:
template <size_t N>
void dac_sim::dac_ifs::dac_sim_subcmd_if::parse_cmd(std::string command, std::array<std::string, N> delimiters)
In this case, how can I proceed?
Fold Expressions
Can you use c++17? I'd use fold-expressions:
But You Need Arrays?
You can always use the index-sequence trick to transform into a parameter pack:
Where do_parse_cmd is the function just shown above. Let's demo with ";" added as a third delimiter.
Problems
Versions
For one, the above requires c++17 for the fold-expressions, and the demos also liberally use c++20 features to make it all easy to demonstrate. If you don't have that, even the c++17 version will become a lot more tedious.
Semantic problems
There's an issue when the caller passes delimiters in a sub-optimal order. E.g., {":", ":|:"} won't work, but {":|:", ":"} will. That's because the patterns overlap: the alternatives are tried in the order given, so ":" matches first and ":|:" is never attempted. You would want to be smarter.
Flexibility
You might want to be able to have full-blown parser expression capability instead of fixed string literals. Let me postpone this for later.
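Coming back to the ordering issue under Semantic problems for a moment: one possible way to be smarter while keeping the array interface is to reorder the delimiters longest-first before building the alternatives, so that a delimiter which is a prefix of another one is tried later. This is my own sketch, not something the approaches above do for you, and the name longest_first is hypothetical:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the delimiters ordered longest-first, so that
// ":|:" is tried before its prefix ":" in an ordered alternative.
template <std::size_t N>
std::array<std::string, N> longest_first(std::array<std::string, N> delimiters) {
    // stable_sort keeps the caller's order among delimiters of equal length.
    std::stable_sort(delimiters.begin(), delimiters.end(),
                     [](std::string const& a, std::string const& b) {
                         return a.size() > b.size();
                     });
    return delimiters;
}
```

The sorted array can then be expanded into a parameter pack as usual. This only addresses the prefix case for fixed string delimiters.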
Qi Symbols
To support c++11 and solve the semantic issue, let's use qi::symbols. This internally builds a Trie, so the order in which delimiters are passed doesn't matter. A single delim expression will always match the longest possible delimiter.
The same test works with this version, compiled as c++11.
Future Proofing
To be completely flexible and compose the parser from any parser expression, you would have to thread the needle in Qi and accept considerable compile times.
Suffice it to say, I won't recommend it. However, using X3¹ none of this is hard, and you could easily achieve it.
Identical X3 version
'Nuff said.
Generalize (Computer, Enhance!)
Basically, replace std::string with auto in the fold-expression variant. Now you can do funky stuff, like passing arbitrary parser expressions as delimiters.
Summary/TL;DR
Combining parsers in X3 is a joy, and crazy powerful. It will typically still be faster to compile than the Qi parsers.
Note that at no point in this answer did I question why you are reinventing tokenization using a (checks notes) parser generator. Perhaps you should tell me what you're actually building or parsing, and I could give you some real advice on how to use Spirit for great success :)
¹ which is c++14 only and will become c++17 only in the future