Using PHP 8's preg_match_all() and its $Matches output parameter, I need to split a template into a list of literal text runs and delimited tokens.
$x = preg_match_all($Regex, $Template, $Matches, PREG_OFFSET_CAPTURE); // Parse the template.
The catch is that tokens can be nested, and I need to match only the outermost token of each nest.
Example:
This {is {{Par}m1}} plus {{Par{m3a{{Parm3b}}}} a}nd {{Parm4a||{{Par}m4b||{{Parm4c||{{Parm4d||Parm}}}}}}}}.
Should parse into this:
Match 1: This {is
Match 2: {{Par}m1}}
Match 3: plus
Match 4: {{Par{m3a{{Parm3b}}}}
Match 5: a}nd
Match 6: {{Parm4a||{{Par}m4b||{{Parm4c||{{Parm4d||Parm}}}}}}}}
Match 7: .
Notice above that single curly braces should be allowed in tokens or in text.
Only double curly braces are considered token delimiters.
The regular expression that I have so far works only if there are no single curly braces in the text or tokens.
My regex:
(?:(?!(\{\{)).)+|((\{\{)((?>[^{}]+|(?2))*)(\}\}))
I cannot figure out how to allow single curly braces in the text or inside tokens without breaking the list of matches.
Any help greatly appreciated!
UPDATE
I am continuing to work on this problem and came up with this:
\{\{(?R)*\}\}|[^{}]+
It uses the recursion operator (?R), but it suffers from the same issue: single curly braces break the parsing.
The intended delimiters are the opening and closing double curly braces "{{" and "}}".
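To make the failure concrete, here is a small sketch that runs the pattern from the update on a short input containing single braces. The input string is made up for illustration; it is not part of my real template.

```php
<?php
// Hypothetical input: one stray "{" in the text and one stray "}" inside a token.
$Template = 'a {b {{c}d}} e';

// The pattern from the update. [^{}]+ can never consume a brace, so a single
// brace is skipped by the scanner, and a token that contains one cannot be
// matched as a whole.
$Regex = '~\{\{(?R)*\}\}|[^{}]+~';

preg_match_all($Regex, $Template, $Matches);

// Every brace is missing from the matches and the token is split apart.
var_export($Matches[0]);
```

Joining the matches back together does not reproduce the input, which is a quick way to check whether a candidate pattern loses characters.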
I think I found the solution. So far, testing suggests that it works.
The regex is
Testing
Parsing this:
Yields this:
So far it seems to be working.
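For anyone who wants something to experiment with, below is a minimal sketch of one pattern that appears to meet the requirements in the question. It is an assumption for illustration and not necessarily the regex referred to above. The idea is to treat a brace as "single" when it is not followed by another brace of the same kind, and to recurse into a capture group for nested tokens.

```php
<?php
// Sketch only: this pattern is an assumption, not necessarily the one above.
$Regex = '~'
    . '(\{\{'                 // group 1: a token starts with "{{"
    . '(?:[^{}]++'            //   a run of non-brace characters
    . '|\{(?!\{)'             //   or a single "{" (not followed by "{")
    . '|\}(?!\})'             //   or a single "}" (not followed by "}")
    . '|(?1))*'               //   or a nested token (recurse into group 1)
    . '\}\})'                 // and ends with "}}"
    . '|(?:[^{]++|\{(?!\{))+' // literal text: everything up to the next "{{"
    . '~';

// The example template from the question.
$Template = 'This {is {{Par}m1}} plus {{Par{m3a{{Parm3b}}}} a}nd '
    . '{{Parm4a||{{Par}m4b||{{Parm4c||{{Parm4d||Parm}}}}}}}}.';

$Count = preg_match_all($Regex, $Template, $Matches, PREG_OFFSET_CAPTURE);

// With PREG_OFFSET_CAPTURE each entry of $Matches[0] is [text, byte offset].
foreach ($Matches[0] as $i => [$Text, $Offset]) {
    $Kind = str_starts_with($Text, '{{') ? 'token' : 'text';
    printf("Match %d (%s, offset %d): %s\n", $i + 1, $Kind, $Offset, $Text);
}
```

One caveat with this sketch: an opening "{{" that is never closed matches neither alternative, so its first brace is left out of the matches. Malformed templates would need separate handling.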