Getting the outermost delimiters in Regular Expressions


Is there a way using regular expressions to get the text in between outermost delimiters? I have a string here and want to get the text in between the outermost {%%% and %%%} delimiters:

Hello {%%%=Select(DepartmentID,1,{%%%=if(Gender="M","Mr.","Ms.")%%%}%%%} {%LastName%}

The text I want to get is:

=Select(DepartmentID,1,{%%%=if(Gender="M","Mr.","Ms.")%%%}

What would the regular expression for that be? I know the text inside does not make much sense; this is just an example.


There are 4 answers

l'L'l (BEST ANSWER)

This pattern uses a greedy capture group, so it captures everything up to the last closing delimiter:

[^%=]*.{%%%(.+)%%%}.+[^%}]*

capture group:

$1

example: http://regex101.com/r/eG4fV9
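As a rough illustration, the pattern above can be tried on the sample string from the question. This is only a sketch in Python's `re` module, which the answer itself does not mention; the braces are escaped so they are unambiguously literal, and the variable names are made up for the example.

```python
import re

# Sample input copied from the question.
text = 'Hello {%%%=Select(DepartmentID,1,{%%%=if(Gender="M","Mr.","Ms.")%%%}%%%} {%LastName%}'

# The greedy (.+) backtracks from the end of the string to the LAST "%%%}",
# so group 1 spans everything between the outermost pair of delimiters.
pattern = re.compile(r'[^%=]*.\{%%%(.+)%%%\}.+[^%}]*')

match = pattern.search(text)
outer = match.group(1) if match else None
print(outer)
```

Because the capture is greedy, a line holding two separate top-level blocks would be captured as one span from the first opening delimiter to the last closing one, so this only suits input with a single outermost block.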

EDIT: It seems some people enjoy coming along after an answer has been chosen as correct and adding possible scenarios where it won't work. That's fine; however, depending on the circumstances in which it is used, any pattern can turn out to be incorrect.

original answer:

(?<={%%%=).+(?=}%%%)[^%]

optional:

[^%=]*.{%%%=(.+)%%%}.+[^%}]*

This variant matches the = sign outside the capture group, so $1 does not include it; the first pattern keeps the = sign in $1.

Amadan

In general, unless the delimiters have some unique feature you can exploit (as Eugen Rieck noted in the comments; his is a good specific solution if only he changed it to non-greedy), standard regular expressions can't do this.

Some regular expression engines, like Ruby's Oniguruma, can, by using recursive regexps. Something like (off the top of my head):

/{(?<braced>[^{}]*(?:{\g<braced>}[^{}]*)?)}/

Demo
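Where the engine at hand has no recursion, one fallback is to skip the regex and track nesting depth by hand. The sketch below is not the approach from this answer; it is a plain depth counter written in Python, and the function name `outermost_blocks` and its parameters are hypothetical.

```python
def outermost_blocks(text, open_tok="{%%%", close_tok="%%%}"):
    """Return the contents of every top-level open_tok ... close_tok block."""
    blocks = []
    depth = 0   # how many delimiters are currently open
    start = 0   # index just past the outermost opening delimiter
    i = 0
    while i < len(text):
        if text.startswith(open_tok, i):
            if depth == 0:
                start = i + len(open_tok)
            depth += 1
            i += len(open_tok)
        elif depth > 0 and text.startswith(close_tok, i):
            depth -= 1
            if depth == 0:
                # The outermost block just closed; keep its inner text.
                blocks.append(text[start:i])
            i += len(close_tok)
        else:
            i += 1
    return blocks


sample = 'Hello {%%%=Select(DepartmentID,1,{%%%=if(Gender="M","Mr.","Ms.")%%%}%%%} {%LastName%}'
print(outermost_blocks(sample))
```

Unlike a single greedy capture, this handles any nesting depth and several top-level blocks in one string. Unbalanced input is silently ignored here, which a real implementation might prefer to report.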

jshanley

Try this: /.*?\{\%\%\%(.*)\%\%\%\}.*/

Here's a fiddle

zx81

Chris, here are two options that match what you're looking for, using recursive regex:

Option 1:

\{((?:[^{}]++|(?R))*)\}

Option 2:

\{(([^{}]*+)(?:(?R)(?2))*)\}

This is PCRE syntax, which works, for instance, in PHP.

What language and regex engine are you using? These patterns can be adapted to a few other flavors that support recursion.