Regular expression, back reference or alternate construct


I am trying to write a regular expression in .NET to capture the whole function from a list of functions that look something like this.

public string Test1()
{
    string result = null;
    foreach(var item in Entity.EntityProperties)
    {
        result +=string.Format("inner string with bracket{0}", "test");
    }
    return result;
}
public string Test5()
{
    return string.Format("inner string with bracket{0}", "test");
}

public string Last()
{
    return string.Format("inner string with bracket{0}", "test");
}

So far I have:

((?<function>public string (?<fName>\w+)\(\)\s*{.*?})(?=\s*public string))

This captures every function except the last one, because the lookahead requires another "public string" to follow. I also tried:

((?<function>public string (?<fName>\w+)\(\)\s*{.*?})(?=\s*(public string)|$))

This matches every function correctly except the first one, which is only matched partially:

public string Test1()
{
    string result = null;
    foreach(var item in Entity.EntityProperties)
    {
        result +=string.Format("inner string with bracket{0}", "test");
    } <-- the first capture only gets to this point.

Any ideas? Please provide some explanation if possible.


There are 2 answers

2
firefly On BEST ANSWER

It is actually possible to check for matching brackets in .NET. The key is to use a balancing group. I had heard of it before, which is why I asked the question; I just wasn't sure how to write the expression myself, so I was hoping one of the resident regex experts could help me out :)

Luckily I found this website, which explains balancing groups in detail; the author even provides a template. Here it is for everyone else's reference.

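http://blog.stevenlevithan.com/archives/balancing-groups

The gist of the pattern is below. It is written in free-spacing form, so it needs RegexOptions.IgnorePatternWhitespace (and RegexOptions.Singleline if the braces span several lines):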

{
    (?>
        (?! { | } ) .
    |
        { (?<Depth>)
    |
        } (?<-Depth>)
    )*
    (?(Depth)(?!))
}

Check out his blog for a detailed explanation.
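
To show how the template might be applied to the functions in the question, here is a minimal sketch. It assumes the "public string Name()" signature from the question is prepended to the balancing-group body; the class name BalancedFunctionDemo, the method FindFunctions and the Sample constant are made up for illustration, and the braces are escaped only for readability.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancedFunctionDemo
{
    // Signature from the question, followed by the balancing-group template.
    // Free-spacing pattern: whitespace is ignored and '#' starts a comment.
    private const string Pattern = @"
        (?<function>
            public\s+string\s+(?<fName>\w+)\(\)\s*
            \{
                (?>
                    (?! \{ | \} ) .     # any character that is not a brace
                |
                    \{ (?<Depth>)       # opening brace: push onto Depth
                |
                    \} (?<-Depth>)      # closing brace: pop from Depth
                )*
                (?(Depth)(?!))          # fail if a brace is still open
            \}
        )";

    // Sample input copied from the question.
    public const string Sample = @"public string Test1()
{
    string result = null;
    foreach(var item in Entity.EntityProperties)
    {
        result +=string.Format(""inner string with bracket{0}"", ""test"");
    }
    return result;
}
public string Test5()
{
    return string.Format(""inner string with bracket{0}"", ""test"");
}

public string Last()
{
    return string.Format(""inner string with bracket{0}"", ""test"");
}";

    public static MatchCollection FindFunctions(string source)
    {
        // Singleline lets '.' cross line breaks inside a function body.
        return Regex.Matches(source, Pattern,
            RegexOptions.IgnorePatternWhitespace | RegexOptions.Singleline);
    }

    public static void Main()
    {
        // Prints the three function names: Test1, Test5, Last
        foreach (Match match in FindFunctions(Sample))
        {
            Console.WriteLine(match.Groups["fName"].Value);
        }
    }
}
```

Note that this only counts braces. A lone "{" or "}" inside a string literal or a comment would throw the count off, so it is not a substitute for a real C# parser.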

0
Tim Pietzcker On

Although I like regexes a lot, in your case they won't work because nested structures are not "regular" and therefore can't be matched with regular expressions. You need a parser for this kind of job. Sorry.
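
For comparison, the non-regex route can be as small as a hand-written scanner that tracks brace depth. The following is a rough sketch rather than a real parser: BraceScanner and SplitTopLevelBlocks are hypothetical names, and the code assumes every brace in the input is a real block delimiter (none hidden in string literals or comments).

```csharp
using System;
using System.Collections.Generic;

public static class BraceScanner
{
    // Splits the source into top-level declarations, each ending at the
    // closing brace that brings the nesting depth back to zero.
    public static List<string> SplitTopLevelBlocks(string source)
    {
        var blocks = new List<string>();
        int depth = 0;   // current brace nesting level
        int start = 0;   // index where the current declaration begins

        for (int i = 0; i < source.Length; i++)
        {
            char c = source[i];
            if (c == '{')
            {
                depth++;
            }
            else if (c == '}' && depth > 0)
            {
                depth--;
                if (depth == 0)
                {
                    // Outermost block just closed: emit the declaration.
                    blocks.Add(source.Substring(start, i - start + 1).Trim());
                    start = i + 1;
                }
            }
        }
        return blocks;
    }

    public static void Main()
    {
        string source =
            "public string A()\n{\n    if (x) { return \"a\"; }\n    return \"b\";\n}\n" +
            "public string B() { return \"c\"; }";

        foreach (string block in SplitTopLevelBlocks(source))
        {
            Console.WriteLine(block);
            Console.WriteLine("----");
        }
    }
}
```

A proper solution would tokenize string literals and comments first, which is the point where a parser becomes the safer choice.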