Named group in regular expression match


I'm trying to parse some source files for some standard information.

The source files could look like this:

// Name: BoltBait
// Title: Some cool thing

or

// Name  :
// Title : Another thing

or

// Title:
// Name:

etc.

The code I'm using to parse for the information looks like this:

Regex REName = new Regex(@"\/{2}\s*Name\s*:\s*(?<nlabel>.*)\n", RegexOptions.IgnoreCase);
Match mname = REName.Match(ScriptText); // entire source code file
if (mname.Success)
{
    Name.Text = mname.Groups["nlabel"].Value.Trim();
}

This works fine if the field has a value, but it doesn't work if the field is left blank.

For example, in the third example above, the Title field returns a match of "// Name:" and I want it to return the empty string.

I need help from a regex expert.

I thought the regex was too greedy, so I tried the following expression:

@"\/{2}\s*Name\s*:\s*(?<nlabel>.*?)\n"

However, it didn't help.


There are 3 answers

Wiktor Stribiżew On BEST ANSWER

You can also use character class subtraction to avoid matching newline characters:

//[\s-[\r\n]]*Name[\s-[\r\n]]*:[\s-[\r\n]]*(?<nlabel>.*)(?=\r?\n|$)

Note that:

  • [\s-[\r\n]]* - Matches any whitespace except carriage returns and line feeds (using .NET character class subtraction)
  • (?=\r?\n|$) - A positive lookahead that requires a line break or the end of the string to follow, without consuming it.
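
As a minimal sketch of how this pattern could be dropped into the asker's code, the snippet below runs it against the three sample headers from the question. The class name HeaderFieldDemo and the helper TryExtractName are hypothetical names chosen for illustration; only the regex itself comes from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HeaderFieldDemo
{
    // Pattern from the answer: only horizontal whitespace is allowed around
    // the label and colon, and the lookahead stops the value at the line break.
    private static readonly Regex NameRegex = new Regex(
        @"//[\s-[\r\n]]*Name[\s-[\r\n]]*:[\s-[\r\n]]*(?<nlabel>.*)(?=\r?\n|$)",
        RegexOptions.IgnoreCase);

    // Hypothetical helper (not part of the original post).
    // Returns false when the text has no Name line at all.
    public static bool TryExtractName(string scriptText, out string name)
    {
        Match m = NameRegex.Match(scriptText);
        // Trim() also removes the trailing \r that .* picks up on CRLF input.
        name = m.Success ? m.Groups["nlabel"].Value.Trim() : string.Empty;
        return m.Success;
    }

    public static void Main()
    {
        string[] samples =
        {
            "// Name: BoltBait\n// Title: Some cool thing\n",
            "// Name  :\n// Title : Another thing\n",
            "// Title:\n// Name:\n",
        };

        foreach (string sample in samples)
        {
            bool found = TryExtractName(sample, out string name);
            Console.WriteLine(found + " |" + name + "|");
        }
        // Prints:
        // True |BoltBait|
        // True ||
        // True ||
    }
}
```

Note that character class subtraction is specific to the .NET regex flavor, so this pattern may not port to other engines unchanged.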


KekuSemau On

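\s also matches line breaks, which is not wanted here: when a field is empty, the \s* after the colon runs past the end of the line and the group captures the next line instead. It should suffice to match only tabs and spaces after the colon: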

\/{2}\s*Name\s*:[\t ]*(?<nlabel>.*?)\n

This returns the empty string correctly in your third example (for both name and title).
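
To make the difference concrete, here is a small sketch that runs both variants against the third example from the question. The class FieldParser, the method Extract, and the horizontalOnly switch are hypothetical names introduced for this illustration; the two patterns follow the question (with the lazy quantifier) and this answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FieldParser
{
    // Hypothetical helper (not part of the original post).
    // horizontalOnly = false reproduces the question's \s* after the colon;
    // horizontalOnly = true uses [\t ]* as suggested in this answer.
    // Returns the empty string both for a blank field and for a missing field.
    public static string Extract(string scriptText, string label, bool horizontalOnly)
    {
        string afterColon = horizontalOnly ? @"[\t ]*" : @"\s*";
        string pattern = @"\/{2}\s*" + Regex.Escape(label) + @"\s*:"
                         + afterColon + @"(?<value>.*?)\n";
        Match m = Regex.Match(scriptText, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups["value"].Value.Trim() : string.Empty;
    }

    public static void Main()
    {
        string text = "// Title:\n// Name:\n";

        // \s* swallows the line break, so the group captures the next line.
        Console.WriteLine("|" + Extract(text, "Title", false) + "|"); // |// Name:|

        // [\t ]* stays on the same line, so the group is empty.
        Console.WriteLine("|" + Extract(text, "Title", true) + "|");  // ||
    }
}
```

The first call also shows why switching to .*? did not help: the newline is consumed by \s* before the capturing group starts, so laziness inside the group changes nothing.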

Darryl On

My approach is to use an alternation in a non-capturing group to match the value from the colon to the end of the line. With RegexOptions.Multiline, ^ and $ anchor at line boundaries, so the group matches either everything up to the end of the line, or nothing.

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// The three sample inputs from the question
var text1 = "// Name: BoltBait" + Environment.NewLine + "// Title: Some cool thing" + Environment.NewLine;
var text2 = "// Name  :" + Environment.NewLine + "// Title : Another thing" + Environment.NewLine;
var text3 = "// Title:" + Environment.NewLine + "// Name:" + Environment.NewLine;
var texts = new List<string>() { text1, text2, text3 };

// Multiline makes ^ and $ match at line boundaries, so \n never has to be consumed
var options = RegexOptions.IgnoreCase | RegexOptions.Multiline;
var regex = new Regex("^//\\s*?Name\\s*?:(?<nlabel>(?:.*$|$))", options);

foreach (var text in texts)
{
    var match = regex.Match(text);

    // Trim() also strips the trailing \r that .* captures with Windows line endings
    Console.WriteLine("|" + match.Groups["nlabel"].Value.Trim() + "|");
}

Produces:

|BoltBait|
||
||