Regex and capturing parentheses

I'm trying to understand how capturing parentheses work in regex, but I don't get it...

My code was:

    Pattern pattern = Pattern.compile("ab");
    Matcher m = pattern.matcher("abc");

    while (m.find()) {
        for (int i = 0; i < m.groupCount(); i++) {
            System.out.println(m.group(i));
        }
    }

This printed nothing at all. My understanding was that I need capturing parentheses to remember the matched text and display it.

So I did:

    Pattern pattern = Pattern.compile("(ab)");
    Matcher m = pattern.matcher("abc");

And I got the expected output: ab

Then I wanted to remember and display only part of the match, so I did:

    Pattern pattern = Pattern.compile("(a)b");
    Matcher m = pattern.matcher("abc");

I was expecting to get a, but I got ab.

Why?

There is 1 answer

M A On

From the Javadoc of Matcher#group(int):

Capturing groups are indexed from left to right, starting at one. Group zero denotes the entire pattern, so the expression m.group(0) is equivalent to m.group().

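The problem is that group zero (i.e. m.group(0)) is the entire match, not the part inside the capturing parentheses, and groupCount() does not include group zero. With "ab" the group count is 0, so your loop body never runs; with "(ab)" or "(a)b" it is 1, so the loop runs once with i = 0 and prints the whole match, ab. To print the captured groups, start at index 1 and go up to and including the group count: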

for (int i = 1; i <= m.groupCount(); i++) {
    System.out.println(m.group(i));
}
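
To check this against the three patterns from the question, here is a minimal, self-contained sketch that applies the corrected loop to each of them. The class name GroupDemo and the helper captured are made up for illustration; only Pattern and Matcher come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {

    // Hypothetical helper: collects the text of every capturing group
    // (indices 1..groupCount) for each match of regex in input.
    static List<String> captured(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> result = new ArrayList<>();
        while (m.find()) {
            // Group 0 is the whole match, so start at 1 and include groupCount().
            for (int i = 1; i <= m.groupCount(); i++) {
                result.add(m.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(captured("ab", "abc"));   // prints [] (no capturing groups)
        System.out.println(captured("(ab)", "abc")); // prints [ab]
        System.out.println(captured("(a)b", "abc")); // prints [a]
    }
}
```

If you also want the whole match for a pattern without parentheses, call m.group() (or m.group(0)) inside the while loop instead of relying on the group loop.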

In your case, there is only one group. But with something like Pattern.compile("(a)(b)"), (a) would be group 1 and (b) would be group 2.
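
As a rough illustration of the numbering with two groups, the sketch below lists group 0 together with each capturing group for the first match. The class name TwoGroupsDemo and the helper describe are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TwoGroupsDemo {

    // Hypothetical helper: formats "index=text" for group 0 and every
    // capturing group of the first match, or "no match" if there is none.
    static String describe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "no match";
        }
        StringBuilder sb = new StringBuilder();
        // Starting at 0 on purpose here, to show the whole match next to the groups.
        for (int i = 0; i <= m.groupCount(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(i).append('=').append(m.group(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("(a)(b)", "abc")); // prints 0=ab, 1=a, 2=b
    }
}
```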