
How to match a number with a group possibly in square brackets?


I need to grab the number from strings like these:

  • "prefix.[10].suffix"
  • "prefix.10.suffix"

My first intuition was to do this:

\w+\.((\[(?<number>\d+)\])|(?<number>\d+))\.\w+

But this fails with an error: "? A subpattern name must be unique".

regex101 link


There are 3 answers

2
Shubhankar Kumar On

(\.\[*(.*?)\]*\.) will match either .[10]. or .10.

There will be two groups in the match: group 1 is .[10]. and group 2 is 10, so the value you need is in group 2.

\. - Expects a dot

\[* - Expects zero or more [

(.*?) - Matches any characters, but as few as possible, so that a closing ] is left out of the group

\]* - Expects zero or more ]

\. - Expects a dot

Basically, the (.*?) group captures anything between .[ and ]., where [ and ] are optional. You can replace [ and ] with any character set to suit your requirements.

For example, if any non-digit characters can appear in place of [ and ]:

(\.[^0-9]*([0-9]+)[^0-9]*\.)
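In Java, the digit-based pattern above could be used roughly as follows. This is only a sketch: the class name DigitsBetweenDots and the helper extract are made up for illustration, and returning null for "no match" is an assumption about what the caller wants.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DigitsBetweenDots {
    // Group 1 is the whole ".[10]." chunk; group 2 is the run of digits inside it.
    private static final Pattern P =
        Pattern.compile("(\\.[^0-9]*([0-9]+)[^0-9]*\\.)");

    /** Returns the digits as a string, or null when nothing matches. */
    static String extract(String input) {
        Matcher m = P.matcher(input);
        // find() searches anywhere in the input, so the prefix and suffix
        // do not need to be part of the pattern.
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("prefix.[10].suffix"));
        System.out.println(extract("prefix.10.suffix"));
    }
}
```

Because the pattern accepts any non-digit characters around the number, it is more permissive than the question requires; something like prefix.(10).suffix would match as well.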
1
rzwitserloot On

Your regex is fine, but you can't have two capturing groups that are both named 'number'. I know you want to grab whichever one matched, but that's not how named groups work in Java: you have to fetch group 1 and, if that is null, fetch group 2 instead. Name them what you want, or don't name them:

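import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The (?:...) wrapper is non-capturing, so the numbering stays simple:
// group 1 = number inside brackets, group 2 = bare number.
Pattern p = Pattern.compile("\\w+\\.(?:\\[(\\d+)\\]|(\\d+))\\.\\w+");

Matcher m = p.matcher("prefix.[10].suffix");
if (m.matches()) {
  String raw = m.group(1);           // null when the bracketed branch did not match
  if (raw == null) raw = m.group(2);
  return Integer.parseInt(raw);
}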
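If you would rather keep names, one option is to give each branch its own name and fall back from one to the other. The sketch below assumes the group names bracketed and bare, a class called NamedGroupFallback, and -1 as the "no match" result; none of these come from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NamedGroupFallback {
    // Each alternative gets a distinct group name, since Java rejects duplicates.
    private static final Pattern P = Pattern.compile(
        "\\w+\\.(?:\\[(?<bracketed>\\d+)\\]|(?<bare>\\d+))\\.\\w+");

    /** Returns the number, or -1 when the input does not match at all. */
    static int extract(String input) {
        Matcher m = P.matcher(input);
        if (!m.matches()) return -1;
        // A named group that did not take part in the match yields null.
        String raw = m.group("bracketed");
        if (raw == null) raw = m.group("bare");
        return Integer.parseInt(raw);
    }

    public static void main(String[] args) {
        System.out.println(extract("prefix.[10].suffix"));
        System.out.println(extract("prefix.10.suffix"));
    }
}
```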
0
The fourth bird On

If you are only interested in the number, you don't need the + quantifier on \w+; a single \w will also suffice.

Java supports lookaround assertions, which let you use an alternation for both scenarios and get the number as the whole match, without using a group.

(?<=\w\.)\d+(?=\.\w)|(?<=\w\.\[)\d+(?=]\.\w)

The pattern matches:

  • (?<=\w\.)\d+(?=\.\w) Assert a word char followed by a dot to the left, match 1+ digits and then assert a dot and a word char to the right
  • | Or
  • (?<=\w\.\[)\d+(?=]\.\w) Same as the first part, but now with [ and ] in the assertions

Regex demo

In Java with the doubled escapes:

String regex = "(?<=\\w\\.)\\d+(?=\\.\\w)|(?<=\\w\\.\\[)\\d+(?=]\\.\\w)";
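A minimal sketch of how that pattern might be applied, assuming a hypothetical helper class LookaroundNumber that returns null when nothing matches:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookaroundNumber {
    private static final Pattern P = Pattern.compile(
        "(?<=\\w\\.)\\d+(?=\\.\\w)|(?<=\\w\\.\\[)\\d+(?=]\\.\\w)");

    /** Returns the matched digits, or null when neither alternative matches. */
    static String extract(String input) {
        Matcher m = P.matcher(input);
        // The lookarounds consume nothing, so the whole match is just the digits.
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("prefix.[10].suffix"));
        System.out.println(extract("prefix.10.suffix"));
    }
}
```

Since find() is used instead of matches(), the pattern only needs one word character on each side of the number, which is why the + quantifiers can be dropped.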