I need to grab the number from strings like these:
- "prefix.[10].suffix"
- "prefix.10.suffix"
My first intuition was to do this:
\w+\.((\[(?<number>\d+)\])|(?<number>\d+))\.\w+
But this fails with an error: "? A subpattern name must be unique".
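Your regex is fine, but you can't have two capturing groups that are both named 'number'. I know you want to grab 'whichever one matched', but that's not how Java's regex engine works: you have to fetch one group and, if that is null, fetch the other instead. Name them what you want (as long as the names differ), or don't name them. Keep in mind that the wrapping parentheses in your pattern are capturing groups too, so they shift the numbering; making the wrapper non-capturing with (?:...) leaves the bracketed digits as group 1 and the bare digits as group 2:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// (?:...) does not capture, so only the two (\d+) groups are numbered.
Pattern p = Pattern.compile("\\w+\\.(?:\\[(\\d+)\\]|(\\d+))\\.\\w+");
Matcher m = p.matcher("prefix.[10].suffix");
if (m.matches()) {
    String raw = m.group(1);           // digits inside [ ], or null
    if (raw == null) raw = m.group(2); // bare digits
    return Integer.parseInt(raw);
}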
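If you would rather keep names, here is a minimal sketch of the same idea with two distinct group names. The class name NamedGroupExtractor, the group names bracketed and plain, and the -1 return value for non-matching input are all made up for illustration, not part of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupExtractor {
    // Two distinctly named groups (hypothetical names); only one of them
    // participates in any given match, the other one returns null.
    private static final Pattern P = Pattern.compile(
        "\\w+\\.(?:\\[(?<bracketed>\\d+)\\]|(?<plain>\\d+))\\.\\w+");

    /** Returns the number, or -1 if the input does not have the expected shape. */
    static int extract(String input) {
        Matcher m = P.matcher(input);
        if (!m.matches()) {
            return -1;
        }
        String raw = m.group("bracketed");
        if (raw == null) {
            raw = m.group("plain");
        }
        return Integer.parseInt(raw);
    }

    public static void main(String[] args) {
        System.out.println(extract("prefix.[10].suffix")); // prints 10
        System.out.println(extract("prefix.10.suffix"));   // prints 10
    }
}
```

The null check is still needed; distinct names only make it more readable than bare group numbers.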
If you are only interested in the number, you don't need the + quantifier on \w+; a single \w suffices.
Java supports lookaround assertions, which let you use an alternation for both scenarios and get the number as the whole match, without using a group.
(?<=\w\.)\d+(?=\.\w)|(?<=\w\.\[)\d+(?=]\.\w)
The pattern matches:
- (?<=\w\.)\d+(?=\.\w) Assert a word char followed by a dot to the left, match 1+ digits, then assert a dot and a word char to the right
- | Or
- (?<=\w\.\[)\d+(?=]\.\w) Same as the first part, but now with [ and ] in the assertions
In Java with the doubled escapes:
String regex = "(?<=\\w\\.)\\d+(?=\\.\\w)|(?<=\\w\\.\\[)\\d+(?=]\\.\\w)";
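A minimal sketch of how that string might be used, assuming you want the first number found in the input. The class name LookaroundExtractor and the method firstNumber are hypothetical. Because the lookarounds consume nothing, the whole match (group() with no argument) is the number itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundExtractor {
    private static final Pattern P = Pattern.compile(
        "(?<=\\w\\.)\\d+(?=\\.\\w)|(?<=\\w\\.\\[)\\d+(?=]\\.\\w)");

    /** Returns the first matching number as text, or null if there is none. */
    static String firstNumber(String input) {
        Matcher m = P.matcher(input);
        // find() rather than matches(): the pattern describes only the digits,
        // not the whole input string.
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("prefix.[10].suffix")); // prints 10
        System.out.println(firstNumber("prefix.10.suffix"));   // prints 10
    }
}
```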
(\.\[*(.*?)\]*\.) will match .[10]. or .10. There will be two groups in the match: group 1 will be .[10]. and group 2 will be 10, so you can get your required value from group 2. The quantifier inside the group has to be lazy (.*?); a greedy (.*) would also swallow the closing ] and capture 10] for the bracketed input.
- \. expects a dot
- \[* expects 0 or more of [
- (.*?) matches any characters, as few as possible
- \]* expects 0 or more of ]
- \. expects a dot
Basically, the (.*?) group captures anything between .[ and ]. where [ and ] are optional. You can replace [ and ] with any character set according to your requirement.
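To check what group 2 really captures, this small sketch runs the pattern with a greedy (.*) and with a lazy (.*?) on both inputs. The class name GreedyVsLazy and the helper group2 are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsLazy {
    static final String GREEDY = "(\\.\\[*(.*)\\]*\\.)";
    static final String LAZY = "(\\.\\[*(.*?)\\]*\\.)";

    /** Returns group 2 of the first match of regex in input, or null if none. */
    static String group2(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // Greedy .* backtracks only as far as it must, so it keeps the ']'.
        System.out.println(group2(GREEDY, "prefix.[10].suffix")); // prints 10]
        // Lazy .*? stops as soon as the rest of the pattern can match.
        System.out.println(group2(LAZY, "prefix.[10].suffix"));   // prints 10
        System.out.println(group2(GREEDY, "prefix.10.suffix"));   // prints 10
        System.out.println(group2(LAZY, "prefix.10.suffix"));     // prints 10
    }
}
```

If only digits are expected between the dots, (\d+) in place of (.*?) is stricter and avoids the issue altogether.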
For example if there can be any character except digits in place of [ and ]