import java.util.regex.Matcher;
import java.util.regex.Pattern;

String line = "This order was placed for QT3000! OK?";
String pattern = "(.*)(\\d+)(.*)";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create a Matcher object for the input line
Matcher m = r.matcher(line);
if (m.find()) {
    // Print each of the three capturing groups
    System.out.println("Found value: " + m.group(1));
    System.out.println("Found value: " + m.group(2));
    System.out.println("Found value: " + m.group(3));
}
The output is:
Found value: This order was placed for QT300
Found value: 0
Found value: ! OK?
However, I was expecting the output to be:
Found value: This order was placed for QT3000! OK?
Found value: 3000
Found value: This order was placed for QT3000! OK?
The reason for my expected output is:
If the pattern is "(.*)", m.group(1) is "This order was placed for QT3000! OK?"
If the pattern is "(\\d+)", m.group(1) is "3000"
Why am I not getting the expected output when the pattern is "(.*)(\\d+)(.*)"?
This happens because the first (.*) is greedy: it consumes as much as possible while still allowing (\d+)(.*) to match the rest of the string.

The match goes like this. At the beginning, the first .* gobbles up the whole string, "This order was placed for QT3000! OK?". Nothing is left for \d+ to match at that position, so the engine backtracks, giving back one character at a time from the end of the first group. Once the first .* has shrunk to "This order was placed for QT300", \d+ can match the final "0", so we proceed, and the last .* matches the rest of the string, "! OK?". That's the explanation for the output you see.
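To see the greedy behavior described above in isolation, here is a minimal sketch. The class name GreedyDemo and the helper groups are made up for illustration; only Pattern and Matcher from java.util.regex are real API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Hypothetical helper: returns the three captured groups,
    // or null when the pattern does not match the input.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        // Greedy first group: it keeps everything except the single
        // digit that \d+ needs in order to match.
        String[] g = groups("(.*)(\\d+)(.*)", line);
        for (int i = 0; i < g.length; i++) {
            System.out.println("group " + (i + 1) + ": [" + g[i] + "]");
        }
    }
}
```

The brackets in the printed lines make it easy to see exactly where each group starts and ends: the first group stops just before the last "0", which is the only character left over for the second group.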
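You can fix this problem by making the first (.*) lazy, that is, by writing it as (.*?).

The search for a match for (.*?) begins with the empty string, and as it backtracks, it gradually increases the number of characters it gobbles up. Once it has grown to "This order was placed for QT", \d+ can match "3000" and the last .* can match the rest, "! OK?", which finishes the matching attempt. m.group(2) is then "3000", as you expected. Note that groups 1 and 3 hold the text before and after the number rather than the whole line, because the three groups match consecutive parts of the string.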
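A minimal sketch of the lazy variant, assuming the same input line as in the question. The class name LazyDemo and the helper groups are hypothetical names used only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyDemo {
    // Hypothetical helper: returns the three captured groups,
    // or null when the pattern does not match the input.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        // Lazy first group: it grows only until \d+ can match,
        // so \d+ (still greedy) captures all four digits.
        String[] g = groups("(.*?)(\\d+)(.*)", line);
        for (int i = 0; i < g.length; i++) {
            System.out.println("group " + (i + 1) + ": [" + g[i] + "]");
        }
    }
}
```

If only the number is needed, a simpler pattern such as (\d+) on its own with m.find() would avoid the greedy versus lazy question entirely.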