String line = "This order was placed for QT3000! OK?";
String pattern = "(.*)(\\d+)(.*)";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher m = r.matcher(line);
if (m.find()) {
    System.out.println("Found value: " + m.group(1));
    System.out.println("Found value: " + m.group(2));
    System.out.println("Found value: " + m.group(3));
}
The output is:
Found value: This order was placed for QT300
Found value: 0
Found value: ! OK?
However, I was expecting this output:
Found value: This order was placed for QT3000! OK?
Found value: 3000
Found value: This order was placed for QT3000! OK?
The reason for my expected output is:
If the pattern is "(.*)", m.group(1) is "This order was placed for QT3000! OK?"
If the pattern is "(\\d+)", m.group(1) is "3000"
Why don't I get the expected output when the pattern is "(.*)(\\d+)(.*)"?
It is due to the first (.*) being too greedy: it eats up as much as possible while still allowing (\d+)(.*) to match the rest of the string.

Basically, the match goes like this. At the beginning, the first .* gobbles up the whole string. Since \d+ cannot match at the end of the string, the engine backtracks, making the first .* give back one character at a time. Once it has given back "0! OK?", \d+ can match the single "0" at that position, so we proceed, and the last .* matches the rest of the string, "! OK?". That's the explanation for the output you see.
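The backtracking described above can be checked with a small program. This is only a minimal sketch: the class name GreedyDemo and the helper groups are hypothetical names chosen for illustration, and the helper assumes a pattern with exactly three capturing groups.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Hypothetical helper: returns groups 1..3 of the first match,
    // or null if the pattern does not match at all.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        // Greedy first group: it keeps everything except what \d+ minimally needs.
        String[] g = groups("(.*)(\\d+)(.*)", line);
        for (int i = 0; i < g.length; i++) {
            System.out.println("group " + (i + 1) + ": [" + g[i] + "]");
        }
    }
}
```

The brackets in the printed lines make it easy to see exactly where each group starts and ends, including any leading or trailing spaces.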
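You can fix this problem by making the first (.*) lazy: (.*?)(\d+)(.*)

The search for a match for (.*?) begins with the empty string, and as it backtracks, it gradually increases the number of characters it gobbles up. Once it has consumed "This order was placed for QT", \d+ can match "3000" and the last .* can match "! OK?", which finishes the matching attempt. Group 2 is then "3000", as you expected. Groups 1 and 3 hold the text before and after the number rather than the whole line, because the three groups match consecutive, non-overlapping parts of the string.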
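To see the effect of the lazy quantifier, here is a short sketch that runs the lazy pattern on the same input. The class name LazyDemo and the helper capture are hypothetical names used only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LazyDemo {
    // Hypothetical helper: returns groups 1..3 of the first match,
    // or null if the pattern does not match.
    static String[] capture(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String line = "This order was placed for QT3000! OK?";
        // Lazy first group: it stops growing as soon as \d+ can match,
        // so \d+ (still greedy) takes the whole run of digits.
        String[] g = capture("(.*?)(\\d+)(.*)", line);
        System.out.println("Found value: " + g[0]);
        System.out.println("Found value: " + g[1]);
        System.out.println("Found value: " + g[2]);
    }
}
```

If only the number is needed, a pattern of just (\\d+) with find() would also work, since find() scans for the first position where the pattern matches.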