Why does the following regex return 101 instead of 1001?
console.log(new RegExp(/1(0+)1/).exec('101001')[0]);
I thought that + was greedy, so the longer of the two matches should be returned.
IMO this is different from "Using javascript regexp to find the first AND longest match" because I don't care about the first match, just the longest. Can someone correct my definition of greedy? For example, what is the difference between the above snippet and the classic "oops, too greedy" example of new RegExp(/<(.+)>/).exec('<b>a</b>')[1] giving b>a</b?
(Note: This seems to be language-agnostic (it also happens in Perl), but just for ease of running it in-browser I've used JavaScript here.)
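Greedy means that, from the position where the match starts, the quantifier extends up to the rightmost occurrence that still lets the rest of the pattern match. It never means the longest match in the input string.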
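To make the distinction concrete, here is a small sketch comparing a greedy and a lazy quantifier on the "too greedy" example from the question, followed by the original pattern. The variable names are only illustrative.

```javascript
// Greedy: from the leftmost '<', .+ extends as far right as it can
// while still leaving a '>' for the end of the pattern.
const greedy = /<(.+)>/.exec('<b>a</b>');
console.log(greedy[0]); // <b>a</b>
console.log(greedy[1]); // b>a</b

// Lazy: same leftmost start, but .+? stops at the first '>' that works.
const lazy = /<(.+?)>/.exec('<b>a</b>');
console.log(lazy[1]); // b

// The engine reports the first position where a match succeeds.
// Greediness only decides how far that one match extends.
const first = /1(0+)1/.exec('101001');
console.log(first.index, first[0]); // 0 101
```

In the last case 0+ is still greedy, but starting at index 0 there is only one 0 before the next 1, so 101 is all it can take.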
Regex by itself is not the right tool for extracting the longest match. You can collect all the substrings that match your pattern and then pick the longest one using language-specific means.
Since the string is parsed from left to right, 101 will get matched in 101001 first, and the rest (001) will not match (as the 101 and 1001 matches are overlapping). You might use /(?=(10+1))./g and then check the length of each Group 1 value to get the longest one.
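The lookahead approach could be sketched as follows. This assumes an engine that supports String.prototype.matchAll (ES2020), and longestMatch is a hypothetical helper name, not something from the question.

```javascript
// Hypothetical helper: returns the longest (possibly overlapping) substring
// matching 1, one or more 0s, 1, or null if there is none.
function longestMatch(input) {
  // The lookahead captures a candidate in Group 1 without consuming it;
  // the trailing . consumes one character so the scan moves forward and
  // overlapping candidates such as 101 and 1001 are both visited.
  const re = /(?=(10+1))./g;
  let longest = null;
  for (const m of input.matchAll(re)) {
    if (longest === null || m[1].length > longest.length) {
      longest = m[1];
    }
  }
  return longest;
}

console.log(longestMatch('101001')); // 1001
```

If several candidates share the maximum length, this version keeps the leftmost one because of the strict > comparison.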