JavaScript regex back-reference not populating all capturing groups

Strange one here (or maybe not): I am trying to retrieve two capturing groups with a JavaScript regex. The first group is one or more digits (0-9); the second is one or more word characters or hyphens (a-z, A-Z, 0-9, _, -). For some reason I can never retrieve the second group.

Please note: I have purposely included the alternation (|) character, as I may want to match one group or the other.

This is the code I am using:

var subject = '#/34/test-data';
var myregexp = /#\/(\d+)|\/([\w-]+)/;
var match = myregexp.exec(subject);
if (match != null && match.length > 1) {
  console.log(match[1]); // returns '34' successfully
  console.log(match[2]); // undefined? should return 'test-data'
}

Funny thing is, RegexBuddy tells me I do have two capturing groups and highlights them correctly on the test phrase.

Is this a problem in my JavaScript syntax?

There are 2 answers

Michael Low On BEST ANSWER

If you change:

var myregexp = /#\/(\d+)|\/([\w-]+)/;

by removing the | alternation meta-character to just:

var myregexp = /#\/(\d+)\/([\w-]+)/;

it will then match both groups. At present, the | splits the whole pattern into two alternatives, #\/(\d+) or \/([\w-]+), so exec() returns as soon as the first alternative matches and the second group is left undefined. Without the |, the pattern requires \d+ followed by /, followed by [\w-]+, so it will always capture either both groups or nothing.
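To see that the | really does divide the pattern into two separate alternatives, here is a minimal sketch that runs the question's original regex with the g flag added (the flag and the variable names are my additions, not part of the question). Each match fills only one of the two groups:

```javascript
// Sketch: the top-level | splits the whole pattern into two alternatives,
// so a single match can fill at most one of the two capturing groups.
const subject = '#/34/test-data';
const alternation = /#\/(\d+)|\/([\w-]+)/g; // question's pattern plus the g flag

// Collect [group 1, group 2] for every match found in the subject.
const groupsPerMatch = Array.from(
  subject.matchAll(alternation),
  (m) => [m[1], m[2]]
);

console.log(groupsPerMatch);
// first match:  group 1 is '34', group 2 is undefined
// second match: group 1 is undefined, group 2 is 'test-data'
```

So the data is reachable with the alternation, but only across two separate matches, never in one.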

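Edit: To match any of #/34/test-data, #/test-data or #/34, you can use #(?:\/(\d+))?\/([\w-]+) instead. Note that for #/34 the digits end up in the second group rather than the first, because the regex backtracks and skips the optional first part so that the mandatory second part can match.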
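A quick way to check the optional-group pattern against all three inputs is sketched below. The parseHash helper and its id/name property names are hypothetical, chosen only for illustration; pay attention to which group each value lands in.

```javascript
// Sketch: run the optional-group pattern against the three example inputs.
const flexible = /#(?:\/(\d+))?\/([\w-]+)/;

// Hypothetical helper: returns both captures, or null when nothing matches.
function parseHash(hash) {
  const m = flexible.exec(hash);
  return m === null ? null : { id: m[1], name: m[2] };
}

console.log(parseHash('#/34/test-data')); // id: '34',      name: 'test-data'
console.log(parseHash('#/test-data'));    // id: undefined, name: 'test-data'
console.log(parseHash('#/34'));           // id: undefined, name: '34'
console.log(parseHash('#'));              // null
```

If you need the digits to be reported separately in the #/34 case as well, you would have to inspect the second capture afterwards (for example, test it against /^\d+$/).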

Deleteman On

If you remove the "|" you get the result you want... does this help?

var subject = '#/34/test-data';
var myregexp = /#\/(\d+)\/([\w-]+)/;
var match = myregexp.exec(subject);
if (match != null && match.length > 1) {
  console.log(match[1]); // returns '34'
  console.log(match[2]); // now returns 'test-data'
}

Happy coding!

Edit

I think your problem was that, by using "|", you were telling JS to match either the first alternative or the second one. The regex engine tries the alternatives from left to right and exec() returns the first match it finds, so once the first group matched, it stopped there. By removing the OR operator from the RegExp, both groups must match (something like an AND), so you get both results.