Regex capturing from a non-capturing group in Ruby


I am trying to fix a regex I use in a ChatOps bot built on Lita. I have the following regex:

/^(?:how\s+do\s+I\s+you\s+get\s+far\s+is\s+it\s+from\s+)?(.+)\s+to\s+(.+)/i

This is supposed to capture the words before and after 'to', with optional words in front that can form questions like: How do I get from x to y, how far from x to y, how far is it from x to y.

Expected output:

match 1 : "x"
match 2 : "y"

For the most part my optional words work as expected. But when I pull my response matches, I get the words leading up to the first capture group included.

So, "how far is it from sfo to lax" should return:

"sfo" and "lax".

But instead it returns:

"how far is it from sfo" and "lax".
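
For reference, the behaviour can be reproduced outside Lita with plain Ruby. This is only a minimal sketch: the constant name ROUTE and the direct call to match on a string are assumptions for illustration, not how Lita itself dispatches routes.

```ruby
# Pattern copied from the question; ROUTE is a hypothetical constant name.
ROUTE = /^(?:how\s+do\s+I\s+you\s+get\s+far\s+is\s+it\s+from\s+)?(.+)\s+to\s+(.+)/i

m = ROUTE.match("how far is it from sfo to lax")
origin, destination = m.captures

puts origin       # prints "how far is it from sfo"
puts destination  # prints "lax"
```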


There are 2 answers

joelparkerhenderson On BEST ANSWER

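Your problem is that the optional group at the start of your regex is one fixed sequence, not a set of alternatives: it matches only the literal phrase "how do I you get far is it from". No real question contains that phrase, so the group is skipped and the greedy (.+) captures the leading words too.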

To choose from multiple options, use this syntax:

(a|b|c)

What I think you're trying to do is this:

/^(?:(?:how|do|I|you|get|far|is|it|from)\s+)*(.+)\s+to\s+(.+)/i

This regexp skips any run of the listed words at the start of the input, in any order and any number of times, before the first capture group begins.
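
As a quick check, the pattern above can be tried against the sentences from the question. This is a sketch under assumptions: SKIP_WORDS_ROUTE and parse_route are hypothetical names introduced here, and the input is matched as a plain string rather than through a Lita route.

```ruby
# Pattern from the answer; the constant and method names are hypothetical.
SKIP_WORDS_ROUTE = /^(?:(?:how|do|I|you|get|far|is|it|from)\s+)*(.+)\s+to\s+(.+)/i

# Returns [origin, destination], or nil when the text does not match.
def parse_route(text)
  m = SKIP_WORDS_ROUTE.match(text)
  m && m.captures
end

p parse_route("how far is it from sfo to lax")  # => ["sfo", "lax"]
p parse_route("How do I get from x to y")       # => ["x", "y"]
p parse_route("sfo to lax")                     # => ["sfo", "lax"]
```

One consequence of skipping words regardless of order: an origin that itself starts with a listed word loses it, so "far rockaway to jfk" yields "rockaway" and "jfk".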

If you want to preserve word order, you can use regexps such as this pseudocode:

… how (can|do|will) (I|you|we) (get|go|travel) from …
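
One way the order-preserving idea might look in real Ruby is sketched below. The names PREFIX, ORDERED_ROUTE and ordered_route are hypothetical, and the "how far (is it) from" branch is an assumption taken from the example questions, not part of the pseudocode above.

```ruby
# Assumed phrasings (hypothetical, based on the examples in the question):
#   "how (can|do|will) (I|you|we) (get|go|travel) from ..."
#   "how far from ..." and "how far is it from ..."
PREFIX = /how\s+(?:(?:can|do|will)\s+(?:I|you|we)\s+(?:get|go|travel)|far(?:\s+is\s+it)?)\s+from\s+/i

# The whole prefix is optional, so a bare "x to y" still matches.
ORDERED_ROUTE = /^(?:#{PREFIX})?(.+)\s+to\s+(.+)/i

# Returns [origin, destination], or nil when the text does not match.
def ordered_route(text)
  m = ORDERED_ROUTE.match(text)
  m && m.captures
end

p ordered_route("How do I get from x to y")       # => ["x", "y"]
p ordered_route("how far is it from sfo to lax")  # => ["sfo", "lax"]
p ordered_route("get how from x to y")            # => ["get how from x", "y"]
```

The last call shows the trade-off: words in the wrong order are no longer skipped, so they end up in the first capture.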
Wiktor Stribiżew On

When you want to match words, \w+ is the most natural pattern I'd use (it is used, for example, in word-count tools).

Capturing one word before and one word after "to" as a single match can be done with the regex (\w+\s+to\s+\w+).

To return them as 2 different groups, you can use (\w+)\s+to\s+(\w+).
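
A short sketch of the two-group pattern in Ruby follows; WORD_ROUTE is a hypothetical constant name. Because the pattern is not anchored to the start of the string, the leading question words do not need to be listed at all.

```ruby
# Two-group pattern from the answer; WORD_ROUTE is a hypothetical name.
WORD_ROUTE = /(\w+)\s+to\s+(\w+)/

p "how far is it from sfo to lax".match(WORD_ROUTE).captures  # => ["sfo", "lax"]

# Each group holds exactly one word, so multi-word places are cut short.
p "san francisco to los angeles".match(WORD_ROUTE).captures   # => ["francisco", "los"]
```

The second call shows the limitation: this approach suits single-word names such as airport codes, but not place names containing spaces.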
