How to get the matching word in a regex with alternations?


In Python, suppose I want to search the string

"123" 

for occurrences of the pattern

"abc|1.*|def|.23".

I would currently do this as follows:

import re
re.match("abc|1.*|def|.23", "123")

The above returns a match object from which I can retrieve the starting and ending indices of the match in the string, which in this case would be 0 and 3.
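
For reference, a minimal sketch of how those indices can be read off the match object, using only the pattern and string from the question (the variable name `m` is just for illustration):

```python
import re

m = re.match("abc|1.*|def|.23", "123")

# start() and end() give the bounds of the overall match,
# but say nothing about which alternative produced it.
print(m.start(), m.end())  # 0 3
```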

My question is: how can I find out which alternative(s) in the regular expression matched

"123"?

In other words: I would like to get "1.*" and ".23". Is this possible?


There are 2 answers

JaySabir On

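Provided the alternatives in your pattern are always separated by a top-level "|" (as they are here), you can split the pattern on that separator and try each alternative on its own: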

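import re

# Named `pattern` rather than `str`, which would shadow the built-in type
pattern = "abc|1.*|def|.23"

# Keep every alternative that matches at the start of "123"
matches = [alt for alt in pattern.split("|") if re.match(alt, "123")]
print(matches)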

output:

['1.*', '.23']
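
Splitting on "|" would go wrong if an alternative itself contained a "|" (inside a group or a character class, for instance). One way around that, sketched here under the assumption that you can keep the alternatives as a list in the first place, is to skip the splitting step entirely. The helper name `matching_alternatives` is made up for this example:

```python
import re

def matching_alternatives(alternatives, text):
    """Return every alternative that matches at the start of `text`."""
    # re.match anchors at the beginning of the string, as in the question;
    # swap in re.fullmatch if the whole string has to be consumed.
    return [alt for alt in alternatives if re.match(alt, text)]

alternatives = ["abc", "1.*", "def", ".23"]
print(matching_alternatives(alternatives, "123"))  # ['1.*', '.23']

# The combined pattern can still be rebuilt from the list when needed.
combined = "|".join(alternatives)
```

This tests the alternatives one at a time, so it reports all of them that match, not only the first one the regex engine would pick.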
Cary Swoveland On

Another approach would be to create one capture group for each token in the alternation:

import re

s = 'def'
rgx = r'\b(?:(abc)|(1.*)|(def)|(.23))\b'

m = re.match(rgx, s)
print(m.group(0)) #=> def
print(m.group(1)) #=> None
print(m.group(2)) #=> None
print(m.group(3)) #=> def
print(m.group(4)) #=> None

This example shows that the match is 'def' and that it was produced by the third capture group, (def).
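
Instead of checking each group by hand, the match object's `lastindex` attribute can tell you which group took part. The following is only a sketch: it assumes the alternatives contain no capture groups of their own (otherwise the group numbering shifts), it leaves out the `\b` anchors for simplicity, and the function name `first_matching_alternative` is hypothetical:

```python
import re

alternatives = ["abc", "1.*", "def", ".23"]

# Wrap each alternative in its own capture group: (abc)|(1.*)|(def)|(.23)
pattern = re.compile("|".join(f"({alt})" for alt in alternatives))

def first_matching_alternative(text):
    m = pattern.match(text)
    if m is None:
        return None
    # lastindex is the number of the last capture group that matched;
    # with one flat group per alternative, that is the branch that won.
    return alternatives[m.lastindex - 1]

print(first_matching_alternative("def"))  # def
print(first_matching_alternative("123"))  # 1.*
```

Note that alternation is tried left to right and stops at the first branch that succeeds, so for "123" this reports only "1.*" and never ".23". If you need every alternative that could match, each one has to be tested separately.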
