How to exclude part of an alternative from a capture group?

There is a regex: ((?:description|speed|type|peers)\s+set|classify). How can I exclude \s+set from the capture group?

The captured value must be only description, speed, type, peers, or classify.

It can be done like this:

import re

pattern = r'^\s+"([A-Za-z]+)\.([_A-Za-z0-9-]+)"\s+"([^\s]+)"\s+((description|speed|type|peers)\s+set|classify)\s+"?(.+)"?'
p = re.compile(pattern)
path = 'some_file'
with open(path) as fd:
    for line in fd:
        m = p.search(line)
        if not m:
            continue
        g = m.groups()
        # g[4] is set only for the "<keyword> set" form; for "classify" it is None.
        if g[4]:
            (region, host, interface, cmd, value) = g[0].lower(), g[1].lower(), g[2].lower(), g[4], g[5]
        else:
            (region, host, interface, cmd, value) = g[0].lower(), g[1].lower(), g[2].lower(), g[3], g[5]

But this part is ugly:

if g[4]:
    (region, host, interface, cmd, value) = g[0].lower(), g[1].lower(), g[2].lower(), g[4], g[5]
else:
    (region, host, interface, cmd, value) = g[0].lower(), g[1].lower(), g[2].lower(), g[3], g[5]

How can I drop \s+set within the regex itself and have only one line in the code:

(region, host, interface, cmd, value) = g[0].lower(), g[1].lower(), g[2].lower(), g[3], g[4] ?

There is 1 answer

Answered by Andris Leduskrasts (best answer)

If you don't mind multiple capture groups (and therefore slightly altering the rest of the code), it's super easy - just do the opposite of what you're doing.

(?:(description|speed|type|peers)\s+set|(classify)) as seen in https://regex101.com/r/bR1nV7/1
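A minimal Python sketch of this two-group variant. The sample strings and the helper name extract_cmd are made up for illustration. Only one of the two groups takes part in any match, so "group 1 or group 2" gives the command without the trailing set:

```python
import re

# Group 1 captures the keyword of the "<keyword> set" form, group 2 captures "classify".
CMD_RE = re.compile(r'(?:(description|speed|type|peers)\s+set|(classify))')

def extract_cmd(text):
    """Return the command keyword without the trailing 'set', or None (hypothetical helper)."""
    m = CMD_RE.search(text)
    if not m:
        return None
    # Exactly one of the two groups participates in a match; the other is None.
    return m.group(1) or m.group(2)

print(extract_cmd('speed set 1000'))
print(extract_cmd('classify gold'))
```

In the full pattern from the question this still shifts the group indices by one, so the unpacking line would need the same "or" between the two command groups.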

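If you don't want that, you can use a lookahead: ((?:description|speed|type|peers)(?=\s+set)|classify) as seen in https://regex101.com/r/bR1nV7/2. The lookahead only asserts that \s+set follows and does not consume it, so the rest of the pattern still has to match it.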
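A sketch of how the lookahead variant might fit into the full pattern from the question. The sample lines and the helper name parse_line are assumptions, as is the optional (?:\s+set)? added after the command group to consume the set that the lookahead leaves behind:

```python
import re

# Line pattern from the question, with the lookahead form of the command group.
# (?=\s+set) checks for "set" without consuming it, so (?:\s+set)? consumes it afterwards.
LINE_RE = re.compile(
    r'^\s+"([A-Za-z]+)\.([_A-Za-z0-9-]+)"\s+"([^\s]+)"\s+'
    r'((?:description|speed|type|peers)(?=\s+set)|classify)'
    r'(?:\s+set)?\s+"?(.+)"?'
)

def parse_line(line):
    """Return (region, host, interface, cmd, value), or None if the line does not match."""
    m = LINE_RE.search(line)
    if not m:
        return None
    g = m.groups()
    # Five groups only, so a single unpacking line is enough.
    return g[0].lower(), g[1].lower(), g[2].lower(), g[3], g[4]

# Sample lines are invented to fit the pattern; the real file format is not shown in the question.
print(parse_line('  "EU.router-1" "ge-0/0/1" speed set 1000'))
print(parse_line('  "EU.router-1" "ge-0/0/1" classify gold'))
```

One caveat of this sketch: the optional (?:\s+set)? would also swallow a literal set that follows classify, so it assumes such lines do not occur.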

There is no "exclude this thing" in regex because the other tools like non-capture groups and lookarounds do it for you.