Non-capturing groups in regex not working as expected


The following pattern looks for a / character that is not preceded by a space but is followed by a space, a dot, or the end of the line.

>>> import re
>>> re.search(r"[^ ]/([ .]|$)", "Foo /markup/ bar")                                                                                                                                   
<re.Match object; span=(10, 13), match='p/ '>

I'm interested only in the / and its position. The regex here is simplified as an MWE; in the original I can't just do pos = m.start() + 1 to get the position of /.

I assume non-capturing groups ((?:...)) are the solution, but I can't get them to work. The result I expect would be

<re.Match object; span=(11, 12), match='/'>

What am I doing wrong here?

>>> re.search(r"(?:[^ ])/(?:[ .]|$)", "Foo /markup/ bar")                                                                                                                             
<re.Match object; span=(10, 13), match='p/ '>

There is 1 answer

Wiktor Stribiżew

You can use

import re
m = re.search(r"(?<!\s)/(?![^\s.])", "Foo /markup/ bar")
if m:
    print(m.start()) # => 11

See the online Python demo.

Details:

  • (?<!\s) - a negative lookbehind that matches a location that is not immediately preceded by whitespace
  • / - a / char
  • (?![^\s.]) - a negative lookahead that requires a whitespace, . char or end of string immediately to the right of the current location.
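To see why the non-capturing groups from the question did not help, here is a minimal sketch comparing the two patterns on the same input. The variable names (text, consuming, zero_width) are only illustrative. A non-capturing group still consumes the characters it matches, whereas a lookaround only checks them, so only the lookaround version leaves the / alone in the match.

```python
import re

text = "Foo /markup/ bar"

# Non-capturing groups still consume the characters they match,
# so the neighbours of "/" end up inside the overall match.
consuming = re.search(r"(?:[^ ])/(?:[ .]|$)", text)
print(consuming.span(), repr(consuming.group()))  # (10, 13) 'p/ '

# Lookarounds only assert what surrounds the current position,
# so the overall match is the "/" alone.
zero_width = re.search(r"(?<!\s)/(?![^\s.])", text)
print(zero_width.span(), repr(zero_width.group()))  # (11, 12) '/'
```

The first / (index 4) is skipped by both patterns because it is preceded by a space.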

NOTE: If you do not expect matches at the start of the string, replace (?<!\s) with (?<=\S), which requires a non-whitespace char immediately to the left of the / char.
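The difference between the two lookbehinds only shows when / is the very first character. A small sketch, using a made-up input string and illustrative variable names (lenient, strict):

```python
import re

text = "/ at the very start"

# (?<!\s): nothing precedes "/", so in particular no whitespace does,
# and the negative lookbehind succeeds at position 0.
lenient = re.search(r"(?<!\s)/(?![^\s.])", text)
print(lenient)  # a match with span (0, 1)

# (?<=\S): needs an actual non-whitespace char on the left,
# which does not exist at position 0, so there is no match.
strict = re.search(r"(?<=\S)/(?![^\s.])", text)
print(strict)  # None
```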