I am learning the 're' module of Python, and the named pattern (?P=name) confused me. When I use re.sub() to swap a group of digits and a group of characters, the pattern '(?P=name)' doesn't work, but the patterns '\N' and '\g<name>' work as expected. Code below:
[IN] print(re.sub(r'(?P<digit>\d{3})-(?P<char>\w{4})', r'(?P=char)-(?P=digit)', '123-abcd'))
[OUT] (?P=char)-(?P=digit)
[IN] print(re.sub(r'(?P<digit>\d{3})-(?P<char>\w{4})', r'\2-\1', '123-abcd'))
[OUT] abcd-123
[IN] print(re.sub(r'(?P<digit>\d{3})-(?P<char>\w{4})', r'\g<char>-\g<digit>', '123-abcd'))
[OUT] abcd-123
Why does the substitution fail when I use (?P=name)? And how do I use it correctly?
I am using Python 3.5.
(?P=name) is an inline (in-pattern) backreference. You may use it inside a regular expression pattern to match the same text that was captured by the corresponding named capturing group; see the Python Regular Expression Syntax reference. For example, the pattern
(?P<digit>\d{3})-(?P<char>\w{4})&(?P=char)-(?P=digit)
matches 123-abcd&abcd-123 because the "digit" group matches and captures 123, the "char" group captures abcd, and then the named inline backreferences match abcd and 123.

To replace matches, use the \1, \g<1>, or \g<char> syntax in the re.sub replacement pattern. Do not use (?P=name) for that purpose:
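To make the distinction concrete, here is a minimal sketch that reuses the pattern and input from the question. The variable names (pattern, mirrored, swapped_by_name, and so on) are only illustrative and are not part of the original post.

```python
import re

pattern = r'(?P<digit>\d{3})-(?P<char>\w{4})'

# Inside a pattern, (?P=name) matches the same text the named group captured.
mirrored = re.compile(pattern + r'&(?P=char)-(?P=digit)')
print(mirrored.fullmatch('123-abcd&abcd-123') is not None)  # True
print(mirrored.fullmatch('123-abcd&abcd-999') is not None)  # False

# In a replacement string, refer to groups with \N, \g<N> or \g<name>.
swapped_by_number = re.sub(pattern, r'\2-\1', '123-abcd')
swapped_by_g_number = re.sub(pattern, r'\g<2>-\g<1>', '123-abcd')
swapped_by_name = re.sub(pattern, r'\g<char>-\g<digit>', '123-abcd')
print(swapped_by_number, swapped_by_g_number, swapped_by_name)

# (?P=name) has no special meaning in a replacement string,
# so it is inserted literally, as in the first example of the question.
literal = re.sub(pattern, r'(?P=char)-(?P=digit)', '123-abcd')
print(literal)  # (?P=char)-(?P=digit)
```

The three replacement forms give the same swapped result here; \g<name> is usually the most readable choice when the groups are already named.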