Confused about this nested function


I am reading the Python Cookbook, 3rd Edition, and came across recipe 2.6, "Searching and Replacing Case-Insensitive Text," where the authors use a nested function like the one below:

def matchcase(word):
  def replace(m):
    text = m.group()
    if text.isupper():
      return word.upper()
    elif text.islower():
      return word.lower()
    elif text[0].isupper():
      return word.capitalize()
    else:
      return word
  return replace

If I have some text like below:

text = 'UPPER PYTHON, lower python, Mixed Python'  

and I print the value of 'text' before and after, the substitution happens correctly:

import re

print("Original Text:", text)
print("After regsub:", re.sub('python', matchcase('snake'), text, flags=re.IGNORECASE))

The last print call shows that the substitution happens correctly, but I am not sure how this nested function "gets" the:

PYTHON, python, Python

as the word that needs to be substituted with:

SNAKE, snake, Snake

How does the inner function replace get its argument m?
When matchcase('snake') is called, word takes the value 'snake'.
What is not clear to me is where the value of m comes from.

Can anyone help me understand what is happening in this case?

Thanks.


There is 1 answer

Answered by elethan

When you pass a function as the second argument to re.sub, according to the documentation:

it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.

The matchcase() function itself returns the replace() function, so when you do this:

re.sub('python', matchcase('snake'), text, flags=re.IGNORECASE)

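what happens is that matchcase('snake') is evaluated first and returns the inner function replace, which still remembers that word is 'snake'. re.sub then calls replace once for every non-overlapping match of the pattern 'python', passing the match object as the m argument. So m is supplied by re.sub, not by your code, and m.group() is the matched text ('PYTHON', 'python', or 'Python'). If this is confusing, don't worry; functions that return functions take some getting used to.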
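To make the role of m concrete, here is a rough sketch of what re.sub does with a function argument. manual_sub is a hypothetical helper written only for illustration; it is not how the re module is implemented, and it ignores details such as empty matches and the count argument.

```python
import re

def matchcase(word):
    def replace(m):
        text = m.group()
        if text.isupper():
            return word.upper()
        elif text.islower():
            return word.lower()
        elif text[0].isupper():
            return word.capitalize()
        else:
            return word
    return replace

def manual_sub(pattern, func, string, flags=0):
    # Hypothetical stand-in for re.sub(pattern, func, string, flags=flags)
    pieces = []
    last_end = 0
    for m in re.finditer(pattern, string, flags=flags):
        pieces.append(string[last_end:m.start()])  # text before this match
        pieces.append(func(m))                     # m is passed in here
        last_end = m.end()
    pieces.append(string[last_end:])               # text after the last match
    return ''.join(pieces)

text = 'UPPER PYTHON, lower python, Mixed Python'
result = manual_sub('python', matchcase('snake'), text, flags=re.IGNORECASE)
print(result)
```

The only place replace is ever called is the func(m) line, which is where each match object becomes the m argument.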

Here is an interactive session with a much simpler nested function that should make things clearer:

In [1]: def foo(outer_arg):
    ...:     def bar(inner_arg):
    ...:         print(outer_arg + inner_arg)
    ...:     return bar
    ...: 

In [2]: f = foo('hello')

In [3]: f('world')
helloworld

So f = foo('hello') is assigning a function that looks like the one below to a variable f:

def bar(inner_arg):
    print('hello' + inner_arg)

f can then be called as f('world'), which is like calling bar('world') with outer_arg already fixed to 'hello'. I hope that makes things clearer.
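If you want to see where the remembered value lives, the following sketch inspects the returned function. It assumes a variant of foo whose inner function returns the string instead of printing it, and the second variable g is added only for illustration.

```python
def foo(outer_arg):
    def bar(inner_arg):
        # outer_arg is looked up in the enclosing call to foo
        return outer_arg + inner_arg
    return bar

f = foo('hello')
g = foo('goodbye')  # a second, independent closure

print(f.__name__)                      # name of the inner function
print(f.__code__.co_freevars)          # names captured from the enclosing scope
print(f.__closure__[0].cell_contents)  # value captured for f
print(f('world'))
print(g('world'))
```

Each call to foo produces a separate bar with its own captured outer_arg, just as each call to matchcase produces a replace with its own word.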