regex match and extract the word before parenthesis


I have a file of lines that I need to iterate over. Each line is of the form:

returnType ClassType functionName(int param1, double param2)
returnType ClassType functionName();

I want to iterate over each line and grab just the function name functionName via regex.

I have a solution without regex:

line.split("(")[0].split(" ")[-1]

which returns functionName

I would like to get a solution with regex if that's possible. This is what I have tried:

import re
pattern = re.compile(r"\w+^\(")

where I'm trying to say: match the word that is followed by a parenthesis and give me that word. But it returns nothing for me. Thanks.


There are 3 answers

0
David Waterworth On BEST ANSWER

A really basic starting point would be:

\s(\w+)\(

This captures one or more word characters (the set [a-zA-Z0-9_] for ASCII input) that are preceded by a whitespace character and followed by an opening parenthesis.

You may need to tweak based on the grammar of the language you're trying to match.
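As a rough sketch of how that pattern could be applied line by line (the `lines` list and `names` variable below are made up for illustration and stand in for the lines read from your file):

```python
import re

# Capture the word that sits between a whitespace character and "(".
pattern = re.compile(r"\s(\w+)\(")

# Hypothetical sample input standing in for the lines of the file.
lines = [
    "returnType ClassType functionName(int param1, double param2)",
    "returnType ClassType functionName();",
]

names = []
for line in lines:
    match = pattern.search(line)
    if match:  # search() returns None when a line has no match
        names.append(match.group(1))

print(names)  # ['functionName', 'functionName']
```

Using search() rather than match() matters here, since the function name is not at the start of the line.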

0
Bharel On

You can use this:

import re

s = """returnType ClassType functionName(int param1, double param2)
returnType ClassType functionName();"""

# re.MULTILINE makes ^ match at the start of every line, not only at the start of the string
results = re.finditer(r"^(?P<return>\w+) (?P<class>\w+) (?P<func>[^(]+)\((?P<params>[^)]*)", s, re.MULTILINE)
for result in results:
    print(result.groupdict())
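
One detail worth checking with an anchored pattern like this: by default ^ matches only at the very start of the string, so on a multi-line string it needs re.MULTILINE to match at the start of each line. A small sketch comparing the two (the variable names `default_funcs` and `multiline_funcs` are just for illustration):

```python
import re

s = """returnType ClassType functionName(int param1, double param2)
returnType ClassType functionName();"""

pattern = r"^(?P<return>\w+) (?P<class>\w+) (?P<func>[^(]+)\((?P<params>[^)]*)"

# Without a flag, ^ can only match at position 0 of the whole string.
default_funcs = [m.group("func") for m in re.finditer(pattern, s)]

# With re.MULTILINE, ^ also matches right after each newline.
multiline_funcs = [m.group("func") for m in re.finditer(pattern, s, re.MULTILINE)]

print(len(default_funcs), len(multiline_funcs))  # 1 2
```

If you already loop over the file one line at a time, each line is its own string and the flag is not needed.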
0
The fourth bird On
  • Your working solution line.split("(")[0].split(" ")[-1] is very broad, and would for example also return a from the string "a("

  • The pattern \w+^\( that you tried does not match because, after matching 1 or more word characters, ^ asserts the start of the string, which can never succeed at that position. You can get a match with \w+\( but then you would have to remove the parenthesis at the end
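
To make both points concrete, here is a short sketch (the sample `line` and the variable names are only for illustration):

```python
import re

line = "returnType ClassType functionName(int param1, double param2)"

# ^ placed after \w+ asks for the start of the string after at least
# one character has already been consumed, so this never matches.
failed = re.search(r"\w+^\(", line)
print(failed)  # None

# Dropping ^ matches, but the match includes the parenthesis...
broad = re.search(r"\w+\(", line)
print(broad.group())  # functionName(

# ...so a capture group is one way to keep only the name.
captured = re.search(r"(\w+)\(", line).group(1)
print(captured)  # functionName

# The split-based version accepts an unclosed parenthesis as well.
split_result = "a(".split("(")[0].split(" ")[-1]
print(split_result)  # a
```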

Assuming that you want to allow only word characters for the function name, you could make the match more specific by asserting a whitespace boundary to the left, and matching from an opening parenthesis to its closing parenthesis to the right:

(?<!\S)(\w+)\([^()]*\)

The pattern matches

  • (?<!\S) Assert a whitespace boundary to the left
  • (\w+) Capture 1 or more word characters in group 1
  • \([^()]*\) Match from ( to ) without any occurrence of a parenthesis in between


Example

import re

pattern = r"(?<!\S)(\w+)\([^()]*\)"

s = ("returnType ClassType functionName(int param1, double param2)\n"
     "returnType ClassType functionName();\n"
     "a(\n"
     " a(")

print(re.findall(pattern, s))

Output

['functionName', 'functionName']