I started learning Python only recently and have never written any code before. I used a regular expression to match a string in the input file, which worked, but I am struggling to find a way to replace that string in the file with another one using a regular expression.

with open( fileToSearch, "r+" ) as file:
    for line in fileinput.input( fileToSearch ):
        string4=line
        result1 = re.search(r'(KNOWLEDGECENTER\/.*?\/)' + re.escape(taxonomy), string4)
        print (result1)
        result2 = re.sub(result1, r'(KNOWLEDGECENTER\/\t(\1)\/\)' + taxonomy, string4)
        print (result2)
        file.write(result2)

I expected that re.sub would replace the string in the result1 variable with the replacement string but instead, I am getting the following error:

raise TypeError, "first argument must be string or compiled pattern"
TypeError: first argument must be string or compiled pattern

If I put the result1 variable in quotes in the re.sub statement, as shown below, I don't get an error, but the input file doesn't get updated with the replacement string:

result2 = re.sub('result1', r'(KNOWLEDGECENTER\/\t(\1)\/\)' + taxonomy, string4)

The re.search call appears to work, since print (result1) returns <_sre.SRE_Match object at 0x02A120E0> for each line in the input file.

2 Answers

blhsing

Since re.sub performs the search itself, you don't need a separate call to re.search. In fact, the capture group from the pattern you pass to re.search is not available to re.sub, so the backreference in the replacement string of your re.sub call has nothing to refer to. Combine the two calls and it will work (the example code below assumes that all you want to do is add a tab after KNOWLEDGECENTER/):

for line in fileinput.input(fileToSearch):
    result = re.sub('(KNOWLEDGECENTER/)(.*?/' + re.escape(taxonomy) + ')', r'\1\t\2', line)
    file.write(result)
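
One caveat with the loop above: it writes through a handle opened with "r+" while fileinput reads the same file, so the writes can overwrite text that has not been read yet. A minimal sketch of one way around that, assuming Python 3 and a file small enough to hold in memory, is to read all lines first and then write the file back. The names add_tab and rewrite_file are hypothetical helpers, not part of the question or of any library.

```python
import re


def add_tab(line, taxonomy):
    """Insert a tab after 'KNOWLEDGECENTER/' when the line contains
    'KNOWLEDGECENTER/<anything>/<taxonomy>'; otherwise return the line as-is."""
    # re.escape keeps characters such as '.' in taxonomy from acting as regex syntax.
    pattern = r'(KNOWLEDGECENTER/)(.*?/' + re.escape(taxonomy) + r')'
    # \1 and \2 refer to the two groups of *this* pattern.
    return re.sub(pattern, r'\1\t\2', line)


def rewrite_file(path, taxonomy):
    """Read every line, apply the substitution, then write the file back."""
    with open(path, 'r') as handle:
        lines = handle.readlines()
    with open(path, 'w') as handle:
        handle.writelines(add_tab(line, taxonomy) for line in lines)
```

Calling rewrite_file(fileToSearch, taxonomy) would then update the file in one pass. Keeping the substitution in its own function also makes it easy to check on a single sample line before touching the file.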

dariober

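re.search returns an object (MatchObject) with various attributes related to the regex match, not a string or compiled pattern, hence the error. Maybe what you want is re.sub(re.escape(result1.group(0)), ...), where group(0) is the matched text and re.escape keeps any special characters in it from being read as regex syntax.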
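
To make the difference concrete, here is a small sketch, assuming Python 3, that runs the three variants on one line. The sample line and the taxonomy value are made up for illustration.

```python
import re

line = 'see KNOWLEDGECENTER/abc/topic here'  # hypothetical sample line
taxonomy = 'topic'                           # hypothetical value

match = re.search(r'(KNOWLEDGECENTER/.*?/)' + re.escape(taxonomy), line)

# re.search returns a match object (or None when nothing matches), not a string.
matched_text = match.group(0) if match else None  # the whole matched text
captured = match.group(1) if match else None      # only the parenthesised part

# Passing the match object itself as the pattern raises TypeError.
try:
    re.sub(match, 'X', line)
    raised = False
except TypeError:
    raised = True

# Quoting the variable name searches for the literal text "result1",
# which is not in the line, so the line comes back unchanged.
unchanged = re.sub('result1', 'X', line)

# Using the matched text as the pattern works once it is escaped.
replaced = re.sub(re.escape(matched_text), 'REPLACED', line)
```

Note that a pattern built from group(0) has no capture groups of its own, so a \1 in the replacement string would still have nothing to refer to.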

(By the way, the question is tagged python-2.7. If that is the version you are using, consider upgrading to Python 3 instead.)