Automatically remove leading u'...' in Python strings


I am migrating an old Python code base to Python 3.

Many strings have the "u" prefix, for example u'Umlaut üöö'.

Is there an automated way to remove the leading "u"?

A simple regex does not work:

Example 1: in u'schibu', the u' at the end must not be removed.

Example 2:

Multiline: '''foo

schibu'''

Is there maybe a way which works without a regex, but via parsing the python syntax?

Update

My code needs to be compatible with Python2 and Python3 for some months.

The files already contain from __future__ import unicode_literals

Answer by buran:

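The unicode fixer of the 2to3 tool should do that. The 2to3 documentation describes it as:

unicode

Renames unicode to str.

Although the description only mentions the rename, the fixer also strips the u prefix from string literals, as the dry run below shows.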

Dry run with a sample file spam.py containing:

eggs = u'foo'

In the shell:

$ 2to3 --fix unicode spam.py

Output:

root: Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
RefactoringTool: Refactored spam.py
--- spam.py     (original)
+++ spam.py     (refactored)
@@ -1 +1 @@
-eggs = u'foo'
+eggs = 'foo'
RefactoringTool: Files that need to be modified:
RefactoringTool: spam.py

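EDIT: Note that you can run just a single fixer, as shown above, and 2to3 will apply only that fixer's change. The command above is a dry run: without the -w flag, 2to3 only prints the diff and leaves the file untouched. Add -w to write the change back to the file.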
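If 2to3 is not available, the parsing-based approach the question asks about can be sketched with the standard library tokenize module. This is only a minimal sketch, not what 2to3 does internally: strip_u_prefix is a hypothetical helper name, and the code assumes it is run under Python 3 on source that the Python 3 tokenizer accepts. Because it works on tokens rather than raw text, the u' at the end of 'schibu' and the contents of multiline strings are left alone.

```python
import io
import tokenize


def strip_u_prefix(source):
    """Return `source` with the u/U prefix removed from string literals.

    Hypothetical helper for illustration; run it under Python 3.
    """
    # tokenize splits lines on "\n", so split the same way to keep rows aligned.
    lines = source.split("\n")
    # Collect the (row, col) start position of every prefixed string literal.
    starts = [
        tok.start
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type == tokenize.STRING and tok.string[:1] in ("u", "U")
    ]
    # Work backwards so that deleting a character never shifts a
    # position that still has to be processed on the same line.
    for row, col in sorted(starts, reverse=True):
        line = lines[row - 1]  # token rows are 1-based
        if line[col:col + 1] in ("u", "U"):  # defensive re-check
            lines[row - 1] = line[:col] + line[col + 1:]
    return "\n".join(lines)


sample = "eggs = u'foo'\nname = 'schibu'\n"
print(strip_u_prefix(sample), end="")
# eggs = 'foo'
# name = 'schibu'
```

Everything outside the prefix character is copied through unchanged, so comments and formatting are preserved. Python 2-only prefixes such as ur'...' are not valid string prefixes in Python 3, so this sketch does not handle them and they would need separate treatment.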