Regular expression for words and real numbers


I'm having trouble building a function that extracts words and real numbers and replaces every other symbol with a space.

For example:

word1, word2.word3, 1.4056 -1.2456 40,50 -60$30 60.50. 70.40.

should become

word1 word2 word3 1.4056 -1.2456 40 50 -60 30 60.50 70.40

I tried with:

re.sub(r"[^a-zA-Z0-9.-]+", " ", input_string)

I almost have it, except 60.50. doesn't change (the trailing dot is still there). Could you help me please?

There are 2 answers

AlefiyaAbbas On

Try this:

re.sub(r"(?<=\d)\.(?!\d)|[^a-zA-Z0-9.-]+", " ", input_string)

Output:

word1 word2 word3 1.4056 -1.2456 40 50 -60 30 60.50  70.40 
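The double space after 60.50 and the trailing space in this output appear because the dot and the space that follows it are replaced separately. A minimal sketch of one way to tidy that up, assuming it is acceptable to collapse every run of whitespace into a single space (the helper name clean_sub is hypothetical, not part of the answer):

```python
import re

def clean_sub(text):  # hypothetical helper name, not from the answer
    # Same substitution as above: replace a dot that follows a digit but is
    # not followed by one, and replace runs of other unwanted characters.
    replaced = re.sub(r"(?<=\d)\.(?!\d)|[^a-zA-Z0-9.-]+", " ", text)
    # split() with no arguments drops leading/trailing whitespace and
    # treats any run of whitespace as a single separator.
    return " ".join(replaced.split())

print(clean_sub("word1, word2.word3, 1.4056 -1.2456 40,50 -60$30 60.50. 70.40."))
```

Note that the lookbehind only fires after a digit, so a dot that follows a letter (as in `word.`) would still be kept by this pattern.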
Reilas On

Instead of a find-and-replace, you can use the re.findall function, and concatenate the results using str.join.

-?\d+(?:\.\d+)?|\w+

import re

string = 'word1, word2.word3, 1.4056 -1.2456 40,50 -60$30 60.50. 70.40.'
matches = re.findall(r'-?\d+(?:\.\d+)?|\w+', string)
string = ' '.join(matches)

Output

word1 word2 word3 1.4056 -1.2456 40 50 -60 30 60.50 70.40
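One thing to keep in mind with this approach is that `\w` also matches underscores and non-ASCII letters, which the character class in the question would have removed. A rough sketch of a stricter variant, assuming only ASCII letters and digits should count as word characters (the names TOKEN and extract_tokens are hypothetical):

```python
import re

# A signed integer or decimal first, then a run of ASCII letters/digits.
# [A-Za-z0-9]+ is an assumption here: it stands in for \w+ so that
# underscores and non-ASCII letters are dropped as well.
TOKEN = re.compile(r"-?\d+(?:\.\d+)?|[A-Za-z0-9]+")

def extract_tokens(text):  # hypothetical helper name
    # findall returns the matches in order; everything between them is skipped.
    return " ".join(TOKEN.findall(text))

print(extract_tokens("word1, word2.word3, 1.4056 -1.2456 40,50 -60$30 60.50. 70.40."))
print(extract_tokens("snake_case 3-2"))
```

With the original `\w+`, `snake_case` would stay a single token. In both versions a hyphen directly before a digit is read as a minus sign, so `3-2` comes out as `3 -2`.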