I have this line to remove all non-alphanumeric characters except spaces:
re.sub(r'\W+', '', s)
However, it still keeps non-English characters.
For example, if I have
re.sub(r'\W+', '', 'This is a sentence, and here are non-english 托利 苏 !!11')
I want to get as output:
> 'This is a sentence and here are non-english 11'
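
For reference, here is a minimal sketch of the behaviour I am after, not necessarily the best way to do it. It assumes "alphanumeric" means ASCII letters and digits only, that hyphens should be kept (since `non-english` survives in the desired output), and that runs of spaces left behind should collapse to one. The function name `keep_ascii_alnum` is made up for illustration.

```python
import re

def keep_ascii_alnum(text):
    # Remove everything that is not an ASCII letter, an ASCII digit,
    # a space, or a hyphen (the hyphen is an assumption based on the
    # desired output above).
    cleaned = re.sub(r'[^A-Za-z0-9 -]+', '', text)
    # Collapse the runs of spaces left where characters were removed.
    return re.sub(r' {2,}', ' ', cleaned).strip()

print(keep_ascii_alnum('This is a sentence, and here are non-english 托利 苏 !!11'))
```

An explicit character class is used here because `\w` in Python 3 matches Unicode word characters by default, which is why the Chinese characters are kept by the original pattern.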