I have a chunk of R code that I would like to call from my Python code, and I am using rpy2 to do so. The R code involves many regular expressions, and it seems that either rpy2 is not handling them correctly or I am not writing them properly.
Here is an example of a piece of code that works and another that does not:
1) It works: A very trivial removeStopWords function:
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
robjects.r('''
library(data.table)
library(tm)
removeStopWords <- function(x) gsub("  ", " ", removeWords(x, stopwords("english")))
''')
In [4]: r_f = robjects.r['removeStopWords']
In [5]: r_f('I want to dance')[0]
Out[5]: 'I want dance'
2) It does not work: an equally trivial function to remove leading and trailing spaces:
robjects.r('''
library(data.table)
library(tm)
trim <- function (x) gsub("^\\s+|\\s+$", "", x)
''')
Error: '\s' is an unrecognized escape in character string starting ""^\s"
p = rinterface.parse(string)
Abort
and then I am thrown out of IPython.
I have tried directly:
import rpy2.rinterface as ri
exp = ri.parse('trim <- function (x) gsub("^\\s+|\\s+$", "", x)')
but the result is the same: Abort, and then IPython exits.
At this stage I don't really know what to try. The R code is quite large, so moving everything from R to Python would take me some time, and I would prefer not to have to do that.
Any help is much appreciated!
Thanks in advance for your time.
When you write \\ in a string in Python, it is stored as \ since \ is an escape character. So when R executes the code, it sees "^\s+|\s+$". But \ is also an escape character in R, and \s is not recognized as a valid escape sequence. If you want R to receive "^\\s+|\\s+$", you need to write "^\\\\s+|\\\\s+$" in Python (twice the number of backslashes).
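To see the difference without involving R at all, you can print what Python actually stores for each spelling. This is only a small sketch; the variable names are made up for illustration, and the raw-string variant is an alternative I am adding, not something from the question.

```python
# Each variable holds the text that would be handed to R for parsing.
two = 'gsub("^\\s+|\\s+$", "", x)'        # Python collapses each \\ to \
four = 'gsub("^\\\\s+|\\\\s+$", "", x)'   # R receives \\s, which it reads as \s
raw = r'gsub("^\\s+|\\s+$", "", x)'       # raw string: backslashes left untouched

print(two)    # the pattern R rejects with "unrecognized escape"
print(four)   # the pattern R accepts
print(raw)    # same text as `four`

# The raw string and the doubled version are the same text.
assert four == raw

# Assuming rpy2 and R are installed, the raw-string form would be used as:
# import rpy2.robjects as robjects
# robjects.r(r'''trim <- function (x) gsub("^\\s+|\\s+$", "", x)''')
```

With a raw string you can paste the R code as it is, which is probably less error-prone for a large block of R than doubling every backslash by hand.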