Assigning dataframe columns in rpy2


In rpy2, what is the equivalent of, say:

dataf <- data.frame(a=c(1,2,3), b=c(4,5,6))
dataf$a <- dataf$a + 1

Since dataframe.rx2 is the rpy2 equivalent of [[.data.frame, I would have thought the answer would be:

 import rpy2.robjects as robjects

 d = {'a': robjects.IntVector((1,2,3)), 'b': robjects.IntVector((4,5,6))}
 dataf = robjects.DataFrame(d)
 dataf.rx2["a"] = dataf.rx2("a").ro + 1

but that gives the following error:

 RRuntimeError: Error in `[[<-.data.frame`(list(a = 1:3, b = 4:6), "a", 2:4) : 
    argument "value" is missing, with no default

I also tried:

dataf.rx["a"] = dataf.rx("a").ro + 1
dataf[dataf.index("a")] = dataf.rx2("a").ro + 1

without any luck.


There are 2 answers

zero323 (best answer)

This should work:

i = dataf.colnames.index('a')
dataf[i] = dataf[i].ro + 1

Ian Sudbery

So it turns out this also works:

dataf.rx[True, 'a'] = dataf.rx(True, 'a').ro + 1

EDIT:

But this isn't exactly equivalent to the accepted solution: it works in this case, but not in others.

For example:

In [18]: d = {"a": ro.StrVector(["a","b","c"]), "b": ro.IntVector([1,2,3])}

In [19]: dataf = ro.DataFrame(d)

In [20]: print(ro.r.levels(dataf.rx2("a")))
[1] a b c

In [21]: dataf.rx[True, "a"] = ro.r.relevel(dataf.rx2("a"), "b")

In [22]: print(ro.r.levels(dataf.rx2("a")))
[1] a b c

whereas the accepted solution does:

In [23]: i = dataf.colnames.index("a")

In [24]: dataf[i] = ro.r.relevel(dataf.rx2("a"), "b")

In [25]: print(ro.r.levels(dataf.rx2("a")))
[1] b a c
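
A likely explanation for the difference: assigning through rx with a row index writes the new values into the existing column, so the column keeps its own level order, while assigning by position replaces the column object wholesale, levels included. The sketch below is a pure-Python toy model of that idea, not rpy2 or R code. ToyFactor, relevel, assign_elements and replace_column are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyFactor:
    """Hypothetical stand-in for an R factor: ordered levels plus integer codes."""
    levels: List[str]
    codes: List[int]  # 0-based positions into `levels`

    def labels(self) -> List[str]:
        # Translate the integer codes back into their level labels.
        return [self.levels[c] for c in self.codes]


def relevel(f: ToyFactor, ref: str) -> ToyFactor:
    """Return a copy of `f` with `ref` moved to the front of the levels."""
    new_levels = [ref] + [lv for lv in f.levels if lv != ref]
    return ToyFactor(new_levels, [new_levels.index(lb) for lb in f.labels()])


def assign_elements(column: ToyFactor, value: ToyFactor) -> None:
    """Model of element-wise assignment: the column keeps its own levels
    and only the labels of `value` are copied in."""
    column.codes = [column.levels.index(lb) for lb in value.labels()]


def replace_column(frame: dict, name: str, value: ToyFactor) -> None:
    """Model of whole-column replacement: the new object takes over,
    bringing its own level order with it."""
    frame[name] = value


frame = {"a": ToyFactor(["a", "b", "c"], [0, 1, 2])}

# Element-wise assignment: the releveled order is silently lost.
assign_elements(frame["a"], relevel(frame["a"], "b"))
levels_after_elementwise = list(frame["a"].levels)

# Whole-column replacement: the releveled order survives.
replace_column(frame, "a", relevel(frame["a"], "b"))
levels_after_replacement = list(frame["a"].levels)

print(levels_after_elementwise)   # ['a', 'b', 'c']
print(levels_after_replacement)   # ['b', 'a', 'c']
```

If that reading is right, then whenever the new column carries attributes of its own (such as a different level order), replacing the whole column as in the accepted answer is the safer choice.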