R dplyr - error in subsetting of local data frame


As part of a larger and more complex body of code, I am running into a problem with dplyr and local data frames. As the simplified example below shows, the code includes a basic type of subsetting that works in base R:

#creation of data frame
dat=data.frame(group=c(rep(c("a","b","c","d"),2)),value=(seq(1,8,1)))
othergroup=dat[dat[,"group"]==dat[2,"group"],]
othergroup 

This gives the desired answer:

  group value
2     b     2
6     b     6

#loading dplyr
library(dplyr)
othergroup=dat[dat[,"group"]==dat[2,"group"],]
othergroup #still works

After just loading dplyr, everything still works. However, once I run a dplyr operation, a local data frame is created that no longer allows this kind of subsetting.

#pro-forma dplyr operation
dat = dat %>%
  group_by(group)

othergroup=dat[dat[,"group"]==dat[2,"group"],] #error message

Error in Ops.data.frame(dat[, "group"], dat[2, "group"]) : 
‘==’ only defined for equally-sized data frames

I understand that one can use the select function in dplyr, but I would like to reuse some existing code. Is there a way to coerce a dplyr-generated "local data frame" back into a regular data frame?


There are 2 answers

Miss.Saturn On

Just do this:

othergroup = dat[dat$group == dat$group[2],]
dule arnaux On

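Once you group the data frame, it becomes a tibble (a grouped_df). One feature of a tibble is that subsetting with [ always returns a tibble, even for a single column or a single cell (e.g. dat[2,"group"]). So dat[,"group"]==dat[2,"group"] compares an 8-row, one-column tibble with a 1-row, one-column tibble rather than a vector with a single value. Since == on data frames is only defined when both sides have the same dimensions, you get the error shown in the question.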
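To make the difference concrete, here is a small sketch that rebuilds the data from the question and inspects what each kind of subsetting returns. The name grouped is introduced here only for illustration; it assumes a reasonably recent dplyr is installed.

```r
library(dplyr)

# Same data as in the question
dat <- data.frame(group = rep(c("a", "b", "c", "d"), 2), value = seq(1, 8, 1))
grouped <- dat %>% group_by(group)   # 'grouped' is an illustrative name

# Base data frame: selecting one column with [ drops to a plain vector
base_is_df <- is.data.frame(dat[, "group"])

# Grouped tibble: the same call keeps the tibble structure
tbl_is_df <- is.data.frame(grouped[, "group"])

# The two sides of the failing comparison have different dimensions
lhs_dim <- dim(grouped[, "group"])
rhs_dim <- dim(grouped[2, "group"])

print(base_is_df)
print(tbl_is_df)
print(lhs_dim)
print(rhs_dim)
```

The mismatch between lhs_dim and rhs_dim is what triggers the "equally-sized data frames" error.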

If you have lots of this kind of sub-setting in old code and you don't want to change your old code, convert the tibble back to a data frame: dat=as.data.frame(dat).
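As a minimal sketch of that conversion, using the data from the question and assuming dplyr is installed, the original base-R subsetting line runs unchanged once the tibble has been coerced back:

```r
library(dplyr)

# Same data and grouping step as in the question
dat <- data.frame(group = rep(c("a", "b", "c", "d"), 2), value = seq(1, 8, 1))
dat <- dat %>% group_by(group)

# Coerce back to a plain data frame; this also discards the grouping
dat <- as.data.frame(dat)
print(class(dat))

# The unmodified legacy subsetting works again
othergroup <- dat[dat[, "group"] == dat[2, "group"], ]
print(othergroup)
```

Note that ungroup() on its own would not be enough here: it removes the grouping but still returns a tibble, so [ would keep returning tibbles.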

Otherwise, Tatiana's solution (the $-based subsetting in the other answer) works well.