R: Nested Sorting

I'm a beginner to R programming (I just finished the Coursera course) and I'm having trouble creating this nested loop.

I have a csv structured like this (there are actually 108 columns):

 Type     Status  Campaign Name    Group      Budget  Budget Type    Bids
 Campaign Active    Burritos                   500      Daily   
 Campaign Active    Tacos                      400      Daily   
 Group    Active    Burritos    Bean Burritos                         0.5
 Group    Active    Burritos    Beef Burritos                         0.5
 Group    Paused    Burritos    Chicken Burritos                      0.5
 Group    Active    Tacos       Beef Tacos                            0.5
 Group    Active    Tacos       Chicken Tacos                         0.5
 Group    Paused    Tacos       Fish Tacos                            0.5

I would like to reorder the table by campaign name and then by group, and remove the paused rows:

 Type     Status  Campaign Name    Group      Budget  Budget Type     Bids
 Campaign Active    Burritos                    500     Daily   
 Group    Active    Burritos    Bean Burritos                         0.5
 Group    Active    Burritos    Beef Burritos                         0.5
 Campaign Active    Tacos                       400     Daily   
 Group    Active    Tacos       Beef Tacos                            0.5
 Group    Active    Tacos       Chicken Tacos                         0.5

I was going to use a series of for loops, but I keep running into errors. I'm pretty sure the rbind calls are wrong. I also think there is an error where I create temp.ds and temp.group.ds, and probably one in the loop itself, too.

Below is my code:

ds <- do.call(rbind, lapply(list.files(path=directory, full.names=TRUE), read.table, header=TRUE, sep="\t", fileEncoding="UTF-16LE", fill = TRUE, quote = ""))

valid.campaign <- ds[ which(ds$Status == "Active" & ds$Type == "Campaign"), ]

new.ds <- NULL 

for(campaign in valid.campaign$Type) {
  temp.ds <- valid.campaign[,campaign]
  valid.group <- ds[ which(ds$Status == "Active" & ds$Type == "Group"), ]  

  for (group in valid.group$Type) {
    temp.group.ds <- valid.group[,group]
    temp.ds <-rbind(temp.ds, temp.group.ds)
    rm(temp.group.ds)
    }

  if (exists("new.ds")) new.ds <- rbind(new.ds,temp.ds)
  else new.ds <- temp.ds
  rm(temp.ds)
  }
new.ds 
}
There are 2 answers

Morris Greenberg

The dplyr and magrittr packages handle this kind of task well. In dplyr, the arrange function sorts the rows, and the filter function keeps only the rows that match a condition, so it can be used to drop the paused ones:

library(dplyr); library(magrittr)  # %<>% comes from magrittr; CampaignName must match names(ds)
ds %<>% arrange(CampaignName, Group) %>% filter(Status != 'Paused')
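
For anyone who wants to try this without the original files, here is a minimal sketch of the same idea on a small made-up data frame. The sample data and the column name Campaign.Name are assumptions (read.table with its default check.names = TRUE would turn a "Campaign Name" header into Campaign.Name); check names(ds) for the real spelling. It uses the ordinary %>% pipe and assigns to a new object rather than overwriting ds.

```r
library(dplyr)

# Hypothetical sample data mirroring the table in the question
ds <- data.frame(
  Type          = c("Campaign", "Campaign", rep("Group", 6)),
  Status        = c("Active", "Active", "Active", "Active", "Paused",
                    "Active", "Active", "Paused"),
  Campaign.Name = c("Burritos", "Tacos", rep("Burritos", 3), rep("Tacos", 3)),
  Group         = c("", "", "Bean Burritos", "Beef Burritos", "Chicken Burritos",
                    "Beef Tacos", "Chicken Tacos", "Fish Tacos"),
  stringsAsFactors = FALSE
)

result <- ds %>%
  filter(Status != "Paused") %>%   # drop the paused rows
  arrange(Campaign.Name, Group)    # sort by campaign, then by group

print(result)
```

Each campaign row ends up above its groups only because its Group value is an empty string, which sorts before any group name. If the blanks were read in as NA instead, arrange would place those rows last within each campaign.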

ChrKoenig

In base R I would use the following two lines of code. The first one does the ordering, the second one the subsetting. There is certainly a way to wrap it in a one-liner, but I think it is more readable like this:

ds = ds[order(ds$Campaign_Name, ds$Group),]
ds = ds[which(ds$Status != "Paused"),]

gives us:

      Type Status Campaign_Name         Group Budget Budget_Type Bids
1 Campaign Active      Burritos                  500    Daily      NA
3    Group Active      Burritos Bean Burritos     NA              0.5
4    Group Active      Burritos Beef Burritos     NA              0.5
2 Campaign Active         Tacos                  400    Daily      NA
6    Group Active         Tacos    Beef Tacos     NA              0.5
7    Group Active         Tacos Chicken Tacos     NA              0.5
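
As for the one-liner mentioned above, one possible version is sketched below. It is only an illustration: the data frame is made up to match the question's table, and Campaign_Name is assumed to be the column name, as in the output shown.

```r
# Hypothetical sample data mirroring the table in the question
ds <- data.frame(
  Type          = c("Campaign", "Campaign", rep("Group", 6)),
  Status        = c("Active", "Active", "Active", "Active", "Paused",
                    "Active", "Active", "Paused"),
  Campaign_Name = c("Burritos", "Tacos", rep("Burritos", 3), rep("Tacos", 3)),
  Group         = c("", "", "Bean Burritos", "Beef Burritos", "Chicken Burritos",
                    "Beef Tacos", "Chicken Tacos", "Fish Tacos"),
  stringsAsFactors = FALSE
)

# Order by campaign and group, then keep everything that is not paused
result <- subset(ds[order(ds$Campaign_Name, ds$Group), ], Status != "Paused")

print(result)
```

This relies on the campaign rows having an empty string in Group, which sorts ahead of the group names. If those cells are NA in your data, order puts NA last by default; passing na.last = FALSE to order should keep the campaign row on top of its groups.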