R: Trying to run dfSummary and freq on multiple subsets of a dataset as a macro


Orange is a dataset that ships with the datasets package, and it's the closest thing I have to my real data. I added one extra column containing text with spaces, because the column I need to subset on in my real data also contains spaces. I know I could just copy and paste the dfSummary and freq code 13 times (the number of subsets I actually need), but I really don't want to do that. Can anyone get this to work? I want the new datasets to be named Tree1, Tree2, Tree3, Tree4, and Tree5, but the paste() call I wrote doesn't work, and I want a dfSummary and freq output (from summarytools) for each subset.

library(gtools)        # defmacro()
library(summarytools)  # dfSummary(), freq()

orange <- data.frame(Orange)

# Add another variable to play with.
orange$row[orange$Tree==1] <- "Row 1"
orange$row[orange$Tree==2] <- "Row 2"
orange$row[orange$Tree==3] <- "Row 3"
orange$row[orange$Tree==4] <- "Row 4"
orange$row[orange$Tree==5] <- "Row 5"

#start macro
bytree <- defmacro(df, tree, row,
                     expr={

                       #subset for tree
                       paste(Tree,tree) <- subset(df, row==row)

                       #write out the dfsummary info
                       #Be sure to include the varnumbers=FALSE or you'll have the 1, 2, 3, on the left side.
                       dfSummary(paste(Tree,tree), style = "grid", plain.ascii = TRUE,
                                 varnumbers = FALSE, valid.col = FALSE, tmp.img.dir = "./img")

                       freq(paste(Tree,tree)[ ,c("age", "circumference")])

                     })


bytree(orange,1,"Row 1")
bytree(orange,2,"Row 2")
bytree(orange,3,"Row 3")
bytree(orange,4,"Row 4")
bytree(orange,5,"Row 5")

There is 1 answer

Dominic Comtois

Not sure if you've found a solution, but here is one for future reference:

library(summarytools)

orange <- data.frame(Orange)

# For freq, we need to call it separately for the 2 variables
stby(data=orange$age, orange$Tree, freq)
stby(data=orange$circumference, orange$Tree, freq)

stby(data=orange, orange$Tree, dfSummary)
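
If you also want the subsets themselves kept around as Tree1 to Tree5, one possible approach is to hold them in a named list rather than creating separate variables. The macro in the question cannot do this, because paste() builds a character string, not a variable name, so it cannot sit on the left-hand side of `<-`. The following is only a sketch under a few assumptions: `summarise_tree` is a hypothetical helper name (it is not part of summarytools), the `tmp.img.dir` argument from the question is left out, and freq() is called one column at a time, as in the stby() calls above.

```r
library(summarytools)  # provides dfSummary() and freq()

orange <- data.frame(Orange)

# One data frame per tree. as.character() is used so the pieces come out
# sorted as "1", "2", ..., "5" rather than in the factor's own level order.
trees <- split(orange, as.character(orange$Tree))
names(trees) <- paste0("Tree", names(trees))

# Hypothetical helper: returns the dfSummary and the two freq tables
# for a single subset.
summarise_tree <- function(d) {
  list(
    summary = dfSummary(d, style = "grid", plain.ascii = TRUE,
                        varnumbers = FALSE, valid.col = FALSE),
    freqs   = lapply(d[c("age", "circumference")], freq)
  )
}

# Apply the helper to every subset; the result is a list named like `trees`.
results <- lapply(trees, summarise_tree)

# Look at one subset's output.
results$Tree1$summary
results$Tree1$freqs$age
```

With real data, splitting on the text column instead (for example `split(orange, orange$row)`) should work the same way, since the spaces only end up in the list names; 13 groups then need no more code than 5.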