Cut the variable of a datatable passed to a user defined function as parameters


I have a user defined function in 'helpers.R' called 'percent_map' which accepts the parameters as follows:

percent_map <- function(DT,var,statelist, pal,legend.title)

where

  • DT is a datatable,
  • var is the name of a variable in the datatable that needs to be cut into 5 equal-width intervals,
  • pal is a palette name, and statelist is the name of the column holding the US states.

The result/coded value is stored back in the DT as a new variable called percents.

I have currently programmed the function as below:

    percent_map <- function(DT,var,statelist, pal, legend.title) {

      # generate vector of fill colors for map
      shades <- RColorBrewer::brewer.pal(5, pal)

      # constrain gradient to percents that occur between variable range
      ##********Error in the below statement*************
      DT[,percents:=as.integer(cut(DT[,var], 5, include.lowest = TRUE))]



      names <- DT[match(map("state", plot=FALSE)$names,as.character(tolower(DT[,statelist]))),statelist]
      colorsmatched<-DT[match(names,as.character(DT[,statelist])),percents]
      fills <- shades[colorsmatched]  

      # map plotting function here
    }

I get the following error when I pass my datatable visabystate:

 percent_map(DT=visabystate,var = 'casesbystate',statelist='employer_state', pal = "Greens",legend.title="No. of cases")
 Error in cut.default(DT[, var], 5, include.lowest = TRUE) : 
  'x' must be numeric 
8 stop("'x' must be numeric") 
7 cut.default(DT[, var], 5, include.lowest = TRUE) 
6 cut(DT[, var], 5, include.lowest = TRUE) 
5 eval(expr, envir, enclos) 
4 eval(jsub, SDenv, parent.frame()) 
3 `[.data.table`(DT, , `:=`(percents, as.integer(cut(DT[, var], 
    5, include.lowest = TRUE)))) at helpers.R#11
2 DT[, `:=`(percents, as.integer(cut(DT[, var], 5, include.lowest = TRUE)))] at helpers.R#11
1 percent_map(DT = visabystate, var = "casesbystate", statelist = "employer_state", 
    pal = "Greens", legend.title = "No. of cases") 

How can I solve this so that the end result is the same? Here is the output of str(visabystate):

Classes ‘data.table’ and 'data.frame':  50 obs. of  4 variables:
 $ employer_state : Factor w/ 50 levels "Alabama","Alaska",..: 23 36 43 46 5 10 30 33 22 6 ...
 $ casesbystate   : int  2359 603 58586 13080 62708 11107 57028 15313 14347 2247 ...
 $ fulltimebystate: int  4657 1184 116310 25319 122853 22005 113501 30554 28568 4429 ...
 $ workersbystate : int  3113 645 120640 21117 125647 14395 119051 15634 35751 2659 ...
 - attr(*, ".internal.selfref")=<externalptr> 

Things I have tried:

The expression works interactively if I use:

visabystate[,percents:=as.integer(cut(visabystate[,casesbystate], 5, include.lowest = TRUE))]

However, it does not work if DT$var is used inside cut within the percent_map function. It gives the same error.
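For illustration, here is a minimal sketch of what seems to be happening, using a made-up toy table (`toy`, `state`, `cases` and the helper `add_percents` are hypothetical names, not part of the real code). `DT$var` looks for a column literally named `var`, which does not exist, so `cut` receives `NULL`. Extracting the column with `[[`, using the name stored in `var`, is one option that may avoid this:

```r
library(data.table)

# Hypothetical toy data standing in for visabystate
toy <- data.table(state = c("a", "b", "c", "d", "e", "f"),
                  cases = c(1L, 5L, 10L, 20L, 50L, 100L))

# Hypothetical helper: 'var' holds a column name as a string
add_percents <- function(DT, var) {
  # DT$var would look for a column literally called "var" and return NULL;
  # DT[[var]] extracts the column whose name is stored in var
  DT[, percents := as.integer(cut(DT[[var]], 5, include.lowest = TRUE))]
  invisible(DT)
}

add_percents(toy, "cases")
print(toy$percents)
```

Because `:=` updates by reference, `toy` itself gains the `percents` column. This sketch assumes the table has no column that is itself named `DT` or `var`, since such a column would take precedence inside `j`.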

EDIT

I added with=FALSE to the cut statement, as below:

DT[,percents:=as.integer(cut(DT[,var,with=FALSE], 5, include.lowest = TRUE))]

It now gives these additional messages:

The following objects are masked from DT (pos = 3):

    casesbystate, employer_state, fulltimebystate, workersbystate

The following objects are masked from DT (pos = 4):

    casesbystate, employer_state, fulltimebystate, workersbystate

The following objects are masked from DT (pos = 5):

    casesbystate, employer_state, fulltimebystate, workersbystate

 Error in cut.default(DT[, var, with = FALSE], 5, include.lowest = TRUE) : 
  'x' must be numeric 
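A small sketch of why `with=FALSE` probably does not help here (again on hypothetical toy data, not the real table): selecting with `with = FALSE` returns a one-column data.table rather than a plain vector, and `cut` needs a numeric vector:

```r
library(data.table)

# Hypothetical toy data standing in for visabystate
toy <- data.table(state = c("a", "b", "c"), cases = c(1L, 10L, 100L))
var <- "cases"

sub <- toy[, var, with = FALSE]  # one-column data.table, not a vector
vec <- sub[[1]]                  # the underlying integer vector

print(is.data.table(sub))
print(is.numeric(sub))
print(is.numeric(vec))
```

The "masked from DT" messages look like the ones `attach()` prints when the same table has been attached more than once, so they are probably unrelated to the `cut` error itself.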

Thanks in advance!
