How does the cut function address null/missing values?


I'm trying to use the cut() function in R to group continuous variables into buckets, like this:

as.character(cut(ORIG_AMT,
                 breaks = c(-Inf, 0, 25000, 50000, 75000, 100000, 125000, 150000,
                            175000, 200000, 250000, 300000, 350000, 418000, Inf),
                 labels = c('Missing', '[0-25k)', '[25k-50k)', '[50k-75k)', '[75k-100k)',
                            '[100k-125k)', '[125k-150k)', '[150k-175k)', '[175k-200k)',
                            '[200k-250k)', '[250k-300k)', '[300k-350k)', '[350k-418k)',
                            '[418k+)'),
                 right = FALSE, ordered_result = TRUE))

However, missing values are being omitted. I can't seem to find anywhere online that addresses this issue. Ideally, the missing values would all be grouped into the 'Missing' bucket.
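For reference, cut() returns NA for any NA input, so NA values never fall into an interval, not even one that starts at -Inf. A minimal sketch of one possible workaround, assuming the result is kept as a character vector as in the call above: replace the NA entries after cutting. The sample values and the shortened break list are hypothetical and only for illustration.

```r
# Hypothetical sample data; ORIG_AMT mirrors the variable name used above.
ORIG_AMT <- c(NA, -5, 10000, 30000, 500000, NA)

# Shortened break list for illustration; negatives land in 'Missing'
# because the first interval is [-Inf, 0).
bucket <- as.character(cut(
  ORIG_AMT,
  breaks = c(-Inf, 0, 25000, 50000, Inf),
  labels = c("Missing", "[0-25k)", "[25k-50k)", "[50k+)"),
  right = FALSE
))

# cut() yields NA for NA input, so assign those rows explicitly.
bucket[is.na(bucket)] <- "Missing"

table(bucket)
```

If an ordered factor is needed afterwards, the character vector can be passed back through factor() with an explicit levels argument.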

Ultimately, I want to take weighted averages across these buckets. If there's a better way to approach this problem than with cut() and xtabs(), I'm open to it!
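To make the goal concrete, here is a rough sketch of a per-bucket weighted average in base R using split() and weighted.mean(). The data frame, the rate and wt columns, and the shortened break list are all hypothetical placeholders, not my real data.

```r
# Hypothetical data: 'rate' is the value to average, 'wt' is the weight.
df <- data.frame(
  ORIG_AMT = c(NA, 10000, 20000, 30000, NA),
  rate     = c(5, 4, 6, 3, 7),
  wt       = c(1, 1, 3, 2, 1)
)

# Bucket the amounts (shortened break list for illustration).
df$bucket <- as.character(cut(
  df$ORIG_AMT,
  breaks = c(-Inf, 0, 25000, 50000, Inf),
  labels = c("Missing", "[0-25k)", "[25k-50k)", "[50k+)"),
  right = FALSE
))

# cut() returns NA for NA input; send those rows to 'Missing'.
df$bucket[is.na(df$bucket)] <- "Missing"

# Weighted mean of 'rate' within each bucket, weighted by 'wt'.
wavg <- sapply(split(df, df$bucket),
               function(g) weighted.mean(g$rate, g$wt))
wavg
```

Because the bucket column is a character vector here, split() only produces groups that actually occur in the data, so empty buckets do not appear in the result.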


There are 0 answers