Convert strings to factors with levels, but warn when introducing NAs


I'm preparing an old dataset for analysis and need to convert strings to factors with explicit levels. I've used an (equally old) data dictionary to set the levels, but have just noticed that it's not entirely correct: some strings in some variables are not in the data dictionary.

I'd like to prevent strings from being silently dropped (converted to NA). Ideally, I'd like things to stop completely if a string is not in the level definition. Is that possible?

df <- data.frame(c1 = letters[1:3])
factor(df$c1, levels = letters[1:2])
# [1] a    b    <NA>

Happy to use dplyr, forcats or something else.
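
To make the desired behaviour concrete, here is a minimal base R sketch of what I have in mind. `strict_factor` is a hypothetical helper name (not from base R, dplyr or forcats): it checks the non-NA values against the levels and calls `stop()` before `factor()` can silently introduce NAs.

```r
# Hypothetical helper (not part of any package): behaves like factor(),
# but raises an error if any non-NA value is absent from `levels`.
strict_factor <- function(x, levels, ...) {
  values <- as.character(x)
  # Values already NA in the input are left alone; only new NAs are a problem.
  unknown <- setdiff(unique(values[!is.na(values)]), levels)
  if (length(unknown) > 0) {
    stop("Values not in levels: ", paste(unknown, collapse = ", "), call. = FALSE)
  }
  factor(values, levels = levels, ...)
}

df <- data.frame(c1 = letters[1:3])

# All values are covered by the levels, so this succeeds.
strict_factor(df$c1, levels = letters[1:3])

# "c" is not in the levels, so this stops with an error instead of returning NA.
try(strict_factor(df$c1, levels = letters[1:2]))
```

Swapping `stop()` for `warning()` would give the "warn but continue" variant instead.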

