How does R automatically coerce character input to numeric?

I am training a random forest model on my data with the randomForest package. Some variables are of class character. I am pretty sure that randomForest only takes factor and numeric classes as input, so I think R automatically coerces the character variables to numeric.

To understand how this may affect my modelling results, does anyone know the rule or algorithm R uses to coerce character to numeric automatically? Or is there any source code I can look at?

I am using R version 4.0.1.

Thanks in advance.

An update: I checked using

getTree(mod,1,labelVar=TRUE)

I can see that if those character variables are converted to factors, the "split point" in the output is an integer, which means the variable is treated as categorical (see: https://www.rdocumentation.org/packages/randomForest/versions/4.6-14/topics/getTree). But if they are not converted to factors, the "split point" in the output is not an integer.

So my guess is that R coerces the values of those character variables into numeric values. But how?

There is 1 answer

3
Georgery

I'm not sure about random forests in R specifically, but I'm fairly convinced that it only takes factors. If it does accept characters as well, it will convert them to factors, not to numeric.

And there is no well-defined conversion from arbitrary character strings to numeric in R.
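
To make the difference concrete, here is a minimal sketch in base R. The toy data frame and its column name are made up for illustration. It compares direct coercion with `as.numeric()` against `data.matrix()`, which is documented to convert character columns to factors first and then to their integer codes. Whether randomForest goes through `data.matrix()` for data-frame input is an assumption on my part that should be checked against the package source. Note also that since R 4.0.0 `data.frame()` no longer turns strings into factors by default, so character columns stay character unless converted explicitly.

```r
# Hypothetical toy data; the column name and values are made up for illustration.
df <- data.frame(colour = c("red", "blue", "green", "blue"),
                 stringsAsFactors = FALSE)

# Direct coercion of non-numeric strings gives NA (with a warning, suppressed here).
direct <- suppressWarnings(as.numeric(df$colour))

# data.matrix() converts character columns to factors, then to integer codes.
codes <- as.integer(data.matrix(df)[, "colour"])

# The same codes computed by hand; factor levels are the sorted unique strings.
by_hand <- as.integer(factor(df$colour))

print(direct)   # all NA
print(codes)    # 3 1 2 1, i.e. blue = 1, green = 2, red = 3
print(by_hand)  # same as codes
```

If the codes are then treated as an ordinary numeric variable, a split between two neighbouring codes would fall at a value such as 1.5, which would be consistent with the non-integer split points reported in the question. The ordering of the codes is just the sort order of the strings, so it carries no real meaning; converting the columns with `factor()` before fitting avoids relying on it.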