I am trying to fit and plot a classification tree with rpart, but the tree does not use all of my variables. I have 20 cases and 200 variables. My data look something like this:
data <- data.frame(y = c(rep(0, 10), rep(1, 10)), x1 = rnorm(20), x2 = rnorm(20) + 0.5, x3 = rnorm(20) - 0.2)
but with 200 predictors: x1, x2, x3, ..., x200.
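To make the shape of the data concrete, here is a minimal sketch that builds a stand-in with 20 cases and 200 predictors. The values are simulated, not my real data: the seed, the use of rnorm(), and the mean and standard deviation of 0.0005 are assumptions chosen only to mimic the scale described in this question.

```r
# Simulated stand-in for the real data: 20 cases, 200 predictors.
# The seed and the 0.0005 scale are assumptions, not the real values.
set.seed(1)

n_cases <- 20
n_vars  <- 200

# 20 x 200 matrix of small values, with columns named x1 ... x200
x <- matrix(rnorm(n_cases * n_vars, mean = 0.0005, sd = 0.0005),
            nrow = n_cases, ncol = n_vars,
            dimnames = list(NULL, paste0("x", seq_len(n_vars))))

# Binary outcome: ten 0s followed by ten 1s, as in the small example
data <- data.frame(y = rep(c(0, 1), each = n_cases / 2), x)

dim(data)  # rows and columns: y plus x1 ... x200
```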
All of my variable values are similar to this: very small, some even averaging around 0.0005. I need a classification tree, and my y is binary, so I want method = "class":
cart <- rpart(formula = y ~ ., data = data, method = "class")
When I type
print(cart)
I get:
n= 20
node), split, n, deviance, yval
* denotes terminal node
1) root 20 5.958333 0.4583333
2) x50< 0.0005126315 16 2.437500 0.1875000 *
3) x50>=0.0005126315 8 0.000000 1.0000000 *
I'm not sure why it only splits on x50. I tried plotting the tree to see what was going on, and when I ran
plot(cart)
I got the following plot: https://i.stack.imgur.com/5ZOFN.png
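In case it helps with diagnosing this, here is a minimal sketch of checks that can be run on the fitted object. It uses the three-variable toy data from the start of the question rather than my real data, and the seed is arbitrary. It relies on the `method` and `control` components that rpart stores on the fit, and it only inspects the settings that were used; it does not change the model.

```r
library(rpart)

# Toy data in the same shape as the small example (seed is arbitrary)
set.seed(1)
data <- data.frame(y  = rep(c(0, 1), each = 10),
                   x1 = rnorm(20),
                   x2 = rnorm(20) + 0.5,
                   x3 = rnorm(20) - 0.2)

cart <- rpart(y ~ ., data = data, method = "class")

# Which fitting method did rpart record on the object?
cart$method

# Stopping rules in effect for this fit
cart$control$minsplit   # smallest node size for which a split is attempted
cart$control$minbucket  # smallest allowed terminal node
cart$control$cp         # complexity threshold a split must clear

# How the response is stored in the data frame
class(data$y)
```

Running the same checks on the real fit should show whether the method and stopping rules are the ones I intended.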
Any idea what's going on, or how I can fix this? Much appreciated.