I am new to coding and I am trying to build a data tree, but I keep encountering the same error:
Error in model.frame.default(formula = df ~ df$Open.Closed + df$Region, : invalid type (list) for variable 'df'
I have looked throughout the site and haven't been able to find a working solution to my problem. I have tried multiple fixes, but I usually end up with another error that says the data is a matrix, which rpart won't accept. Any help would be much appreciated.
This is my code:
library(rpart.plot)
library(ggExtra)
library(gridExtra)
library(RGtk2)
library(rpart)
library(rattle)
df[] <- data.frame(lapply(Test_Bank_Model,factor))
df [col_names] <- lapply(df[col_names], factor)
str(df)
summary(df)
print(df)
tree <- rpart(df ~ df$Open.Closed + df$Region, data = df, method = "class",
model = TRUE, control = rpart.control("minsplit" = 1))
rpart.plot(tree, roundint = FALSE, box.palette = "white")
Data:
       Region Closing.Date Annual.Average.FedFunds Open.Closed
1       South         2020               0.2328571      Closed
2    Mid West         2020               0.2328571      Closed
3  North East         2020               0.2328571        Open
4       South         2020               0.2328571        Open
5  North East         2020               0.2328571        Open
6        West         2020               0.2328571        Open
7  North East         2020               0.2328571        Open
8  North East         2019               1.7366667      Closed
9       South         2019               1.7366667      Closed
10   Mid West         2019               1.7366667      Closed
From the error message I take that you are using a list object while you need a data frame.
lapply returns its results as a list. I assume that is where the format changes unnoticed.
I made a data frame called 'Test_Bank_Model', took the column names, and excluded 'Annual.Average.FedFunds' from the conversion to factor (I'm not sure what you want to do with the years).
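A minimal sketch of that step, assuming the data looks like the ten rows posted in the question. The data frame below is a hand-typed reconstruction for illustration only, and col_names is taken here to mean "every column except Annual.Average.FedFunds", which is my reading of the description above rather than anything the question states:

```r
# Hypothetical reconstruction of the ten rows shown in the question
Test_Bank_Model <- data.frame(
  Region = c("South", "Mid West", "North East", "South", "North East",
             "West", "North East", "North East", "South", "Mid West"),
  Closing.Date = c(rep(2020, 7), rep(2019, 3)),
  Annual.Average.FedFunds = c(rep(0.2328571, 7), rep(1.7366667, 3)),
  Open.Closed = c("Closed", "Closed", "Open", "Open", "Open",
                  "Open", "Open", "Closed", "Closed", "Closed"),
  stringsAsFactors = FALSE
)

# Columns to convert: everything except the numeric rate (an assumption)
col_names <- setdiff(names(Test_Bank_Model), "Annual.Average.FedFunds")

# Start from a copy so df exists as a data frame before assigning into it
df <- Test_Bank_Model

# lapply() returns a list, but assigning it into df[col_names]
# replaces the columns in place and df stays a data frame
df[col_names] <- lapply(df[col_names], factor)

str(df)
```

Assigning into df[col_names] rather than overwriting df keeps the object a data frame, so there is no need to wrap the lapply result in data.frame() again.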
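In rpart you can specify the data frame via the data argument, as you did. When you do, you can refer to the columns by their bare names in the formula and save yourself retyping the data frame's name (but I'm not aware that this is problematic; it should work too). Also note that the left-hand side of your formula is df itself; a data frame is a list, so it cannot serve as the response, and a single column should go there instead.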
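As a rough sketch of what such a call could look like: I am assuming here that Open.Closed is the outcome you want to predict and Region is the predictor, which the question does not state outright. The small data frame is again a hand-typed stand-in for the posted rows, and the minsplit and minbucket values are only chosen so that a ten-row sample is allowed to split at all:

```r
library(rpart)

# Hypothetical stand-in for the ten rows posted in the question
df <- data.frame(
  Region = factor(c("South", "Mid West", "North East", "South", "North East",
                    "West", "North East", "North East", "South", "Mid West")),
  Open.Closed = factor(c("Closed", "Closed", "Open", "Open", "Open",
                         "Open", "Open", "Closed", "Closed", "Closed"))
)

# Response column on the left of ~, predictors on the right.
# Bare column names are looked up in `data`, so no df$ prefix is needed.
tree <- rpart(Open.Closed ~ Region,
              data = df,
              method = "class",
              control = rpart.control(minsplit = 2, minbucket = 1))

print(tree)
```

The resulting object can then be passed to rpart.plot() in the same way as in the question.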