Why do I get "Error in check.data(data, allow.levels = TRUE)" when using predict in bnlearn


The training data:

Using R, I then need to make two predictions (Buy Computer: Yes/No) based on these features, that is, to say whether the answer would be Yes or No for each of the two cases. I tried the code below and got this error:

Error in check.data(data, allow.levels = TRUE) : the data are missing.

> library(bnlearn)
> 
> data_computer <- data.frame(predictions.table)
> data_computer$Income <- as.factor(data_computer$Income)
> data_computer$Student <- as.factor(data_computer$Student)
> data_computer$Credit.Rating <- as.factor(data_computer$Credit.Rating)
> data_computer$Buy.Computer <- as.factor(data_computer$Buy.Computer)
> 
> network_structure <- empty.graph(nodes = c("Income","Student","Credit.Rating","Buy.Computer"))
> 
> network_structure <- set.arc(network_structure,"Income","Buy.Computer")
> network_structure <- set.arc(network_structure,"Student","Buy.Computer")
> network_structure <- set.arc(network_structure,"Credit.Rating","Buy.Computer")
> 
> learned.network <- bn.fit(network_structure, data_computer)
> 
> data_computer_test <- data.frame(
+     Income = c("High", "Low"),
+     Student = c("FALSE", "FALSE"),
+     Credit.Rating = c("Fair", "Excellent")
+ )
> 
> data_computer_test$Income <- as.factor(data_computer_test$Income)
> data_computer_test$Student <- as.factor(data_computer_test$Student)
> data_computer_test$Credit.Rating <- as.factor(data_computer_test$Credit.Rating)
> 
> new_predictions <- predict(learned.network, newdata=data_computer_test, node="Buy.Computer", method="bayes-lw")

Error in check.data(data, allow.levels = TRUE) : the data are missing.

Why do I get this error?


There is 1 answer

PGSA On

From the bnlearn documentation for predict():

Usage
## S3 method for class 'bn.fit'
predict(object, node, data, cluster, method = "parents", ...,
  prob = FALSE, debug = FALSE)

The minimum required arguments are object, node, and data (cluster is optional, and method, prob, and debug have default values).

Your code:

new_predictions <- predict(learned.network, newdata=data_computer_test, node="Buy.Computer", method="bayes-lw")

R correctly assumes that the unnamed first argument is object. All the other arguments are named, so they are matched to the formal arguments with the same names. predict() for bn.fit objects has no formal argument called newdata, so that value is absorbed by ... and data is never supplied, hence the error message "the data are missing".
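The matching behaviour can be reproduced without bnlearn. The sketch below uses a hypothetical stand-in function, f, that only mimics the shape of the signature quoted above (object, node, data, then ...); it is not bnlearn's implementation. It shows that a value passed as newdata ends up in ... while data stays missing.

```r
# Hypothetical stand-in with the same argument layout as the documented
# predict() method; it only reports how its arguments were matched.
f <- function(object, node, data, ..., method = "parents") {
  list(
    data_missing = missing(data),    # TRUE if nothing was matched to `data`
    extras       = names(list(...))  # names of arguments swallowed by `...`
  )
}

# Same call pattern as in the question: `newdata` is not a formal argument.
res <- f("fitted network", newdata = 1:3, node = "Buy.Computer", method = "bayes-lw")

res$data_missing  # TRUE
res$extras        # "newdata"
```

Renaming newdata to data in the call makes data_missing FALSE and leaves extras empty, which is the same fix that applies to the predict() call.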

Try this:

new_predictions <- predict(
    object = learned.network, 
    data = data_computer_test, 
    node = "Buy.Computer", 
    method = "bayes-lw")
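One further thing that may be worth checking once the call runs: as far as I know, bnlearn expects the factor levels in the new data to match those in the fitted network. The training data is not shown, so the level sets below are an assumption. The sketch uses base R only and shows that as.factor() on a column containing a single value yields a one-level factor, whereas factor() with explicit levels keeps the full set.

```r
# Hypothetical level set; in practice take it from the training column,
# e.g. levels(data_computer$Student).
student_levels <- c("FALSE", "TRUE")

# as.factor() only sees the values present, so this factor has one level.
one_level <- as.factor(c("FALSE", "FALSE"))

# factor() with explicit levels keeps both levels even if only one occurs.
two_levels <- factor(c("FALSE", "FALSE"), levels = student_levels)

nlevels(one_level)   # 1
nlevels(two_levels)  # 2
```

If predict() then complains about mismatched levels, building each test column with factor(..., levels = levels(data_computer$...)) is one way to align them.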