Prediction with Caret package in R


I am a beginner in utilizing the Caret package in R. I have successfully employed the "nnet" package to develop a model. However, I am now interested in using Caret for certain purposes. Specifically, I have a matrix input of dimensions [Z x N] (with samples in rows), and the target (or output) is of dimensions [Z x M].

In the "nnet" package, I was able to train the model without employing a formula, using the following code.

nnet(x = train_inputs, y = train_targets, ...)

But with Caret I get this error: Error: Please make sure that the outcome column is a factor or numeric. The class(es) of the column: 'data.frame'

So I build a formula with the following code:

names1 <- colnames(target_data)  # target column names
names2 <- colnames(input_data)   # input column names

formula_net <- as.formula(paste(paste(names1, collapse = " + "), "~",
                                paste(names2, collapse = " + ")))
> formula_net
Target1 + Target2 + Target3 + Target4 + Target5 ~ Input1 + Input2 + Input3

And here is my Caret code. train_data is 80% of cbind(target_data, input_data).

net <- train(formula_net, data = train_data,
             method = "nnet",
             preProcess = NULL,
             verbose = FALSE,
             MaxNWts = 1000,
             linout = TRUE,
             maxit = 1000)

It seems to work, and here is the summary of net:

> net
Neural Network 

16000 samples
   7 predictor

No pre-processing
Resampling: Bootstrapped (25 reps) 
Summary of sample sizes: 16000, 16000, 16000, 16000, 16000, 16000, ... 
Resampling results:

  RMSE         Rsquared   MAE         
  0.000494634  0.9998713  0.0004172757

Tuning parameter 'size' was held constant at a value of 2
Tuning parameter 'decay' was held constant at a value of 0.005

I don't understand why it says "7 predictor". I have 3 inputs and 5 outputs. Is there an error in my formula?
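One way to check what a formula like this actually asks for is to look at the response that base R builds from it. The sketch below uses a tiny made-up data frame (the values are hypothetical, only the column naming follows the question) and shows that `+` on the left-hand side is evaluated as ordinary arithmetic, so the response is a single numeric vector holding the row-wise sum of the targets rather than several separate outputs. I assume, but have not confirmed, that caret's formula interface builds its outcome the same way.

```r
# Toy data with hypothetical values; column names mirror the question.
d <- data.frame(
  Target1 = c(0.1, 0.2, 0.3),
  Target2 = c(0.5, 0.6, 0.7),
  Input1  = c(1, 2, 3)
)

# Build the model frame exactly as a formula-based fitting function would.
mf <- model.frame(Target1 + Target2 ~ Input1, data = d)
y  <- model.response(mf)

# The response is one numeric vector, not a two-column matrix.
print(y)
print(is.null(dim(y)))

# It equals the element-wise sum of the two target columns.
print(all.equal(unname(y), d$Target1 + d$Target2))
```

If the same thing happens inside train(), the network would be fitted to one summed outcome, which would be consistent with predictions that come back as a plain numeric vector and fall outside the 0 to 1 range of the individual targets.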

Now, my problem arises when I try to predict on test_data. Even though net appears to be trained well, the predictions are incorrect. My data lie between 0 and 1, but the predicted values fall outside that range. Here is the predict code:

prediction_target <- predict(net, test_data, type = "raw")

test_data also includes both the input and the target columns. Is that correct? With the nnet package, I used test_inputs = test_data[, -target_cols] to exclude the target columns from test_data and then predicted from the inputs only. But since I'm using a formula here, I think passing test_data is correct (?). Additionally, the prediction_target values come back as a plain numeric vector, unlike the matrix of targets returned by the nnet package.
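Whether the extra target columns in the new data matter can be checked on a small formula-based model. The sketch below uses lm() from base R as a stand-in (it is not the caret model from the question, and the data are simulated) and compares predictions from new data with and without the target column. For a formula-fitted lm, only the predictor columns named on the right-hand side are used; I assume predict() on a caret train object fitted with a formula behaves similarly, but that is an assumption worth verifying on the real model.

```r
set.seed(1)

# Simulated data; names follow the question, values are made up.
d <- data.frame(Input1 = runif(20), Input2 = runif(20))
d$Target1 <- 0.3 * d$Input1 + 0.5 * d$Input2 + rnorm(20, sd = 0.01)

# Formula-based fit, used here only as a stand-in for the caret model.
fit <- lm(Target1 ~ Input1 + Input2, data = d)

# New data with and without the target column.
new_with    <- d[1:5, ]
new_without <- d[1:5, c("Input1", "Input2")]

p_with    <- predict(fit, newdata = new_with)
p_without <- predict(fit, newdata = new_without)

# Both calls give the same predictions: the target column is ignored.
print(all.equal(p_with, p_without))
```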

What am I doing wrong here?
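Since the error message above says the outcome must be a factor or numeric, one possible workaround is to train a separate caret model for each target column and bind the predictions back into a matrix. The sketch below is only an illustration under stated assumptions: the data are simulated stand-ins for input_data and target_data, the 3-fold cross-validation and the fixed size/decay grid are arbitrary choices to keep it fast, and fitting independent networks is not equivalent to the single multi-output network that nnet(x, y) trains.

```r
library(caret)  # the nnet package must also be installed

set.seed(42)
n <- 60

# Hypothetical stand-ins for input_data and target_data.
input_data <- data.frame(Input1 = runif(n), Input2 = runif(n), Input3 = runif(n))
target_data <- data.frame(
  Target1 = (input_data$Input1 + input_data$Input2) / 2,
  Target2 = (input_data$Input2 + input_data$Input3) / 2
)

# 80/20 split, inputs and targets kept separate.
train_idx     <- sample(n, size = 0.8 * n)
train_inputs  <- input_data[train_idx, ]
test_inputs   <- input_data[-train_idx, ]
train_targets <- target_data[train_idx, ]

# Arbitrary resampling and tuning choices for the sketch.
ctrl <- trainControl(method = "cv", number = 3)
grid <- data.frame(size = 2, decay = 0.005)

# One model per target column, each with a single numeric outcome.
fits <- lapply(colnames(train_targets), function(nm) {
  train(x = train_inputs, y = train_targets[[nm]],
        method = "nnet", trControl = ctrl, tuneGrid = grid,
        linout = TRUE, trace = FALSE, maxit = 200)
})
names(fits) <- colnames(train_targets)

# Predict from inputs only and collect one column per target.
prediction_target <- sapply(fits, predict, newdata = test_inputs)
print(dim(prediction_target))
```

The result has one row per test sample and one column per target, which matches the shape of the target matrix used with nnet.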
