PyCaret setup() train_size=1.0 parameter


Using PyCaret's setup() for training only, so that no test set is held out when searching for the best binary classification model.

Hi there,

Does anyone know whether you can set train_size=1.0 so that PyCaret trains on the entire input dataset, and then use best to predict on unseen data? At the moment I get errors when I set train_size=1.0.

Thanks a lot for any input on this. Cheers


There is 1 answer

Tatchai S.

You can't set train_size=1.0. If you want to train on the entire dataset, there are two ways to do it:

  1. If you already have a testing dataset and want to train a model on the entire training dataset, supply the testing dataset through the test_data parameter. When test_data is specified, train_size is ignored.
from pycaret.classification import setup
setup(data=training_dataset, test_data=testing_dataset, target='Purchase')
  2. If you don't have a testing dataset, you need to split your dataset into training and testing sets by specifying train_size, and then compare, analyze, and evaluate the models as usual. Once you have the best model, use finalize_model, which trains a given estimator on the entire dataset, including the holdout set.
from pycaret.classification import setup, compare_models, finalize_model

# Hold out 30% of the rows for evaluation during model selection
setup(data=dataset, train_size=0.7, target='Purchase')

# Train and rank the candidate models; keep the top one
best = compare_models()

# Finalize model: refit the best estimator on the entire dataset
final_best = finalize_model(best)
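
Since the question also asks about predicting on unseen data, here is a minimal sketch that puts both options together and ends with predict_model. The function name fit_and_predict, its default arguments, and the train_size guard are assumptions made for illustration and are not part of PyCaret; only setup, compare_models, finalize_model and predict_model come from pycaret.classification.

```python
def fit_and_predict(dataset, unseen, target="Purchase",
                    train_size=0.7, test_data=None):
    """Hypothetical helper: select a model, refit on all rows, score unseen data."""
    # Sketch-level guard (an assumption, not PyCaret's own check): without an
    # explicit test set, some rows must be held out for model selection.
    if test_data is None and not 0.0 < train_size < 1.0:
        raise ValueError("train_size must be strictly between 0 and 1")

    # Imported lazily so the guard above works even without PyCaret installed.
    from pycaret.classification import (
        setup, compare_models, finalize_model, predict_model,
    )

    if test_data is not None:
        # Option 1: an explicit test set is supplied.
        setup(data=dataset, test_data=test_data, target=target)
    else:
        # Option 2: let setup() split off a holdout set.
        setup(data=dataset, train_size=train_size, target=target)

    best = compare_models()            # rank candidate models, keep the top one
    final_best = finalize_model(best)  # refit on the entire dataset

    # Returns the unseen rows with prediction columns appended.
    return predict_model(final_best, data=unseen)
```

A call might look like predictions = fit_and_predict(dataset, unseen_data), where dataset and unseen_data are pandas DataFrames and unseen_data has the same feature columns as dataset.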

You can find more information about these methods and their parameters in the PyCaret documentation.