I'm encountering issues while attempting to run a regression task with hyperparameter tuning on Google Colab with the GPU enabled. I've outlined the steps I've taken and the problem I'm facing below.
Here are the steps I followed:
- Enabled GPU on Google Colab.
- Installed the `pycaret` library with the command `!pip install pycaret[full]` and restarted the session afterwards.
- Checked the installed version using `import pycaret` and `print(pycaret.__version__)`, which returned version 3.2.0.
- Used `pycaret.show_versions()` to display system and library information (included at the end).
- Checked the CUDA version with `!nvcc --version`, confirming version 11.8.
- Installed cuML using `!pip install --extra-index-url=https://pypi.nvidia.com cuml-cu11` and restarted the session afterwards.
- Checked the cuML version with `import cuml` and `print(cuml.__version__)`, which returned version 23.10.00 (these checks are consolidated in the cell below).
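For reference, the version checks above come from a cell like this, run after each runtime restart (the expected outputs for this environment are in the comments; the `!nvidia-smi` call is just an extra confirmation that the GPU is visible):

```python
# Sanity checks run after each runtime restart.
import pycaret
import cuml

print(pycaret.__version__)  # 3.2.0
print(cuml.__version__)     # 23.10.00

# Confirm the Colab GPU is visible to the runtime.
!nvidia-smi
```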
After loading the data with pandas, I attempted to set up a regression experiment using:

```python
from pycaret.regression import *

s = setup(data, target=target_column, session_id=109, preprocess=True, use_gpu=True,
          pca=True, normalize=True, polynomial_features=False, feature_selection=False,
          pca_components=0.95, normalize_method="zscore", fold=5, train_size=0.8)
best = compare_models(sort="MSE")
```
However, during the process, I encountered warnings and errors such as:

```
[LightGBM] [Warning] There are no meaningful features which satisfy the provided configuration. Decreasing Dataset parameters min_data_in_bin or min_data_in_leaf and re-constructing Dataset might resolve this warning.
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 0
[LightGBM] [Info] Number of data points in the train set: 2, number of used features: 0
[LightGBM] [Warning] Using sparse features with CUDA is currently not supported.
```

Subsequently, the session crashes and restarts automatically after the message:

```
[I] [14:06:56.249093] Unused keyword parameter: n_jobs during cuML estimator initialization
```
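The "Total Bins 0" and "number of used features: 0" lines make me suspect that the preprocessing pipeline (z-score normalization followed by PCA) is emptying the training data before the GPU estimators ever run. A quick check, assuming the PyCaret 3.x `get_config` keys, would be something like:

```python
from pycaret.regression import get_config

# Inspect the data that actually reaches the models after preprocessing.
X_train_t = get_config("X_train_transformed")
y_train_t = get_config("y_train_transformed")
print("transformed train shapes:", X_train_t.shape, y_train_t.shape)

# The fitted preprocessing pipeline applied by setup().
print(get_config("pipeline"))
```

If the transformed shapes match the LightGBM log (2 rows, 0 features), that would point at the `setup()` configuration rather than at cuML itself.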
The output of `pycaret.show_versions()` is:
```
System:
python: 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
executable: /usr/bin/python3
machine: Linux-5.15.120+-x86_64-with-glibc2.35
PyCaret required dependencies:
pip: 23.1.2
setuptools: 67.7.2
pycaret: 3.2.0
IPython: 7.34.0
ipywidgets: 7.7.1
tqdm: 4.66.1
numpy: 1.23.5
pandas: 1.5.3
jinja2: 3.1.2
scipy: 1.10.1
joblib: 1.3.2
sklearn: 1.2.2
pyod: 1.1.1
imblearn: 0.10.1
category_encoders: 2.6.3
lightgbm: 4.1.0
numba: 0.57.1
requests: 2.31.0
matplotlib: 3.6.0
scikitplot: 0.3.7
yellowbrick: 1.5
plotly: 5.15.0
plotly-resampler: Not installed
kaleido: 0.2.1
schemdraw: 0.15
statsmodels: 0.14.0
sktime: 0.21.1
tbats: 1.1.3
pmdarima: 2.0.4
psutil: 5.9.5
markupsafe: 2.1.3
pickle5: Not installed
cloudpickle: 2.2.1
deprecation: 2.1.0
xxhash: 3.4.1
wurlitzer: 3.0.3
PyCaret optional dependencies:
shap: 0.43.0
interpret: 0.4.4
umap: 0.5.4
ydata_profiling: 4.6.0
explainerdashboard: 0.4.3
autoviz: Not installed
fairlearn: 0.7.0
deepchecks: Not installed
xgboost: 2.0.1
catboost: 1.2.2
kmodes: 0.12.2
mlxtend: 0.22.0
statsforecast: 1.5.0
tune_sklearn: 0.5.0
ray: 2.8.0
hyperopt: 0.2.7
optuna: 3.4.0
skopt: 0.9.0
mlflow: 1.30.1
gradio: 3.50.2
fastapi: 0.104.1
uvicorn: 0.24.0.post1
m2cgen: 0.10.0
evidently: 0.2.8
fugue: 0.8.6
streamlit: Not installed
prophet: 1.1.5
```
I have followed various instructions from GitHub and Stack Overflow but haven't been able to resolve this issue. I need to run this code on Colab due to the resource-intensive nature of hyperparameter tuning.
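If it helps with debugging, this is the kind of minimal cuML-only check (independent of PyCaret, on synthetic data) I can run in the same runtime to rule out a broken GPU environment:

```python
import numpy as np
from cuml.linear_model import LinearRegression

# Tiny synthetic regression problem, just to exercise the cuML GPU path.
X = np.random.rand(200, 5).astype(np.float32)
w = np.arange(1, 6, dtype=np.float32)
y = X @ w  # exact linear target, no noise

model = LinearRegression()
model.fit(X, y)
print(model.coef_)  # should approximately recover w = [1, 2, 3, 4, 5]
```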
What might be causing this problem and how can I solve it?