I am training a model using a VotingClassifier in sklearn. The dataset is large, approximately 1 million tabular rows.
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

log_clf = LogisticRegression(max_iter=10000)
rnd_clf = RandomForestClassifier(n_estimators=50)
svm_clf = SVC()

# hard voting: each estimator casts one vote, the majority label wins
voting_clf = VotingClassifier(
    estimators=[('lr', log_clf), ('rf', rnd_clf), ('svc', svm_clf)],
    voting='hard'
)
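Continuing from the snippet above, the ensemble is fitted like any other sklearn estimator; the make_classification data here is just a small synthetic stand-in for the real 1M-row table:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

voting_clf.fit(X_train, y_train)         # fits all three estimators, CPU only
print(voting_clf.score(X_test, y_test))  # accuracy of the majority vote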
I did some research and found that cuML provides GPU-accelerated, sklearn-compatible estimators. But cuML doesn't support VotingClassifier yet. Is there any other way to train an sklearn model on the GPU at the moment?
from cuml.ensemble import RandomForestClassifier as cuRFC
from cuml.linear_model import LogisticRegression
from cuml.svm import LinearSVC
from sklearn.ensemble import VotingClassifier

voting_clf = VotingClassifier(
    estimators=[('lr', LogisticRegression()), ('rf', cuRFC()), ('svc', LinearSVC())],
    voting='hard')
I tried to use the VotingClassifier with the cuML models, but it doesn't work.
By default it does not use the GPU, especially if it is running inside Docker, unless you use nvidia-docker and an image with built-in GPU support. Scikit-learn is not intended to be used as a deep-learning framework, and it does not provide any GPU support.
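That said, if hard voting is all you need, you don't strictly need sklearn's VotingClassifier: you can fit each cuML estimator separately on the GPU and combine their predictions with a manual majority vote. A minimal sketch, assuming a working cuML install and using a small synthetic dataset as a stand-in for the real one:

import numpy as np
from scipy.stats import mode
from sklearn.datasets import make_classification
from cuml.ensemble import RandomForestClassifier as cuRFC
from cuml.linear_model import LogisticRegression
from cuml.svm import LinearSVC

# synthetic stand-in for the real table; cuML expects float32 features
X, y = make_classification(n_samples=100_000, n_features=20, random_state=0)
X, y = X.astype(np.float32), y.astype(np.int32)

models = [LogisticRegression(), cuRFC(n_estimators=50), LinearSVC()]
for m in models:
    m.fit(X, y)  # each estimator trains on the GPU

# hard vote: stack the per-model predictions and take the most
# frequent label for each sample
preds = np.stack([np.asarray(m.predict(X)) for m in models])
y_pred = mode(preds, axis=0).mode.ravel()

With equal weights this matches what voting='hard' would compute, while all three fits run on the GPU; soft voting would instead require averaging each model's predict_proba output.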