sklearn.ensemble.AdaBoostClassifier cannot accept SVM as base_estimator?


I am doing a text classification task. Now I want to use ensemble.AdaBoostClassifier with LinearSVC as base_estimator. However, when I try to run the code

clf = AdaBoostClassifier(svm.LinearSVC(), n_estimators=50, learning_rate=1.0, algorithm='SAMME.R')
clf.fit(X, y)

an error occurred:

TypeError: AdaBoostClassifier with algorithm='SAMME.R' requires that the weak learner supports the calculation of class probabilities with a predict_proba method

The first question is: can svm.LinearSVC() calculate class probabilities at all, and if so, how do I make it do that?
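(For later readers: LinearSVC only exposes decision_function, not predict_proba. A hedged sketch, on synthetic data and not from the original post, of one way to get probabilities out of it is to wrap it in sklearn's CalibratedClassifierCV, which learns a Platt-style sigmoid mapping on held-out folds:)

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, random_state=0)

# LinearSVC has no predict_proba method of its own.
assert not hasattr(LinearSVC(), "predict_proba")

# CalibratedClassifierCV fits the SVM on cross-validation folds and learns
# a sigmoid (Platt-style) mapping from decision_function scores to
# probabilities on the held-out folds.
clf = CalibratedClassifierCV(LinearSVC(max_iter=10000), cv=3)
clf.fit(X, y)
proba = clf.predict_proba(X)  # shape (200, 2), each row sums to 1
```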

Then I changed the algorithm parameter and ran the code again.

clf = AdaBoostClassifier(svm.LinearSVC(), n_estimators=50, learning_rate=1.0, algorithm='SAMME')
clf.fit(X, y)

This time I get TypeError: fit() got an unexpected keyword argument 'sample_weight'. The AdaBoostClassifier documentation says of sample_weight: "Sample weights. If None, the sample weights are initialized to 1 / n_samples." Even when I assign an integer to n_samples, the error still occurs.

The second question is: what does n_samples mean, and how can I solve this problem?

I hope someone can help me.

Following @jme's comment, I then tried

clf = AdaBoostClassifier(svm.SVC(kernel='linear', probability=True), n_estimators=10, learning_rate=1.0, algorithm='SAMME.R')
clf.fit(X, y)

but the program never produces a result, and the memory usage on the server stays unchanged.

The third question is: how can I make AdaBoostClassifier work with SVC as the base_estimator?


There are 4 answers

kevin (BEST ANSWER)

The right answer will depend on exactly what you're looking for. LinearSVC cannot predict class probabilities (required by the default algorithm used by AdaBoostClassifier) and does not support sample_weight.

You should be aware that the Support Vector Machine does not nominally predict class probabilities. They are computed using Platt scaling (or an extension of Platt scaling in the multi-class case), a technique which has known issues. If you need less "artificial" class probabilities, an SVM might not be the way to go.

With that said, I believe the most satisfying answer to your question is the one graham gave. That is,

from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier

clf = AdaBoostClassifier(SVC(probability=True, kernel='linear'), ...)

You have other options. You can use SGDClassifier with a hinge loss function and set AdaBoostClassifier to use the SAMME algorithm (which does not require a predict_proba function, but does require support for sample_weight):

from sklearn.linear_model import SGDClassifier

clf = AdaBoostClassifier(SGDClassifier(loss='hinge'), algorithm='SAMME', ...)

Perhaps the best answer would be to use a classifier that has native support for class probabilities, like Logistic Regression, if you want to use the default algorithm provided for AdaBoostClassifier. You can do this using sklearn.linear_model.LogisticRegression or using SGDClassifier with a log loss function, as in the code provided by Kris.
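As a concrete illustration (a minimal sketch on synthetic data, not from the original answer): logistic regression supports both predict_proba and sample_weight, so it plugs into AdaBoostClassifier without any workaround:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)

# LogisticRegression natively provides predict_proba and accepts
# sample_weight in fit(), so it satisfies AdaBoost's requirements
# under either boosting algorithm.
clf = AdaBoostClassifier(LogisticRegression(max_iter=1000), n_estimators=10)
clf.fit(X, y)
score = clf.score(X, y)  # training accuracy
```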

Hope that helps. If you're curious about what Platt scaling is, check out the original paper by John Platt.

ReneWang

Actually, LinearSVC can be used with AdaBoostClassifier without rescaling the SVM output through Platt scaling; that is what the AdaBoost.M1 algorithm [1] was originally designed for: a classifier that outputs {-1, 1}. The default choice in AdaBoostClassifier is the SAMME.R variant of the AdaBoost.SAMME algorithm [2] (selected by passing "SAMME.R" in the algorithm keyword argument), which is designed for multi-class classification and requires class probabilities.

However, a LinearSVC-based AdaBoost will not be able to provide predict_proba. If, on the other hand, you want to keep the sign of the output rather than fit the SVM output to a sigmoid curve to produce probabilities, then changing the algorithm from SAMME.R to SAMME is the easiest way to do it.

[1] Y. Freund, R. Schapire, "A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting", 1995.
[2] J. Zhu, H. Zou, S. Rosset, T. Hastie, "Multi-class AdaBoost", 2009.
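(A note for later readers, not part of the original answer: the sample_weight error in the question comes from an older scikit-learn. LinearSVC.fit has since gained sample_weight support, so the SAMME route now works directly; a sketch on synthetic data:)

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, random_state=0)

# SAMME only needs hard class predictions plus sample_weight support in
# fit(), both of which recent versions of LinearSVC provide.
clf = AdaBoostClassifier(LinearSVC(max_iter=10000), n_estimators=10,
                         algorithm="SAMME")
clf.fit(X, y)
pred = clf.predict(X)
```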

graham

You need to use a learner that has a predict_proba method. Since this isn't available in LinearSVC, try SVC with the kernel set to 'linear':

clf = AdaBoostClassifier(svm.SVC(probability=True, kernel='linear'), n_estimators=50, learning_rate=1.0, algorithm='SAMME')
clf.fit(X, y)

While I'm not sure if this will yield identical results to LinearSVC, the documentation says:

Similar to SVC with parameter kernel=’linear’, but implemented in terms of liblinear rather than libsvm, so it has more flexibility in the choice of penalties and loss functions and should scale better (to large numbers of samples).

It also mentions that the two differ in their multi-class strategies (one-vs-rest versus one-vs-one).

Kris

I just had a similar issue trying to use AdaBoostClassifier with LogisticRegression. The docs mention that the weak classifier (or base_estimator) must have a fit method that takes the optional sample_weight=... keyword argument, cf. question #18306416.

If you do want to use an SVM or logistic regression with AdaBoost, you can use sklearn's stochastic gradient descent classifier with loss='hinge' (SVM) or loss='log' (logistic regression), e.g.

from sklearn.linear_model import SGDClassifier
from sklearn.ensemble import AdaBoostClassifier

clf = AdaBoostClassifier(SGDClassifier(loss='log'), ...)

YMMV