I am using the SMOTE algorithm from the Python imbalanced-learn package:
from imblearn.over_sampling import SMOTE
sm = SMOTE(kind='regular', n_neighbors = 4)
:
X_train_resampled, y_train_resampled = sm.fit_sample(X_train, y_train)
I have explicitly set n_neighbors = 4. However, I got the following error from the above code:
ValueError Traceback (most recent call last)
<ipython-input-2-9e9116d71706> in <module>()
33
34 #try:
---> 35 X_train_resampled, y_train_resampled = sm.fit_sample(X_train, y_train)
36 #except:
37 #continue
/usr/local/lib/python3.4/dist-packages/imblearn/base.py in fit_sample(self, X, y)
176 """
177
--> 178 return self.fit(X, y).sample(X, y)
179
180 def _validate_ratio(self):
/usr/local/lib/python3.4/dist-packages/imblearn/base.py in sample(self, X, y)
153 self._validate_ratio()
154
--> 155 return self._sample(X, y)
156
157 def fit_sample(self, X, y):
/usr/local/lib/python3.4/dist-packages/imblearn/over_sampling/smote.py in _sample(self, X, y)
287 nns = self.nearest_neighbour.kneighbors(
288 X_min,
--> 289 return_distance=False)[:, 1:]
290
291 self.logger.debug('Create synthetic samples ...')
/usr/local/lib/python3.4/dist-packages/sklearn/neighbors/base.py in kneighbors(self, X, n_neighbors, return_distance)
341 "Expected n_neighbors <= n_samples, "
342 " but n_samples = %d, n_neighbors = %d" %
--> 343 (train_size, n_neighbors)
344 )
345 n_samples, _ = X.shape
ValueError: Expected n_neighbors <= n_samples, but n_samples = 5, n_neighbors = 6
Any idea why my setting of n_neighbors = 4 doesn't work?
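The correct parameter is k_neighbors, not n_neighbors.
You are passing n_neighbors, with an n, but SMOTE expects k_neighbors, with a k!
The message reports n_neighbors = 6 because k_neighbors defaults to 5, and SMOTE asks for one extra neighbor since each sample comes back as its own nearest neighbor (hence the [:, 1:] slice in the traceback). Your minority class has only 5 samples, so the default cannot work, while k_neighbors = 4 can.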
Read the docs here.
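As a rough sketch of the arithmetic behind the error, the helper below picks a k_neighbors value that fits the smallest class. The function names max_k_neighbors and pick_k_neighbors are made up for illustration and are not part of imbalanced-learn; the sketch also assumes SMOTE needs k_neighbors + 1 <= n_samples in the class being oversampled, which is what the traceback suggests.

```python
from collections import Counter


def max_k_neighbors(y):
    """Largest k_neighbors that fits the smallest class in y.

    Assumption: SMOTE queries k_neighbors + 1 neighbors per sample
    (the sample itself is discarded), so k_neighbors <= n_samples - 1.
    """
    smallest_class = min(Counter(y).values())
    return smallest_class - 1


def pick_k_neighbors(y, requested=5):
    """Clamp the requested k_neighbors to what the smallest class allows."""
    limit = max_k_neighbors(y)
    if limit < 1:
        raise ValueError("smallest class has fewer than 2 samples")
    return min(requested, limit)


# Hypothetical usage with the code from the question
# (newer imbalanced-learn releases use fit_resample instead of fit_sample):
#
#   from imblearn.over_sampling import SMOTE
#   sm = SMOTE(k_neighbors=pick_k_neighbors(y_train, requested=4))
#   X_train_resampled, y_train_resampled = sm.fit_sample(X_train, y_train)
```

With 5 minority samples, as in your traceback, this gives 4 as the upper limit, so the default of 5 fails while your intended value of 4 is fine once it is passed as k_neighbors.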