Trying to tune the hyperparameters of SVM using PSO but getting model accuracy very low


I built a classification model using SVM and am now trying to tune its parameters using PSO and PSO with passive congregation. However, the accuracy of the model is far too low. The code and results are below.

import sys

import numpy as np
from sklearn import svm
from sklearn.metrics import mean_squared_error, accuracy_score, hamming_loss, f1_score, roc_auc_score
from sklearn.preprocessing import LabelEncoder


# x_train, y_train, x_test and y_test are defined elsewhere (not shown)
def pso(n_particles, iterations, dimensions, inertia):

    # Range of SVR's hyperparameters (Particles' search space)
    # C, Epsilon and Gamma
    max_c = 1e4
    min_c = 1e-3
    max_e = 1e-1
    min_e = 1e-8
    max_g = 1e3
    min_g = 1e-3

    # Initializing particles' positions randomly, inside
    # the search space
    x = np.random.rand(n_particles, 1)*(max_c - min_c) + min_c
    #y = np.random.rand(n_particles, 1)*(max_e - min_e) + min_e
    z = np.random.rand(n_particles, 1)*(max_g - min_g) + min_g
    c = np.concatenate((x, z), axis=1)

    # Initializing particles' parameters
    v = np.zeros((n_particles, dimensions))
    c1 = 1
    c2 = 1
    c3 = 2
    p_best = np.zeros((n_particles, dimensions))
    p_best_val = np.zeros(n_particles) + sys.maxsize
    g_best = np.zeros(dimensions)
    g_best_val = sys.maxsize
    best_iter = np.zeros(iterations)
    R = np.random.rand(n_particles)

    # Initializing regression variables
    p_best_RGS = np.empty((n_particles), dtype=object)
    g_best_RGS = sys.maxsize

    le = LabelEncoder()
    for i in range(iterations):
        for j in range(n_particles):
            # Starting Regression
            clf = svm.SVC(kernel='rbf', C=c[j][0], gamma=c[j][1])
            #rgs1 = svm.SVC(C = c[j][0], gamma = c[j][2])

            # Fitting the curve
            clf.fit(x_train, y_train)
            y_predict = clf.predict(x_test)
            #print(y_predict)
            #y_pred= sc.fit_transform(y_predict)
            #print(y_pred)
            # Using Mean Squared Error to verify prediction accuracy
            #print(Y_test)
            mse = roc_auc_score(y_test, y_predict)

            # If mse value for that search point, for that particle,
            # is less than its personal best point,
            # replace personal best
            if mse < p_best_val[j]:
                # The value below represents the current least Mean Squared Error
                p_best_val[j] = mse
                p_best_RGS[j] = clf
                # The value below represents the current search coordinates for
                # the particle's current least Mean Squared Error found
                p_best[j] = c[j].copy()

            # Using auxiliar variable to get the index of the
            # particle that found the configuration with the
            # minimum MSE value
            aux = np.argmin(p_best_val)

            if p_best_val[aux] < g_best_val:
                # Assigning Particle's current best MSE to the Group's best
                g_best_val = p_best_val[aux]

                # Assigning Particle's current best configuration to the Group's best
                g_best = p_best[aux].copy()

                # Group best regressor:
                # the combination of C, Epsilon and Gamma
                # that computes the best fitting curve
                g_best_RGS = p_best_RGS[aux]

                rand1 = np.random.random()
                rand2 = np.random.random()
                rand3 = np.random.random()

            # The variable below influences directly the particle's velocity.
            # It can either make it smaller or bigger.
            w = inertia

            # The equation below represents Particle's velocity, which is
            # the rate of change in its position
            #v[j] = w*v[j] + c1*(p_best[j] - c[j])*rand1 + c2*(g_best - c[j])*rand2 + c3*(R[j] - c[j])*rand3
            v[j] = w*v[j] + c1*(p_best[j] - c[j])*rand1 + c2*(g_best - c[j])*rand2
            # Change in the Particle's position
            c[j] = c[j] + v[j]

            # Below is a series of conditions that stop the particles from
            # leaving the search space
            if c[j][1] < min_g:
                c[j][1] = min_g
            if c[j][1] > max_g:
                c[j][1] = max_g
            #if(c[j][1] < min_e):
                #c[j][1] = min_e
            #if(c[j][1] > max_e):
                #c[j][1] = max_e
            if c[j][0] < min_c:
                c[j][0] = min_c
            if c[j][0] > max_c:
                c[j][0] = max_c

        # The variable below represents the least Mean Squared Error
        # of the current iteration
        best_iter[i] = g_best_val

        print('Best value iteration # %d = %f\n' % (i, g_best_val))

Results: the accuracy is lower than that of the untuned model, and the gamma values are greater than 1.

Accuracy: [image]

Hyperparameters: [image]


There are 2 answers

maximdu

My first thought is to make sure your PSO algorithm works properly.

Try replacing your code for PSO with another implementation.

I recommend the PySwarms library.
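A quick way to check whether a PSO loop behaves is to run it on a function whose minimum is known before plugging in the SVM. The following is a minimal sketch, not the PySwarms API and not the asker's code: `pso_minimize` and `sphere` are names made up for illustration, and the inertia and acceleration values are just common defaults I assumed. It draws fresh random factors on every update and always minimizes a cost, which is worth comparing against the loop in the question, where the value returned by `roc_auc_score` is stored in `mse` and the smallest value is kept.

```python
import random


def pso_minimize(cost, bounds, n_particles=20, iterations=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal global-best PSO that minimizes `cost` inside the box `bounds`."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in pos]
    p_best_val = [cost(p) for p in pos]
    g_idx = min(range(n_particles), key=p_best_val.__getitem__)
    g_best, g_best_val = p_best[g_idx][:], p_best_val[g_idx]

    for _ in range(iterations):
        for j in range(n_particles):
            for d in range(dim):
                # Fresh random factors for every particle and dimension.
                r1, r2 = rng.random(), rng.random()
                vel[j][d] = (w * vel[j][d]
                             + c1 * r1 * (p_best[j][d] - pos[j][d])
                             + c2 * r2 * (g_best[d] - pos[j][d]))
                lo, hi = bounds[d]
                # Clamp the new position to the search space.
                pos[j][d] = min(max(pos[j][d] + vel[j][d], lo), hi)
            val = cost(pos[j])
            # Lower cost is better: update personal best, then global best.
            if val < p_best_val[j]:
                p_best[j], p_best_val[j] = pos[j][:], val
                if val < g_best_val:
                    g_best, g_best_val = pos[j][:], val
    return g_best, g_best_val


def sphere(x):
    """Test function with a known minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_pos, best_val = pso_minimize(sphere, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best_pos, best_val)
```

If the fitness is a score where higher is better (accuracy, ROC AUC), one option is to pass the negated score as the cost so the same minimizing loop applies.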


Also, from my experience, most of the time grid search or random search for hyperparameters works okay and is easier to implement. Try using grid or random search and compare the results with PSO.
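If you do try random search over the ranges from the question (gamma between 1e-3 and 1e3), it may help to draw candidates on a log scale. The sketch below is only an illustration with made-up helper names (`sample_linear`, `sample_log_uniform`); it compares how often each sampling scheme produces gamma above 1. Drawing uniformly in linear space, as `np.random.rand(...)*(max_g - min_g) + min_g` does, puts almost every starting point above 1, which could be related to the large gamma values you are seeing.

```python
import math
import random


def sample_linear(lo, hi, rng):
    """Uniform draw in linear space."""
    return rng.uniform(lo, hi)


def sample_log_uniform(lo, hi, rng):
    """Uniform draw in log10 space, mapped back to the original scale."""
    return 10 ** rng.uniform(math.log10(lo), math.log10(hi))


rng = random.Random(0)
min_g, max_g = 1e-3, 1e3  # gamma range taken from the question

linear = [sample_linear(min_g, max_g, rng) for _ in range(10_000)]
logged = [sample_log_uniform(min_g, max_g, rng) for _ in range(10_000)]

frac_linear_above_1 = sum(g > 1 for g in linear) / len(linear)
frac_log_above_1 = sum(g > 1 for g in logged) / len(logged)
print(frac_linear_above_1, frac_log_above_1)
```

With log-scale sampling roughly half of the candidates fall below 1, so small values of gamma (and of C) actually get explored.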

If there is no improvement, maybe you should focus on using a different algorithm (linear model, decision tree, gradient boosting, ...) and/or adding new features. From my experience, careful feature selection beats hyperparameter tuning.

Nikaido

The problem is probably due to the unbalanced dataset.

As you can see, class 1 has only 1 sample (support for class 1 = 1).

[image]

Now, I don't know whether these results come from your test set, but it seems to me that the first step should be to deal with the class imbalance itself (there are many methods for this; a search will turn them up).

Also, because your dataset is unbalanced, accuracy is not the right metric. You should use a more reliable one. As a start, you could check the precision and recall of each class separately to see how the model handles them.
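To show what checking precision and recall per class means, here is a small sketch in plain Python. The function name `per_class_precision_recall` and the labels are made up for illustration; in practice `sklearn.metrics.classification_report` gives you the same kind of breakdown.

```python
def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} computed separately for each class."""
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # Define the ratio as 0.0 when the denominator is empty.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        report[label] = (precision, recall)
    return report


# Made-up labels, only to show the shape of the output.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]

report = per_class_precision_recall(y_true, y_pred)
for label, (precision, recall) in report.items():
    print(label, precision, recall)
```

A class that the model handles badly shows up immediately as a low precision or recall for that label, even when the overall accuracy looks fine.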

To make this clearer: if your model always predicted class 0 for every instance, your accuracy would be almost 100% (you can build a dummy predictor that always outputs 0 to check this).

105/106 = 0.99
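The same arithmetic as a quick check, assuming 105 samples of class 0 and 1 sample of class 1 as the numbers above suggest. Balanced accuracy (the mean of the per-class recalls) is included as one example of a metric that does not reward the dummy predictor; choosing it here is my suggestion, not something from your output.

```python
# Assumed class counts: 105 samples of class 0 and 1 sample of class 1.
y_true = [0] * 105 + [1]
y_pred = [0] * len(y_true)  # dummy predictor: always output class 0

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(label):
    """Fraction of samples of `label` that the predictor got right."""
    hits = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    return hits / y_true.count(label)


balanced_accuracy = (recall(0) + recall(1)) / 2

print(round(accuracy, 4))   # 0.9906
print(balanced_accuracy)    # 0.5
```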