I'm doing hyperparameter tuning for a neural network. I've tried a lot of manual tuning and still get quite poor predictive power on my dataset, so I've opted to use grid search to test all possible parameter combinations for my model.
Is something like this possible (see code below), or is there a smarter/better approach to parameter tuning? The code runs; it takes some time, of course, but it does work.
I have no particular error; I'm just looking for insight into whether this is an appropriate approach.
Dataframe Example:
sequence target expression
-AQSVPWGISRVQAPAAH-NRGLRGSGVKVAVLDTGI-STHPDLNI... 0.00 50.0
-AQQVPYGVSQIKAPALH-EQGYTGQNVKVAVIDTGIDSSHPDLKV... 0.46 42.0
-AQSVPWGIRRVQAPAAH-NRGLTGSGVKVAVLDTGI-STHPDLNI... 0.34 46.0
-AQTVPWGISRVQAPAAH-NRGLTGAGVKVSVLDTGI-STHPDLNI... 0.95 45.0
-AQSVPYGVSQIKAPALH-SQGYTGSNVKVAVIDTGIDSSHPDLKV... 0.60 50.0
Data shape: 3000 rows and 3840 features.
Note that the feature count is high because all of these sequences are one-hot encoded.
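For reference, a minimal sketch of how aligned sequences like the ones above can be one-hot encoded into a flat feature row (the 21-character alphabet here, 20 amino acids plus the gap character '-', is an assumption; the real preprocessing may differ):

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # assumed: 20 amino acids plus the gap '-'
CHAR_TO_IDX = {c: i for i, c in enumerate(ALPHABET)}

def one_hot_encode(seq):
    """Encode one aligned sequence as a flat 0/1 feature vector."""
    mat = np.zeros((len(seq), len(ALPHABET)), dtype=np.float32)
    for pos, char in enumerate(seq):
        mat[pos, CHAR_TO_IDX[char]] = 1.0
    return mat.ravel()  # flatten (positions x alphabet) into one feature row

features = one_hot_encode("-AQSVP")
# 6 positions x 21 symbols -> 126 features, exactly one 1 per position
```

With a fixed alignment length, every row gets the same number of features, which is how a few hundred sequence positions turn into thousands of columns.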
Code:
# Hyperparameter tuning for neurons, batch size, epochs and learning rate
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import GridSearchCV

def build_regressor(n_neurons=1, learning_rate=0.01):
    regressor = Sequential()
    regressor.add(Dense(n_neurons, activation='relu', input_shape=(x_train.shape[1],)))
    #regressor.add(Dense(n_neurons, activation='relu'))
    regressor.add(Dense(units=1))
    optimizer = Adam(learning_rate=learning_rate)  # 'lr' is deprecated in newer Keras
    regressor.compile(optimizer=optimizer, loss='mean_squared_error', metrics=['mae', 'mse'])
    return regressor

# Create model
model = KerasRegressor(build_fn=build_regressor, verbose=0)
# define the grid search parameters
batch_size = [10, 25, 50, 100, 150]
epochs = [5, 10, 25, 50]
n_neurons = [1, 32, 64, 128, 256, 512]
learning_rate = [0.001, 0.01, 0.1, 0.2, 0.3]
param_grid = dict(batch_size=batch_size, epochs=epochs, n_neurons=n_neurons, learning_rate=learning_rate)
# implement grid search
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3, scoring='r2')
grid_result = grid.fit(x_train, y_train)
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
print("%f (%f) with: %r" % (mean, stdev, param))
GridSearchCV always finds the best combination within the grid you give it, but it is slow because it evaluates every combination; your grid has 5 × 4 × 6 × 5 = 600 combinations, each fitted 3 times with cv=3. RandomizedSearchCV instead samples only a fixed number of points from the parameter space (you control this with n_iter), so it saves a lot of time, but it does not always find the optimal combination. There are also smarter techniques, such as Bayesian optimization with hyperopt or automated pipeline search with TPOT, which generally perform better than RandomizedSearchCV and have a good chance of finding a (near-)optimal solution.
In short: for smaller datasets and search spaces GridSearchCV is fine, but for larger ones prefer RandomizedSearchCV, hyperopt, or TPOT.
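To illustrate the RandomizedSearchCV idea, here is a self-contained sketch using scikit-learn's MLPRegressor as a stand-in for the Keras model above (the synthetic data and parameter values are purely illustrative; with the Keras wrapper you would pass the same param_grid as param_distributions):

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPRegressor

# Small synthetic regression problem as a stand-in for the real data
X, y = make_regression(n_samples=200, n_features=20, noise=0.1, random_state=0)

param_distributions = {
    'hidden_layer_sizes': [(32,), (64,), (128,)],  # analogous to n_neurons
    'learning_rate_init': [0.001, 0.01, 0.1],
    'batch_size': [10, 25, 50],
}

search = RandomizedSearchCV(
    MLPRegressor(max_iter=200, random_state=0),
    param_distributions=param_distributions,
    n_iter=8,        # sample only 8 of the 27 combinations
    cv=3,
    scoring='r2',
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

The interface mirrors GridSearchCV (best_params_, best_score_, cv_results_), so swapping it in for your grid search is a one-line change plus the n_iter budget.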