I am using GPflow with a Gaussian likelihood for time-series prediction. The data comes from a stream: every time a new batch arrives (one point per batch in my case), I want to update/extend my model with the new data point without retraining from scratch. Is this possible?
Right now I am retraining from scratch every time. How can this be optimized? The running time of my model is of crucial importance, so any help is welcome.
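For context, the data and kernel are set up roughly like this before the loop (the values, train/test split, and kernel choice below are illustrative placeholders, not my real stream):

import numpy as np
import gpflow
from gpflow.models import GPR
from time import time

# Illustrative stand-in for my streaming data: time indices as (N, 1) inputs,
# noisy scalar targets as (N, 1) outputs.
rng = np.random.default_rng(0)
t = np.arange(200, dtype=float).reshape(-1, 1)
y = np.sin(0.1 * t) + 0.1 * rng.standard_normal(t.shape)

X_train, y_train = t[:100], y[:100]   # initial history
X_test, y_test = t[100:], y[100:]     # points that arrive one at a time from the stream
prediction_steps = 1                  # one new point per batch
kernel = gpflow.kernels.SquaredExponential()  # kernel choice is illustrative

The prediction loop itself is: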
predictions = []  # To store the sequential predictions
model = GPR(data=(X_train, y_train), kernel=kernel)

for i in range(0, len(X_test), prediction_steps):
    # Train the model with the current training data
    optimizer = gpflow.optimizers.Scipy()
    start = time()
    optimizer.minimize(model.training_loss, model.trainable_variables)
    end = time()
    print('Time taken for iteration', i, ':', end - start)

    # Make predictions for the next prediction_steps steps
    next_predictions, _ = model.predict_y(X_test[i : i + prediction_steps])

    # Store the predictions
    predictions.extend(next_predictions.numpy())

    # Update the training data with the true values for the next prediction_steps steps
    X_train = np.vstack([X_train, X_test[i : i + prediction_steps]])
    y_train = np.vstack([y_train, y_test[i : i + prediction_steps]])

    # Update the model's data
    model.data = (X_train, y_train)
(Each iteration takes roughly the same amount of time to run.)