How can I continue training DETR from the last epoch's checkpoint? I am using Google Colab and can't train for all 200 epochs at once. This is my training code:
from pytorch_lightning.callbacks import EarlyStopping
from pytorch_lightning import Trainer
# Define your DETR model, dataset, and other necessary elements
MAX_EPOCHS = 200
early_stopping_callback = EarlyStopping(
    monitor='training_loss', # Metric logged in training_step
    min_delta=0.00,          # Minimum change to count as an improvement
    patience=3,              # Number of epochs to wait for improvement before stopping
    mode='min'               # A loss is a minimization metric, so 'min', not 'max'
)
trainer = Trainer(
    devices=1,
    accelerator="gpu",
    max_epochs=MAX_EPOCHS,
    gradient_clip_val=0.1,
    accumulate_grad_batches=8,
    log_every_n_steps=5,
    callbacks=[early_stopping_callback]
)
trainer.fit(model)
I tried loading the last checkpoint back into the model, but it didn't work.
You can load the latest checkpoint into the model, and don't forget to restore the optimizer and scheduler states as well, so that learning resumes stably instead of restarting from a fresh optimizer.