How to continue training a DETR model from the last epoch using checkpoints?


How can I continue training DETR from the last epoch using checkpoints? I am using Google Colab and I can't train for all 200 epochs in one session. This is my training code:

from pytorch_lightning.callbacks import EarlyStopping
from pytorch_lightning import Trainer

# Define your DETR model, dataset, and other necessary elements
MAX_EPOCHS = 200

early_stopping_callback = EarlyStopping(
    monitor='training_loss',  # Metric to monitor for early stopping
    min_delta=0.00,  # Minimum change in the monitored metric to count as an improvement
    patience=3,  # Number of epochs to wait for improvement before stopping
    mode='min'  # Training loss should decrease, so treat it as a minimization metric
)

trainer = Trainer(
    devices=1,
    accelerator="gpu",
    max_epochs=MAX_EPOCHS,
    gradient_clip_val=0.1,
    accumulate_grad_batches=8,
    log_every_n_steps=5,
    callbacks=[early_stopping_callback]
)

trainer.fit(model)

I tried loading the last checkpoint and the model, but it didn't work.


1 Answer

Answered by Anna Andreeva Rogotulka

You can load the latest checkpoint into the model:

checkpoint = torch.load('path/to/last_checkpoint.pth')  # path to your saved checkpoint file
model.load_state_dict(checkpoint['model'])

and don't forget to restore the states of the optimizer and scheduler so training resumes smoothly:

optimizer.load_state_dict(checkpoint['optimizer'])
lr_scheduler.load_state_dict(checkpoint['lr_scheduler'])
start_epoch = checkpoint['epoch'] + 1
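
The code above assumes the checkpoint was saved as a dictionary with 'model', 'optimizer', 'lr_scheduler' and 'epoch' entries in the first place. A minimal sketch of that save side, assuming those key names and a placeholder file path (on Colab, point it at mounted Google Drive so the file survives a disconnect):

import torch

CHECKPOINT_PATH = 'drive/MyDrive/detr_last_checkpoint.pth'  # placeholder path

# Call this at the end of every epoch so the latest state is always on disk
def save_checkpoint(model, optimizer, lr_scheduler, epoch):
    torch.save({
        'model': model.state_dict(),
        'optimizer': optimizer.state_dict(),
        'lr_scheduler': lr_scheduler.state_dict(),
        'epoch': epoch,
    }, CHECKPOINT_PATH)

Since the question trains with the PyTorch Lightning Trainer, another option is to let Lightning manage the checkpoint itself: a ModelCheckpoint callback with save_last=True keeps a last.ckpt file up to date, and passing ckpt_path to trainer.fit restores the model weights, optimizer, scheduler and epoch counter before continuing. A sketch building on the question's Trainer setup (the checkpoints/ directory is a placeholder; again, prefer a Google Drive path on Colab):

from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint

# Keep a last.ckpt file up to date during training
checkpoint_callback = ModelCheckpoint(dirpath='checkpoints/', save_last=True)

trainer = Trainer(
    devices=1,
    accelerator="gpu",
    max_epochs=MAX_EPOCHS,
    gradient_clip_val=0.1,
    accumulate_grad_batches=8,
    log_every_n_steps=5,
    callbacks=[early_stopping_callback, checkpoint_callback]
)

# In a fresh Colab session, resume from where the previous run stopped
trainer.fit(model, ckpt_path='checkpoints/last.ckpt')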