Validation loss is constant while training loss is decreasing


I am training a model and got this plot. The data is audio (about 70K clips of roughly 5-10 s each), and no augmentation is applied. I have tried the following to avoid overfitting:

  • Reduced the complexity of the model by reducing the number of GRU cells and hidden dimensions.
  • Added dropout in each layer.
  • Tried a larger dataset.

What I am not sure about is whether my calculation of the training and validation losses is correct. It looks something like this. I am using drop_last=True and the CTC loss criterion.

train_data_len = len(train_loader.dataset)
valid_data_len = len(valid_loader.dataset)
epoch_train_loss = 0
epoch_val_loss = 0
train_losses = []
valid_losses = []

model.train()
for e in range(n_epochs):
    t0 = time.time()
    # batch loop
    running_loss = 0.0
    for batch_idx, _data in enumerate(train_loader, 1):
        # Calculate output ...
        # bla bla
        loss = criterion(output, labels.float(), input_lengths, label_lengths)
        loss.backward()
        optimizer.step()
        scheduler.step()
        # loss stats
        running_loss += loss.item() * specs.size(0)

    t_t = time.time() - t0

    ######################
    # validate the model #
    ######################
    with torch.no_grad():
        model.eval()
        tv = time.time()
        running_val_loss = 0.0
        for batch_idx_v, _data in enumerate(valid_loader, 1):
            # bla, bla
            val_loss = criterion(output, labels.float(), input_lengths, label_lengths)
            running_val_loss += val_loss.item() * specs.size(0)

        print("Epoch {}: Training took {:.2f} [s]\tValidation took: {:.2f} [s]\n".format(e+1, t_t, time.time() - tv))

    epoch_train_loss = running_loss / train_data_len
    epoch_val_loss = running_val_loss / valid_data_len
    train_losses.append(epoch_train_loss)
    valid_losses.append(epoch_val_loss)
    print('Epoch: {} Losses\tTraining Loss: {:.6f}\tValidation Loss: {:.6f}'.format(
            e+1, epoch_train_loss, epoch_val_loss))
    model.train()
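
One detail in the bookkeeping above may be worth checking. With drop_last=True the loader skips the final incomplete batch, so len(loader.dataset) can be larger than the number of samples that actually contributed to running_loss. Below is a minimal sketch of an accumulator that divides by the samples actually seen. It assumes the criterion returns a batch-averaged loss (the default reduction='mean' of nn.CTCLoss); the class name EpochLossMeter and the toy numbers are hypothetical, not part of the original code.

```python
class EpochLossMeter:
    """Accumulates batch-mean losses into a per-sample epoch average."""

    def __init__(self):
        self.total_loss = 0.0  # running sum of (batch mean loss * batch size)
        self.n_samples = 0     # number of samples actually processed

    def update(self, batch_mean_loss, batch_size):
        # Undo the batch averaging so batches of different sizes weigh correctly.
        self.total_loss += batch_mean_loss * batch_size
        self.n_samples += batch_size

    def average(self):
        if self.n_samples == 0:
            raise ValueError("no batches were recorded")
        return self.total_loss / self.n_samples


# Toy numbers (assumed for illustration): a dataset of 10 samples with
# batch size 4 and drop_last=True yields only 2 full batches, i.e. 8 samples.
meter = EpochLossMeter()
for batch_mean_loss in (2.0, 1.0):
    meter.update(batch_mean_loss, batch_size=4)

avg_seen = meter.average()           # divides by the 8 samples seen
avg_dataset = meter.total_loss / 10  # divides by len(dataset) instead

print(avg_seen, avg_dataset)
```

In the loop above this would correspond to calling meter.update(loss.item(), specs.size(0)) per batch and reading meter.average() at the end of the epoch. Dividing by len(dataset) only rescales each epoch's value by a constant factor (samples seen / dataset size), so on its own it should not change the shape of either curve.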

[Image: plot of training and validation loss]
