Root cause of non-zero loss but zero gradients in TensorFlow?

I appreciate your insight.

    import os
    import re
    import time
    import subprocess as sub
    import tensorflow as tf
    import numpy as np
    from tensorflow.keras import optimizers
    
    Exp_Data = [[]]  # Experiment data
    
    def Main(x1, x2):        # Error between calculation results and experiments
        Mod_input(x1, x2)    # Write x1, x2 into the input file of the external code
        Run()                # Run the external executable
        time.sleep(1)        # Wait while the external code runs
        Output_analysis()    # Extract the calculated data
        Read_extract()       # Trim the data
        MAD = Error_cal()    # Mean Average Deviation (MAD) between calculation and experiment
        return MAD
    
    @tf.function
    def Main1(x1,x2):
        tf.print("x1 = ", x1, ", x2 = ", x2)
        return tf.py_function(Main, [x1, x2], Tout=tf.float32)
    
    opt = optimizers.Adam(learning_rate=0.001)
    
    # loss function 
    def custom_mean_squared_error(y_true, y_pred):
        return tf.reduce_mean(tf.square(y_pred - y_true))
    
    x = [tf.Variable(1.0, trainable=True, dtype=tf.float32) for _ in range(2)]
    
    # file_path is defined elsewhere in the full script (log file for the run)
    with open(file_path, "w") as f:
        # Training loop
        for i in range(3):
            with tf.GradientTape() as tape:
                loss = 100 * custom_mean_squared_error(Main1(*x), tf.constant(0.0))
                tf.print("loss = ", loss)
    
            grads = tape.gradient(loss, x)
            tf.print("grads = ", grads)
    
            opt.apply_gradients(zip(grads, x))
    
            # Manual update on top of apply_gradients (applies the step a second time)
            x[0].assign(x[0] - opt.learning_rate * grads[0])
            x[1].assign(x[1] - opt.learning_rate * grads[1])
        # The with-statement closes f; no explicit f.close() is needed

(Information)

I confirmed that the Main(x1, x2) function works correctly when called directly: it computes the error (MAD) between the code's calculation and the experimental data.

(Problem)

The loss is non-zero, but the gradients are always zero. As a result, the tf.Variables x1 and x2 never change, as shown below.

(Current Result)

Iteration 1: x1 =  1 , x2 =  1, loss =  14.6070738, grads =  [0, 0]
Iteration 2: x1 =  1 , x2 =  1, loss =  14.6070738, grads =  [0, 0]
Iteration 3: x1 =  1 , x2 =  1, loss =  14.6070738, grads =  [0, 0]
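
My current suspicion is that no gradient can flow through tf.py_function here, because Main() does its work in plain Python/NumPy and an external executable, so TensorFlow never sees differentiable operations between x1, x2 and the loss. If I understand tf.py_function correctly, the stripped-down sketch below (numpy_only is just a placeholder, not my real solver) should show the same pattern:

    import numpy as np
    import tensorflow as tf
    
    # Stand-in for Main(): the result is computed outside of TensorFlow ops
    def numpy_only(x1, x2):
        return np.float32(x1.numpy() ** 2 + x2.numpy() ** 2)
    
    x = [tf.Variable(1.0, dtype=tf.float32) for _ in range(2)]
    
    with tf.GradientTape() as tape:
        y = tf.py_function(numpy_only, x, Tout=tf.float32)
        loss = tf.square(y)
    
    # The gradients come back empty (None/zero): autodiff cannot see inside numpy_only
    print(tape.gradient(loss, x))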

(Previously)

I have tried everything I could find online for two weeks, but the situation keeps getting worse. I tried modifying the dimensions, handling "No gradients" errors, py_function, numpy(), SGD, the function structure, the learning rate, etc.

(My Wish Result)

I expect x1 and x2 to change so that the loss decreases as the iterations proceed, for example:

Iteration 1: x1 =  1 , x2 =  1, loss =  14.6070738, grads =  [1.1, 2.1]
Iteration 2: x1 =  1.11 , x2 =  1.21, loss =  4.6070738, grads =  [0.2, 0.3]
Iteration 3: x1 =  1.13 , x2 =  1.24, loss =  0.6070738, grads =  [0.0001, 0.0002]
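
If the root cause really is that automatic differentiation cannot see inside the external code, I assume I would have to supply the gradients myself, for example with finite differences through tf.custom_gradient. The sketch below is only my own guess at how that might look; black_box, EPS, and the quadratic test function are placeholders, not my actual solver:

    import tensorflow as tf
    
    EPS = 1e-2  # finite-difference step; would need tuning for the real solver
    
    # Stand-in for Main(): any computation TensorFlow cannot differentiate
    def black_box(x1, x2):
        return float((x1 - 3.0) ** 2 + (x2 + 1.0) ** 2)
    
    @tf.custom_gradient
    def black_box_op(x1, x2):
        y = tf.constant(black_box(float(x1), float(x2)), dtype=tf.float32)
    
        def grad(dy):
            # Two extra evaluations give forward-difference gradients
            base = black_box(float(x1), float(x2))
            g1 = (black_box(float(x1) + EPS, float(x2)) - base) / EPS
            g2 = (black_box(float(x1), float(x2) + EPS) - base) / EPS
            return dy * g1, dy * g2
    
        return y, grad
    
    x = [tf.Variable(1.0, dtype=tf.float32) for _ in range(2)]
    opt = tf.keras.optimizers.Adam(learning_rate=0.1)
    
    for i in range(3):
        with tf.GradientTape() as tape:
            # Read the variables as tensors before the custom-gradient op
            loss = black_box_op(tf.convert_to_tensor(x[0]), tf.convert_to_tensor(x[1]))
        grads = tape.gradient(loss, x)
        opt.apply_gradients(zip(grads, x))
        tf.print("loss =", loss, "grads =", grads)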