Understanding code wrt Logistic Regression using gradient descent

Question

129 views Asked by chi At 12 September 2017 at 13:29

I was following Siraj Raval's videos on logistic regression using gradient descent:

1) Link to longer video : https://www.youtube.com/watch?v=XdM6ER7zTLk&t=2686s

2) Link to shorter video : https://www.youtube.com/watch?v=xRJCOz3AfYY&list=PL2-dafEMk2A7mu0bSksCGMJEmeddU_H4D

In the videos he talks about using gradient descent to reduce the error over a set number of iterations so that the function converges (the slope becomes zero). He also illustrates the process with code. These are the two main functions from that code:

from numpy import array  # array() is used in gradient_descent_runner below

def step_gradient(b_current, m_current, points, learningRate):
    b_gradient = 0
    m_gradient = 0
    N = float(len(points))
    for i in range(0, len(points)):
        x = points[i, 0]
        y = points[i, 1]
        b_gradient += -(2/N) * (y - ((m_current * x) + b_current))
        m_gradient += -(2/N) * x * (y - ((m_current * x) + b_current))
    new_b = b_current - (learningRate * b_gradient)
    new_m = m_current - (learningRate * m_gradient)
    return [new_b, new_m]

def gradient_descent_runner(points, starting_b, starting_m, learning_rate, num_iterations):
    b = starting_b
    m = starting_m
    for i in range(num_iterations):
        b, m = step_gradient(b, m, array(points), learning_rate)
    return [b, m]

# The above functions are called below.
# `points` is assumed to be an N x 2 collection of (x, y) rows loaded beforehand.
learning_rate = 0.0001
initial_b = 0 # initial y-intercept guess
initial_m = 0 # initial slope guess
num_iterations = 1000
[b, m] = gradient_descent_runner(points, initial_b, initial_m, learning_rate, num_iterations)
# code taken from Siraj Raval's github page

Why do the values of b and m keep updating for all the iterations? After a certain number of iterations the function will converge, at the point where we find the values of b and m that give slope = 0.

So why do we keep iterating after that point and keep updating b and m? Aren't we losing the 'correct' b and m values that way? How does the learning rate help convergence if we continue to update the values after converging? In short: why is there no check for convergence, and how does this actually work?

There are 2 answers

Lan On 12 September 2017 at 16:23

In practice, you will most likely never reach a slope of exactly 0. Think of your loss function as a bowl. If your learning rate is too high, each step can overshoot the lowest point of the bowl. Conversely, if the learning rate is too low, learning becomes slow and may not reach the lowest point of the bowl before the iterations run out.

That's why, in machine learning, the learning rate is an important hyperparameter to tune.
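
The bowl picture can be tried numerically. What follows is only an illustrative sketch, not code from the videos: it assumes a one-dimensional loss f(w) = w**2 (whose gradient is 2*w), and the helper name descend and the three learning rates are made up for the demonstration.

```python
def descend(learning_rate, num_iterations=50, start=1.0):
    """Run plain gradient descent on f(w) = w**2 and return the final w."""
    w = start
    for _ in range(num_iterations):
        gradient = 2.0 * w                 # derivative of w**2
        w = w - learning_rate * gradient   # same update rule as step_gradient
    return w

too_low = descend(0.001)   # still far from the minimum at w = 0 after 50 steps
moderate = descend(0.1)    # ends up very close to 0
too_high = descend(1.1)    # every step overshoots by more than the last

print(too_low, moderate, too_high)
```

With the smallest rate, w has barely moved from its starting value; with the largest, the magnitude of w grows on every step instead of shrinking.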

**chi** · Accepted Answer · 2017-09-12T15:54:00+00:00

Actually, once we reach a slope of 0, b_gradient and m_gradient both become 0.

Thus, for:

new_b = b_current - (learningRate * b_gradient)

new_m = m_current - (learningRate * m_gradient)

new_b and new_m keep the old, correct values, because nothing is subtracted from them.
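
This can be checked on a toy case, and the same gradients can drive the explicit convergence check the question asks about. The sketch below makes several assumptions of its own: the data is made up and lies exactly on y = 2x + 1, and the helper names gradients and run_with_tolerance, the tolerance and the learning rate are hypothetical choices, not part of the original code.

```python
def gradients(b, m, points):
    """Gradient of the mean squared error with respect to b and m."""
    n = float(len(points))
    b_gradient = 0.0
    m_gradient = 0.0
    for x, y in points:
        error = y - (m * x + b)
        b_gradient += -(2 / n) * error
        m_gradient += -(2 / n) * x * error
    return b_gradient, m_gradient

def run_with_tolerance(points, b, m, learning_rate, max_iterations, tolerance):
    """Gradient descent that stops early once both gradients are tiny."""
    steps_taken = 0
    for _ in range(max_iterations):
        b_gradient, m_gradient = gradients(b, m, points)
        if abs(b_gradient) < tolerance and abs(m_gradient) < tolerance:
            break  # close enough to slope 0: further updates would change almost nothing
        b -= learning_rate * b_gradient
        m -= learning_rate * m_gradient
        steps_taken += 1
    return b, m, steps_taken

# Made-up data lying exactly on y = 2x + 1
points = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

# At the exact solution every error is 0, so both gradients are 0
zero_gradients = gradients(1.0, 2.0, points)

b, m, steps_taken = run_with_tolerance(points, 0.0, 0.0, 0.05, 10000, 1e-6)
print(zero_gradients, b, m, steps_taken)
```

On real, noisy data the gradients typically get very small rather than exactly 0, which is why the check compares against a tolerance instead of testing for equality with 0.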
