Logistic Regression implementation with MNIST - not converging?


I hope someone can help me. I implemented logistic regression from scratch in Python (no libraries except NumPy).

I used the MNIST dataset as input and, since I am doing binary classification, decided to test on only two digits: 1 and 2. My code can be found here:

https://github.com/michelucci/Logistic-Regression-Explained/blob/master/MNIST%20with%20Logistic%20Regression%20from%20scratch.ipynb

The notebook should run on any system that has the necessary libraries installed.

Somehow my cost function is not converging. I am getting errors because my A (the sigmoid output) becomes exactly 1 when z gets very large.
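For context, the saturation described above is easy to reproduce: in float64, the sigmoid rounds to exactly 1.0 once z is large enough, so the `log(1 - a)` term of the cross-entropy blows up. A minimal sketch (the clipping workaround is an assumption, not something from the original notebook):

```python
import numpy as np

def sigmoid(z):
    # Standard sigmoid; saturates to exactly 1.0 in float64 for large z
    return 1.0 / (1.0 + np.exp(-z))

# For z around 40, exp(-z) is below float64 machine epsilon,
# so sigmoid(z) rounds to exactly 1.0 and log(1 - a) becomes log(0).
a = sigmoid(40.0)
print(a)  # 1.0

# One common workaround: clip activations away from 0 and 1 before the logs.
eps = 1e-12
a_safe = np.clip(a, eps, 1.0 - eps)
print(np.log(1.0 - a_safe))  # finite instead of -inf
```

This does not fix the underlying divergence, but it keeps the cost finite so the real problem stays visible.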

I tried everything but I don't see my error. Can anyone take a look and let me know if I missed something obvious? The point here is not getting high accuracy; it is getting the model to converge to something ;)

Thanks in advance, Umberto


There are 2 answers

Clock Slave On

I read your code. Everything looks fine; the only issue is that your learning rate is too high. I know 0.005 sounds like a small number, but in this case it is too high for the algorithm to converge, which is evident from the increase in cost. The cost decreases for a while and then starts going negative very quickly. The idea is to get the cost close to zero; here, negative numbers do not imply a smaller cost, so you have to look at the magnitude. I used 0.000008 as the learning rate and it works fine.
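The learning-rate sensitivity is largely a feature-scale effect: with raw pixel intensities in 0..255, gradients are large, so a rate that looks small can still overshoot. A self-contained sketch on synthetic data (the data and training loop are assumptions for illustration, not the notebook's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for unscaled MNIST-like inputs: features in 0..255.
X = rng.uniform(0, 255, size=(2, 100))          # (features, samples)
y = (X[0] > 128).astype(float).reshape(1, -1)   # binary targets in {0, 1}

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(lr, steps=200):
    # Plain batch gradient descent for logistic regression.
    w = np.zeros((2, 1))
    b = 0.0
    m = X.shape[1]
    eps = 1e-12  # keeps the log finite when sigmoid saturates
    costs = []
    for _ in range(steps):
        a = sigmoid(w.T @ X + b)
        cost = -np.mean(y * np.log(a + eps) + (1 - y) * np.log(1 - a + eps))
        dw = (X @ (a - y).T) / m
        db = np.mean(a - y)
        w -= lr * dw
        b -= lr * db
        costs.append(cost)
    return costs

# With inputs on this scale, lr=0.005 overshoots into saturation,
# while a much smaller rate steadily reduces the cost.
print(train(0.005)[-1], train(0.000008)[-1])
```

Normalizing the inputs (e.g. dividing pixels by 255) is the usual alternative to shrinking the learning rate this aggressively.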

Umberto On

I found the error. The problem was that I used 1 and 2 as class labels (the labels as they appear in MNIST), but in binary classification you compare those values with 0 and 1, so the model could not converge: sigmoid() (see my code) can only output values between 0 and 1 (it is a probability).
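The fix described above is a one-line remapping before training; a minimal sketch (the label array is a hypothetical example, not real MNIST data):

```python
import numpy as np

# Hypothetical raw labels as MNIST provides them for digits 1 and 2.
y_raw = np.array([1, 2, 2, 1, 2, 1])

# Binary cross-entropy with a sigmoid output assumes targets in {0, 1},
# so remap digit "1" -> 0 and digit "2" -> 1 before training.
y = (y_raw == 2).astype(float)
print(y)  # [0. 1. 1. 0. 1. 0.]
```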

Using 0 and 1 instead of 1 and 2 solved the problem beautifully. Now my model converges to 98% accuracy :-)

Thanks everyone for helping!

Regards, Umberto