Feed Forward Neural Network Always outputs Random but Similar Values

Question

Asked by Nicojwn on 16 April 2021 at 22:59 · 247 views

I recently coded a neural network based on this online book and Sebastian Lague's brief series on neural networks on YouTube. I kept it as faithful to the original as possible, but it does not work. I am trying to solve a simple XOR problem with it, but it always gives me random yet similar values for every input. I even tried copying and pasting the author's code without changing anything, and it still didn't work.

import numpy as np


class NeuralNetwork:

    def __init__(self, layer_sizes, rate):
        weight_shapes = [(a,b) for a,b in zip(layer_sizes[1:], layer_sizes[:-1])]
        self.weights = [np.random.standard_normal(s)/s[1]**0.5 for s in weight_shapes]
        self.biases = [np.zeros((s,1)) for s in layer_sizes[1:]]
        self.rate = rate

    def predict(self, a):
        for w,b in zip(self.weights, self.biases):
            z = np.matmul(w,a) + b
            a = self.activation(z)
        return a

    def backprop(self, a, o):

        o = np.array(o)

        self.zCollection = []

        # Forward propagation
        for w,b in zip(self.weights, self.biases):
            z = np.matmul(w,a) + b

            self.zCollection.append(z)

            a = self.activation(z)
        
        # Output error
        error =  (a - o) * self.activationPrime(self.zCollection[-1])

        self.weights[-1] += np.matmul(error, self.activation(self.zCollection[-2]).T) * self.rate
        self.biases[-1] += error * self.rate
        
        # Looping through layers
        for i in range(2, len(self.weights)):

            error = np.multiply(self.weights[-i+1].T * error,self.activationPrime(self.zCollection[-i]))

            self.weights[-i] = np.add(self.weights[-i], np.matmul(error, self.activation(self.zCollection[-i-1]).T) * self.rate)
            self.biases[-i] = np.add(self.biases[-i], error * self.rate)

    @staticmethod
    def activation(x):
        return 1/(1+np.exp(-x))
    @staticmethod
    def activationPrime(x):
        activation = lambda x : 1/(1+np.exp(-x))
        return activation(x) * (1 - activation(x))



if __name__ == "__main__":

    inp = [[0,0],[1,0],[0,1],[1,1]]
    out = [[0],[1],[1],[0]]

    # Reformatting arrays into column vectors
    inp = np.array([np.array(i) for i in inp])
    inp = np.array([i.reshape((len(i), 1)) for i in inp])
    out = np.array([np.array(i) for i in out])
    out = np.array([i.reshape((len(i), 1)) for i in out])


    layer_sizes = (2,2,1)
    nn = NeuralNetwork(layer_sizes, 0.001)

    print("start")
    for j in range(100):
        for i,o in zip(inp, out):
            nn.backprop(i, o)
    print("done")

    for i in inp:
        print(f"{[list(j) for j in i]} >> {nn.predict(i)[0,0]}")

I did some investigating myself and found that the update values for the weights were always small and constant on every iteration. I am not sure why, but it looked like the weights weren't changing. I believe this may be the cause, because when I set the seed at the beginning of the script the output values were incredibly similar, to about 4 decimal places, but I'm not sure.

I tested the forward propagation, so that cannot be the issue. I also tried randomizing the inputs, changing the learning rate, and using different layer sizes and numbers of layers. I also tried a different problem that a perceptron could solve: predicting whether the sum of two numbers is greater than some other number. That didn't work either.

When I graphed the output error over the epochs, the line was thick because the value oscillates, and it seemed to decrease. However, when I tested the network it gave completely wrong results.

Here are some outputs that I am getting with different parameters:

learning rate : 100
layer_sizes : (2,2,1)
epochs : 10000

[[0], [0]] >> 1.70366026492168e-23
[[1], [0]] >> 4.876567289343432e-20
[[0], [1]] >> 2.4579325136292694e-24
[[1], [1]] >> 9.206132845755066e-21

learning rate : 1
layer_sizes : (2,5,5,1)
epochs : 10000

[[0], [0]] >> 0.9719657241512317
[[1], [0]] >> 0.9724187979341556
[[0], [1]] >> 0.9736236543859843
[[1], [1]] >> 0.9739884707274225

learning rate : 1
layer_sizes : (2,2,1)
epochs : 100

[[0], [0]] >> 0.3912836914268991
[[1], [0]] >> 0.49042088270977163
[[0], [1]] >> 0.4499482050352108
[[1], [1]] >> 0.5324205501065111

There is 1 answer

**Nicojwn** · Accepted Answer · 2021-04-17T11:23:17+00:00

I seem to have fixed it. I made three main changes:

I switched a and o in the output layer error calculation, which now looks like this:

error = (o - a) * self.activationPrime(self.zCollection[-1])

When updating the weights and biases I replaced

self.weights[-1] += np.matmul(error, self.activation(self.zCollection[-2]).T) * self.rate
self.biases[-1] += error * self.rate

with

self.weights[-1] = np.add(self.weights[-1], np.matmul(error, self.activation(self.zCollection[-2]).T) * self.rate)
self.biases[-1] = np.add(self.biases[-1], error * self.rate)

I did the same within the for loop. For that code, refer to the code in the question.
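To illustrate why the sign in the first change matters, here is a minimal sketch on a single sigmoid neuron. It is not the code from the question: `train_step`, `squared_error`, the `error_sign` parameter, and the starting values are hypothetical names and numbers chosen for the illustration. With a `+=` style update, `(o - a)` moves the output toward the target, while `(a - o)` moves it away.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def train_step(w, b, x, target, rate, error_sign):
    """Apply one additive update to a single sigmoid neuron.

    error_sign=+1 uses (target - a); error_sign=-1 uses (a - target).
    (Hypothetical helper, written only for this illustration.)
    """
    a = sigmoid(np.matmul(w, x) + b)
    # a * (1 - a) is the sigmoid derivative evaluated at z
    error = error_sign * (target - a) * a * (1.0 - a)
    return w + rate * np.matmul(error, x.T), b + rate * error


def squared_error(w, b, x, target):
    return float(np.sum((sigmoid(np.matmul(w, x) + b) - target) ** 2))


# Arbitrary starting point, assumed for the example
x = np.array([[1.0], [0.0]])
target = np.array([[1.0]])
w0 = np.array([[0.1, -0.2]])
b0 = np.zeros((1, 1))

start = squared_error(w0, b0, x, target)
w_good, b_good = train_step(w0, b0, x, target, rate=1.0, error_sign=+1)
w_bad, b_bad = train_step(w0, b0, x, target, rate=1.0, error_sign=-1)

print("start:           ", start)
print("(o - a) with +=: ", squared_error(w_good, b_good, x, target))
print("(a - o) with +=: ", squared_error(w_bad, b_bad, x, target))
```

The first update should lower the squared error and the second should raise it. As a side note, when shapes and dtypes match, `x += y` and `x = np.add(x, y)` give the same values in NumPy, so the second change is probably not what altered the behaviour.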

These changes alone did not work with a small number of epochs, so I increased the epochs to 100000, which worked. However, whenever I decreased the learning rate I had to increase the number of epochs again.

With these new parameters and changes I got the following example:

learning rate : 1
layer_sizes : (2,2,1)
epochs : 100000

[[0], [0]] >> 0.0024879823892047168
[[1], [0]] >> 0.9970151468472171
[[0], [1]] >> 0.996966687356631
[[1], [1]] >> 0.003029227917616288
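For experimenting with these parameters, here is a self-contained sketch that combines the changes listed above with the matrix-product fix described at the end of this answer. It is one possible arrangement, not the exact final code. The `gradients` method, the `seed` argument, and the `activations` list are additions made for this sketch. Two details go beyond what the answer states: the backward loop here runs over every layer down to the first, and the network input is kept as the activation feeding the first layer. In the posted code, `range(2, len(self.weights))` appears to be empty for `layer_sizes = (2, 2, 1)`, so the hidden layer would not be updated there.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)


class NeuralNetwork:
    def __init__(self, layer_sizes, rate, seed=0):
        rng = np.random.default_rng(seed)  # seed is an added convenience
        shapes = list(zip(layer_sizes[1:], layer_sizes[:-1]))
        self.weights = [rng.standard_normal(s) / s[1] ** 0.5 for s in shapes]
        self.biases = [np.zeros((s, 1)) for s in layer_sizes[1:]]
        self.rate = rate

    def predict(self, a):
        for w, b in zip(self.weights, self.biases):
            a = sigmoid(np.matmul(w, a) + b)
        return a

    def gradients(self, a, o):
        """Gradients of C = 0.5 * sum((output - o)**2) for one sample."""
        activations, zs = [a], []
        for w, b in zip(self.weights, self.biases):
            zs.append(np.matmul(w, activations[-1]) + b)
            activations.append(sigmoid(zs[-1]))

        grad_w = [None] * len(self.weights)
        grad_b = [None] * len(self.biases)
        delta = (activations[-1] - o) * sigmoid_prime(zs[-1])
        for i in range(1, len(self.weights) + 1):
            if i > 1:
                # Matrix product with the next layer's weights, not `*`
                delta = np.matmul(self.weights[-i + 1].T, delta) * sigmoid_prime(zs[-i])
            grad_w[-i] = np.matmul(delta, activations[-i - 1].T)
            grad_b[-i] = delta
        return grad_w, grad_b

    def backprop(self, a, o):
        grad_w, grad_b = self.gradients(a, o)
        # Subtracting rate * gradient equals adding rate * (o - a) * ...
        self.weights = [w - self.rate * g for w, g in zip(self.weights, grad_w)]
        self.biases = [b - self.rate * g for b, g in zip(self.biases, grad_b)]


if __name__ == "__main__":
    inp = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float).reshape(4, 2, 1)
    out = np.array([[0], [1], [1], [0]], dtype=float).reshape(4, 1, 1)

    nn = NeuralNetwork((2, 2, 1), rate=1.0)
    for _ in range(10000):
        for x, y in zip(inp, out):
            nn.backprop(x, y)

    for x in inp:
        print(x.ravel(), ">>", nn.predict(x)[0, 0])
```

Returning the gradients separately from applying them makes it possible to compare them against finite differences, which is a common way to check a backpropagation implementation. Whether a given run separates XOR still depends on the initial weights, the learning rate, and the number of epochs.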

I am pretty sure that these issues (if you can even call them that) have nothing to do with my code but are just a trait of feed-forward neural networks.

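It took me a while, but I found a 4th issue in the algorithm. In the for loop within the backprop method (the second loop, which walks back through the layers), the error calculation is incorrect: it multiplies the transposed weights and the error elementwise instead of taking their matrix product. The line should actually read:

error = np.multiply(np.matmul(self.weights[-i+1].T, error), self.activationPrime(self.zCollection[-i]))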
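The difference between the two forms can be seen from the array shapes alone. The following sketch uses made-up layer sizes and random values (the variable names are hypothetical). When the next layer has a single unit, `*` and `np.matmul` happen to agree. When it has several units, `*` broadcasts to a full matrix instead of producing a column vector of errors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Case 1: the next layer has one unit, like the output of (2, 2, 1).
w_out = rng.standard_normal((1, 2))      # weights from 2 hidden units to 1 output
err_out = rng.standard_normal((1, 1))    # error of the single output unit
same_for_single_unit = np.allclose(w_out.T * err_out, np.matmul(w_out.T, err_out))
print("single-unit next layer, results equal:", same_for_single_unit)

# Case 2: the next layer has several units (sizes assumed for the example).
w_hidden = rng.standard_normal((3, 3))   # weights between two 3-unit layers
err_hidden = rng.standard_normal((3, 1)) # error of the later 3-unit layer
elementwise = w_hidden.T * err_hidden            # broadcasts to a matrix
matrix_product = np.matmul(w_hidden.T, err_hidden)  # column vector of errors
print("elementwise shape:   ", elementwise.shape)
print("matrix product shape:", matrix_product.shape)
```

This may explain why the problem is easy to miss on a network with one output: the elementwise version only goes wrong once the error is propagated back from a layer with more than one unit.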
