TanH(x) output somehow magically becomes bigger than 1, and even bigger than 1 million


After creating a following Neural Network:

    nn = new BasicNetwork();
    nn.addLayer(new BasicLayer(null, true, 29));
    nn.addLayer(new BasicLayer(new ActivationReLU(), true, 1000));
    nn.addLayer(new BasicLayer(new ActivationReLU(), true, 100));
    nn.addLayer(new BasicLayer(new ActivationReLU(), true, 100));
    nn.addLayer(new BasicLayer(new ActivationTANH(), false, 4));

    nn.getStructure().finalizeStructure();
    nn.reset();

I observed an error larger than 10^38, which is completely insane. So I implemented the error function myself and found that the error really was that big. I then checked my ideal outputs and confirmed they were all in the range -1 to 1. The calculated outputs, however, were far bigger than 1, even though the output layer is TanH. From this I concluded it must be some kind of floating point error.
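For reference, here is a minimal sketch of how the raw outputs and the error can be inspected through Encog's API (the `inputs` and `ideals` arrays are placeholders for the actual training data, shaped [samples][29] and [samples][4]):

    import org.encog.ml.data.MLData;
    import org.encog.ml.data.MLDataPair;
    import org.encog.ml.data.MLDataSet;
    import org.encog.ml.data.basic.BasicMLDataSet;

    // placeholder training data: inputs[samples][29], ideals[samples][4]
    MLDataSet dataSet = new BasicMLDataSet(inputs, ideals);

    for (MLDataPair pair : dataSet) {
        MLData output = nn.compute(pair.getInput()); // forward pass
        System.out.println("ideal:  " + pair.getIdeal());
        System.out.println("actual: " + output);     // these values exceeded 1
    }

    // Encog's built-in error calculation over the whole data set
    double error = nn.calculateError(dataSet);
    System.out.println("error: " + error);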

Am I correct with my conclusion? What can I do to avoid such stupid, time-consuming mistakes next time?

Sincerely

Edit:

    nn = new BasicNetwork();
    nn.addLayer(new BasicLayer(null, true, 29));
    nn.addLayer(new BasicLayer(new ActivationSigmoid(), true, 1000));
    nn.addLayer(new BasicLayer(new ActivationSigmoid(), true, 100));
    nn.addLayer(new BasicLayer(new ActivationSigmoid(), true, 100));
    nn.addLayer(new BasicLayer(new ActivationTANH(), false, 4));

    nn.getStructure().finalizeStructure();
    nn.reset();

The problem still occurs after switching the hidden layers to Sigmoid. How can I fix this?

1 Answer
Answered by Shubham Panchal:
- Use a much smaller learning rate, such as 0.0001 or even less (see the training sketch after this list).
- Randomly initialize the weights.
- Initialize the biases to 1.
- Try using Batch Normalization.
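For the learning-rate suggestion, a minimal training sketch using Encog's plain backpropagation trainer with a small learning rate is shown below (`trainingSet` is a placeholder for your actual MLDataSet, and the stopping criteria are arbitrary). The `nn.reset()` call in your code should already take care of random weight initialization.

    import org.encog.ml.data.MLDataSet;
    import org.encog.neural.networks.training.propagation.back.Backpropagation;

    // learning rate 0.0001, momentum 0.0 (placeholder values to tune)
    Backpropagation train = new Backpropagation(nn, trainingSet, 0.0001, 0.0);

    int epoch = 0;
    do {
        train.iteration(); // one pass over the training set
        System.out.println("Epoch " + epoch + " error: " + train.getError());
        epoch++;
    } while (train.getError() > 0.01 && epoch < 1000);
    train.finishTraining();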

The ReLU function cannot squash its inputs: for positive values it is simply y = x, so the activations are passed through unbounded. As the gradients grow during training, these values keep getting larger and larger.
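A quick plain-Java illustration of that point: ReLU passes large positive values through unchanged, while tanh saturates near 1.

    public class ActivationDemo {
        // ReLU: identity for positive inputs, zero otherwise
        static double relu(double x) { return Math.max(0.0, x); }

        public static void main(String[] args) {
            for (double x : new double[] { 0.5, 10.0, 1000.0, 1000000.0 }) {
                System.out.printf("x = %10.1f  relu(x) = %10.1f  tanh(x) = %.6f%n",
                        x, relu(x), Math.tanh(x));
            }
        }
    }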