How to avoid overfitting (Encog3 C#)?


I am new to neural networks and I'm working with Encog3. I have created a feedforward neural network which can be trained and tested. The problem is that I'm not sure how to prevent overfitting. I know I have to split the data into training, testing and evaluation sets, but I'm not sure where and when to use the evaluation set. Currently, I split all the data into a training and a testing set (50%/50%), train the network on one part, and test it on the other. Accuracy is 85%. I tried CrossValidationKFold, but in that case accuracy is only 12% and I don't understand why.

My question is: how can I use the evaluation set to avoid overfitting? I am confused about the evaluation set, and any help would be appreciated.


There are 2 answers

Yuriy Zaletskyy

It is general practice to split the data 60/20/20 (another common split is 80/10/10): 60 percent for training, 20 percent for validation, and the remaining 20 percent for testing the result of the previous two. Why three parts? Because it gives you a better picture of how the model performs on data it has never seen before. Another part of the analysis is whether your training set is representative: if the validation set contains values that have no representation in the training data, your model will most probably make mistakes on them. It's the same way your brain works: if you learn some rules and then suddenly get a task that is actually an exception to the rules you know, you will most probably give the wrong answer.

If you have problems with learning, you can do the following: increase the dataset, or increase the number of inputs (via some non-linear transformations of your existing inputs). You may also need to apply an anomaly-detection algorithm, or consider different normalization techniques.
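In Encog terms, one way to get such a split is to divide your raw arrays before building the datasets. The helper below is only a sketch: the DataSplitter name and the 60/20/20 proportions are my own choices, and it assumes your input/ideal arrays are already normalized and shuffled.

```csharp
// A minimal sketch of a 60/20/20 split done on the raw arrays before
// building the Encog datasets. "input" and "ideal" are assumed to be your
// already-normalized, already-shuffled double[][] arrays.
using System.Linq;
using Encog.ML.Data;
using Encog.ML.Data.Basic;

public static class DataSplitter
{
    public static void Split(double[][] input, double[][] ideal,
                             out IMLDataSet train,
                             out IMLDataSet validation,
                             out IMLDataSet test)
    {
        int n = input.Length;
        int trainEnd = (int)(n * 0.6);       // first 60% for training
        int validationEnd = (int)(n * 0.8);  // next 20% for validation

        train = new BasicMLDataSet(
            input.Take(trainEnd).ToArray(),
            ideal.Take(trainEnd).ToArray());

        validation = new BasicMLDataSet(
            input.Skip(trainEnd).Take(validationEnd - trainEnd).ToArray(),
            ideal.Skip(trainEnd).Take(validationEnd - trainEnd).ToArray());

        test = new BasicMLDataSet(
            input.Skip(validationEnd).ToArray(),
            ideal.Skip(validationEnd).ToArray());
    }
}
```

You would then train on the training set, monitor the validation set during training, and only touch the test set once at the very end.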

roganjosh

As a quick aside, you keep referring to the data as an “evaluation” set. Whilst it is being used in that capacity, the general term is “validation” set, which might allow you better success when googling it.

You’re in something of a chicken-and-egg situation with your current setup. Basically, the sole purpose of the validation set is to prevent overfitting – making no use of a validation set will (for all intents and purposes) result in overfitting. By contrast, the testing set has no part to play in preventing overfitting; it’s just another way of seeing, at the end, whether overfitting might have occurred.

Perhaps it would be easier to take this away from any maths or code (which I assume you have seen before) and imagine this as questions the model keeps asking itself. On every training epoch, the model is desperately trying to reduce its residual error against the training set and, being so highly non-linear, there’s a good chance in structured problems that it will reduce this error to almost nothingness if you allow it to keep running. But that’s not what you’re after. You’re after a model that is a good approximator for all three datasets. So, we make it do the following on every epoch:

“Has my new move reduced the error on the training set?” If yes: “Awesome, I’ll keep going in that direction.”
“Has my new move reduced the error on the validation set?” If yes: “Awesome, I’ll keep going in that direction.”

Eventually, you’ll come to:
“Has my new move reduced the error on the training set?” Yes: “Awesome, I’ll keep going in that direction.”
“Has my new move reduced the error on the validation set?” No, it’s increased: “Perhaps I’ve gone too far.”

If the validation error continues to rise, then you’ve identified the point at which your model is moving away from being a good approximator and moving towards being over-fit to the training set. It’s time to stop. Then you want to apply that final model to your test data and see whether the model is still a good approximator to that data too. And if it is, you have your model.
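In Encog terms, that "keep going / stop" decision can be made by checking the validation error after each training iteration. The following is a hand-rolled sketch rather than anything built into Encog: the class name, the patience parameter, and the choice of ResilientPropagation are my own assumptions, and in practice you would also keep a copy of the weights from the best epoch so you can roll back to them.

```csharp
// A minimal sketch of validation-based early stopping with Encog,
// assuming "network" is a freshly built BasicNetwork and
// "trainingSet" / "validationSet" are IMLDataSet instances you have
// already created (e.g. with a split like the one in the other answer).
using Encog.ML.Data;
using Encog.Neural.Networks;
using Encog.Neural.Networks.Training.Propagation.Resilient;

public static class EarlyStoppingTrainer
{
    public static void Train(BasicNetwork network,
                             IMLDataSet trainingSet,
                             IMLDataSet validationSet,
                             int maxEpochs = 1000,
                             int patience = 10)
    {
        var train = new ResilientPropagation(network, trainingSet);

        double bestValidationError = double.MaxValue;
        int epochsWithoutImprovement = 0;

        for (int epoch = 1; epoch <= maxEpochs; epoch++)
        {
            train.Iteration();  // one pass over the training set

            // Error on data the trainer never sees: the overfitting alarm.
            double validationError = network.CalculateError(validationSet);

            if (validationError < bestValidationError)
            {
                bestValidationError = validationError;
                epochsWithoutImprovement = 0;
            }
            else if (++epochsWithoutImprovement >= patience)
            {
                // Validation error has stopped improving: further epochs
                // would likely just overfit the training set.
                break;
            }
        }

        train.FinishTraining();
    }
}
```

The patience counter simply tolerates a few epochs of no improvement before stopping, since validation error rarely decreases perfectly monotonically.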

A final word: it’s good to see you’re doing some form of cross-validation, because I’ve seen that kind of safeguard missed so many times in the past.