I am new to neural networks and I'm working with Encog3. I have created a feedforward neural network which can be trained and tested. The problem is that I'm not sure how to prevent overfitting. I know I have to split the data into training, testing and evaluation sets, but I'm not sure where and when to use the evaluation set. Currently, I split all the data into a training set and a testing set (50%/50%), train the network on one part and test it on the other. The accuracy is 85%. I also tried CrossValidationKFold, but in that case the accuracy is only 12% and I don't understand why.
My question is: how can I use the evaluation set to avoid overfitting? I am confused about the evaluation set, and any help would be appreciated.
It is general practice to split the data 60/20/20 (another common split is 80/10/10): 60 percent for training, 20 percent for validating, and another 20 percent for checking the result of the previous two. Why three parts? Because it gives you a better picture of how the model behaves on data it has never seen before.
Another part of the analysis is checking that the learning set is representative. If the validation data contains cases that have no representation in the training data (or the other way around), you will most probably get mistakes from your model. It's the same way your brain works: if you learn some rules and then suddenly get a task that is actually an exception to the rules you know, you will most probably give the wrong answer.
If you have problems with learning, you can do the following: increase the size of the dataset, or increase the number of inputs (for example via nonlinear transformations of your existing inputs). You may also need to apply an anomaly detection algorithm to the data, and you can consider trying different normalization techniques.
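To make the three-way split concrete, here is a minimal sketch in plain Java. It does not use Encog's API: the class `DataSplit` and the methods `split` and `earlyStoppingEpoch` are hypothetical names made up for illustration. It assumes the usual way a validation set is used against overfitting: you train on the training part, measure the error on the validation part after each epoch, and keep the network from the epoch where the validation error was lowest. The last part is touched only once, at the very end, to report accuracy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class DataSplit {

    /** The three partitions produced by split(). */
    public static final class Split<T> {
        public final List<T> training;
        public final List<T> validation;
        public final List<T> test;

        Split(List<T> training, List<T> validation, List<T> test) {
            this.training = training;
            this.validation = validation;
            this.test = test;
        }
    }

    /**
     * Shuffles a copy of the data and cuts it into training, validation and
     * test parts. Whatever is left after the first two fractions goes to test.
     */
    public static <T> Split<T> split(List<T> data, double trainFrac,
                                     double valFrac, long seed) {
        if (trainFrac < 0 || valFrac < 0 || trainFrac + valFrac > 1.0) {
            throw new IllegalArgumentException("fractions must be >= 0 and sum to <= 1");
        }
        List<T> shuffled = new ArrayList<>(data);
        Collections.shuffle(shuffled, new Random(seed)); // fixed seed: repeatable split
        int n = shuffled.size();
        int trainEnd = Math.min(n, (int) Math.round(n * trainFrac));
        int valEnd = Math.min(n, trainEnd + (int) Math.round(n * valFrac));
        return new Split<>(
                new ArrayList<>(shuffled.subList(0, trainEnd)),
                new ArrayList<>(shuffled.subList(trainEnd, valEnd)),
                new ArrayList<>(shuffled.subList(valEnd, n)));
    }

    /**
     * Given the validation error recorded after each epoch, returns the index
     * of the epoch with the lowest error seen before the error failed to
     * improve for `patience` epochs in a row. Returns 0 for an empty array.
     */
    public static int earlyStoppingEpoch(double[] validationErrors, int patience) {
        int bestEpoch = 0;
        double bestError = Double.POSITIVE_INFINITY;
        for (int epoch = 0; epoch < validationErrors.length; epoch++) {
            if (validationErrors[epoch] < bestError) {
                bestError = validationErrors[epoch];
                bestEpoch = epoch;
            } else if (epoch - bestEpoch >= patience) {
                break; // validation error stopped improving: likely overfitting
            }
        }
        return bestEpoch;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            rows.add(i); // stand-in for real data rows
        }
        Split<Integer> s = split(rows, 0.6, 0.2, 42L);
        System.out.println(s.training.size() + " / " + s.validation.size()
                + " / " + s.test.size());

        // Made-up validation errors, one per epoch, purely for illustration.
        double[] errors = {0.9, 0.5, 0.4, 0.45, 0.5, 0.6, 0.3};
        System.out.println("keep weights from epoch " + earlyStoppingEpoch(errors, 2));
    }
}
```

In a real training loop you would compute the validation error yourself after each iteration of your trainer and save a copy of the network whenever it improves. The error values above are invented only to show the stopping rule.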