I want to build an AI that checks whether emails are spam. I am using deepNetts. My training set is a very large CSV file. As the code below shows, I use the deepNetts function DataSets.readCsv to read it. However, it only reads one column of each line, and I need it to read the whole line. How do I achieve this?

    import java.io.IOException;

    import javax.visrec.ml.data.DataSet;

    import deepnetts.data.DataSets;
    import deepnetts.data.preprocessing.scale.MaxScaler;
    import deepnetts.net.FeedForwardNetwork;
    import deepnetts.net.layers.activation.ActivationType;
    import deepnetts.net.loss.LossType;
    import deepnetts.net.train.BackpropagationTrainer;

    public class Main {

        public Main() {
            // TODO Auto-generated constructor stub
        }

        public static void main(String[] args) throws IOException {
            String csvFile = "emails.csv";
            DataSet emailsDataSet = DataSets.readCsv(csvFile, 3000, 1, true);
            DataSet[] trainAndTestSet = emailsDataSet.split(0.6, 0.4);

            MaxScaler scaler = new MaxScaler(trainAndTestSet[0]);
            scaler.apply(trainAndTestSet[0]);
            scaler.apply(trainAndTestSet[1]);

            FeedForwardNetwork neuralNet = FeedForwardNetwork.builder()
                    .addInputLayer(3000)
                    .addFullyConnectedLayer(50)
                    .addOutputLayer(1, ActivationType.SIGMOID)
                    .lossFunction(LossType.CROSS_ENTROPY)
                    .build();

            neuralNet.getTrainer().setMaxError(0.03f)
                    .setMaxEpochs(10000)
                    .setLearningRate(0.001f);

            neuralNet.train(trainAndTestSet[0]);
        }
    }

I get this error in the console:
java.lang.ArrayIndexOutOfBoundsException: Index 3000 out of bounds
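
One way to narrow this down might be to check how many columns each line of the file actually contains, since the readCsv call is given 3000 and 1 (which appear to be the input and output column counts, 3001 in total). The following is only a diagnostic sketch in plain Java, independent of deepNetts. The class name CsvShapeCheck, the comma delimiter, and the naive split (which ignores quoted fields) are all assumptions made for illustration, not part of the library.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Pattern;

// Hypothetical helper, not part of deepNetts: reports the first line whose
// column count differs from what the readCsv call expects.
public class CsvShapeCheck {

    // Counts the fields in one line. The -1 limit keeps trailing empty fields.
    // Assumption: fields never contain the delimiter inside quotes.
    public static int countColumns(String line, String delimiter) {
        return line.split(Pattern.quote(delimiter), -1).length;
    }

    public static void main(String[] args) throws IOException {
        String csvFile = args.length > 0 ? args[0] : "emails.csv";
        String delimiter = ",";   // assumption: the file is comma-separated
        int expected = 3000 + 1;  // the two counts passed to readCsv in the question

        try (BufferedReader reader = new BufferedReader(new FileReader(csvFile))) {
            String line;
            int lineNo = 0;
            while ((line = reader.readLine()) != null) {
                lineNo++;
                int columns = countColumns(line, delimiter);
                if (columns != expected) {
                    System.out.printf("line %d has %d columns, expected %d%n",
                            lineNo, columns, expected);
                    return;
                }
            }
            System.out.println("every line has " + expected + " columns");
        }
    }
}
```

If the reported count differs from 3001 (for example because of an extra ID column or a different delimiter), that mismatch would be the first thing to look at before changing the network itself.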