I have a CSV file containing 24231 rows. I would like to apply LOOCV per project name rather than per observation across the whole dataset. So if my dataset contains information for 15 projects, I would like the training set to be built from 14 projects and the test set from the remaining project.
I am relying on Weka's API. Is there anything that automates this process?
For non-numeric attributes, Weka allows you to retrieve the unique values via
Attribute.numValues() (how many there are) and Attribute.value(int) (the i-th value). With Weka's anneal UCI dataset and the
surface-quality attribute for leave-one-out, you can generate something like this:
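
A minimal sketch of the same leave-one-group-out idea in plain Java, without the Weka dependency, may help show the loop structure. The class LeaveOneGroupOut, the Fold type and the split method are hypothetical names introduced for illustration only. It assumes each row's group label (for example the project name) has already been read into a list with no null entries. In Weka, the distinct labels would instead come from Attribute.numValues() and Attribute.value(int).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Weka: builds one train/test split per group label.
public class LeaveOneGroupOut {

    /** One fold: the held-out label plus the row indices used for training and testing. */
    public static final class Fold {
        public final String heldOut;
        public final List<Integer> train = new ArrayList<>();
        public final List<Integer> test = new ArrayList<>();

        Fold(String heldOut) {
            this.heldOut = heldOut;
        }
    }

    /**
     * groups.get(i) is the label (e.g. project name) of row i; labels are assumed non-null.
     * Returns one fold per distinct label, in order of first appearance.
     */
    public static List<Fold> split(List<String> groups) {
        // LinkedHashSet keeps the distinct labels in first-seen order.
        Set<String> distinct = new LinkedHashSet<>(groups);
        List<Fold> folds = new ArrayList<>();
        for (String label : distinct) {
            Fold fold = new Fold(label);
            for (int i = 0; i < groups.size(); i++) {
                if (groups.get(i).equals(label)) {
                    fold.test.add(i);   // rows of the held-out group
                } else {
                    fold.train.add(i);  // rows of every other group
                }
            }
            folds.add(fold);
        }
        return folds;
    }

    public static void main(String[] args) {
        // Toy labels standing in for the project-name column.
        List<String> projects = Arrays.asList("A", "B", "A", "C", "B");
        for (Fold f : split(projects)) {
            System.out.println("held out " + f.heldOut
                    + ": train=" + f.train + " test=" + f.test);
        }
    }
}
```

With 15 distinct project names this yields 15 folds, each training on the rows of 14 projects and testing on the rows of the remaining one. The row indices can then be used to build the corresponding training and test datasets.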