I need to make an SVM in Weka to filter documents using Java


I am an absolute beginner. I have never built a classifier or anything else in Weka from Java, although I have used the interface before. Basically, I am kind of lost: I've looked at Weka's filter class and played around with it a little. My documents are text documents, and I need to separate them into 2 categories.

I'm not sure how to define the categories or how to load the documents into an IDE to be classified.

:-(

Any help/tutorials or pointers would be greatly appreciated.


There are 2 answers

0
Stina On BEST ANSWER

I found this Java tutorial very helpful, although I have found very few resources available online:

http://www.cs.waikato.ac.nz/ml/weka/index_documentation.html

Hope this helps.

3
zengr On

Using Weka for the first time is a pain, but you will need to go through it.

Also, I tried out Weka, but I had to drop it because of JVM out-of-memory exceptions. I wrote my own small clustering algorithm in Ruby, and its performance was way better.

Anyway, here is how to use an SVM in Weka:

  1. You can follow this tutorial on how to use an SVM in Weka: www.stat.nctu.edu.tw/~misg/WekaInC.ppt

  2. Next, you will need your data in ARFF format. I recommend it: in my experience it helps, because the data looks more structured from Weka's perspective. You can produce it with XML2ARFF-Converter, which I wrote for myself. You can modify it to read text files and convert your text files to ARFF.
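
To give an idea of what the ARFF step produces, here is a minimal sketch in plain Java (no Weka dependency) that turns labelled text into a two-class ARFF file. It is not the XML2ARFF-Converter mentioned above. The class name `TextToArff`, the attribute names `text` and `class`, and the labels `relevant` / `irrelevant` are all made up for illustration, and the escaping only covers the common cases (backslashes, single quotes, line breaks).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: builds a two-class ARFF document from labelled text. */
class TextToArff {

    /** Quotes a string value for ARFF: escapes backslashes and single quotes, flattens line breaks. */
    static String quote(String text) {
        String escaped = text
                .replace("\\", "\\\\")   // escape backslashes first
                .replace("'", "\\'")     // then escape single quotes
                .replace("\r", " ")
                .replace("\n", " ");
        return "'" + escaped + "'";
    }

    /**
     * @param relation name for the @relation header (assumed to contain no spaces)
     * @param labels   category names (assumed to contain no spaces or commas)
     * @param docs     one entry per document: {document text, category label}
     */
    static String toArff(String relation, List<String> labels, List<String[]> docs) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        // One free-text attribute plus a nominal class attribute listing the categories.
        sb.append("@attribute text string\n");
        sb.append("@attribute class {").append(String.join(",", labels)).append("}\n\n");
        sb.append("@data\n");
        for (String[] doc : docs) {
            sb.append(quote(doc[0])).append(',').append(doc[1]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative data; in practice each text would be read from one of your files.
        List<String[]> docs = new ArrayList<>();
        docs.add(new String[] {"Quarterly sales report", "relevant"});
        docs.add(new String[] {"Don't miss this offer", "irrelevant"});
        System.out.print(toArff("documents", Arrays.asList("relevant", "irrelevant"), docs));
    }
}
```

The nominal `class` attribute is where the two categories get defined. Once the file is loaded into Weka, the `text` attribute would still need to be turned into word features (Weka ships a `StringToWordVector` filter for that) before an SVM such as `weka.classifiers.functions.SMO` can train on it.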