Thanks in advance for the help.
I'm looking for a binary executable that converts an .arff file into a .csv file from within a bash script. Ideally something that I could run along these lines:
#! /bin/sh
... some stuff....
conversionFunc input.arff output.csv
... some more stuff ...
While looking into writing this myself, I found that Weka provides a library that would let me do this. However, I could not find the library itself, even after searching the Weka installation on my Mac.
Does anyone know where I can find such an executable, or can anyone point me to where I could get hold of the Weka Java library so that I can write it myself?
Clone this GitHub repository. It contains an arff2csv tool in the "tools" subdirectory:
https://github.com/jeroenjanssens/data-science-at-the-command-line
arff2csv is designed to run in pipelines of Unix command-line tools.
arff2csv is a one-line shell script that calls another shell script, which in turn calls weka.jar,
so it needs Java installed on your machine. Note that arff2csv needs Weka version 3.6. (In my experiments the newer v3.7 did not work.)
The script wants this environment variable set:
and then you can do
Large ARFF files take some time to process.
You can read Jeroen Janssens's book (see the repo README) for a bit more info.
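If you would rather not clone the repository, a rough sketch of calling Weka's converter class directly from your own script could look like the following. This is not the repo's arff2csv script. The function name arff_to_csv and the WEKA_JAR variable are names I made up for illustration (WEKA_JAR is not necessarily the variable the repo's script expects), and /path/to/weka.jar is a placeholder you have to replace with the real location of weka.jar on your machine. It relies on weka.core.converters.CSVSaver and its -i/-o options; I have not checked it against every Weka version.

```shell
#!/bin/sh
# Hypothetical wrapper around Weka's CSVSaver converter.
# WEKA_JAR is an assumed variable name; point it at your own weka.jar.
WEKA_JAR="${WEKA_JAR:-/path/to/weka.jar}"

# arff_to_csv INPUT.arff OUTPUT.csv
# Returns 2 if it is not given exactly two arguments.
arff_to_csv() {
    if [ "$#" -ne 2 ]; then
        echo "usage: arff_to_csv input.arff output.csv" >&2
        return 2
    fi
    # -i names the ARFF file to read, -o names the CSV file to write.
    java -cp "$WEKA_JAR" weka.core.converters.CSVSaver -i "$1" -o "$2"
}
```

With that function defined in your script, the call from the question would become arff_to_csv input.arff output.csv.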