scikit-learn preparation


I am trying to use the scikit-learn package for semi-supervised classification. I have a file with classes, instances, and features, but I am not sure how to prepare this file for scikit-learn. Could you give some guidelines for file preparation? The tutorial only provides instructions for loading prepared data sets from machine learning repositories. Thank you!


There is 1 answer

joeln

Scikit-learn directly supports some learning-oriented input formats, notably SVMLight. In general, though, its input is a NumPy array (when the data is dense), which can be produced from a wide range of data sources using other tools from the SciPy stack, notably scipy.io and, more pertinently for a text file with columns, the pandas IO tools. You can likely use pandas.read_csv, then pull out the target class column and drop it from the feature set.
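As a rough sketch of that last step: the column names (f1, f2, f3, label) and the inline CSV text below are made up for illustration, and your file's layout will probably differ. It also assumes that an instance with no class has an empty label field. pandas.factorize happens to encode missing values as -1, which is the marker scikit-learn's semi-supervised estimators use for unlabeled samples.

```python
import io

import pandas as pd

# Hypothetical file contents: one row per instance, numeric feature columns,
# and a "label" column holding the class. The last row has no label.
csv_text = """f1,f2,f3,label
5.1,3.5,1.4,a
4.9,3.0,1.4,a
6.2,2.9,4.3,b
5.9,3.0,5.1,
"""

# For a real file, pass its path to read_csv instead of the StringIO object.
df = pd.read_csv(io.StringIO(csv_text))

# Pull out the target class column, then drop it from the feature set.
y_raw = df["label"]
X = df.drop(columns=["label"]).to_numpy()  # dense NumPy array, one row per instance

# Encode class names as integers; missing labels become -1.
y, classes = pd.factorize(y_raw)

print(X.shape)
print(y)
print(list(classes))
```

X and y can then be passed to an estimator's fit method. If your file uses a different delimiter or has no header row, read_csv's sep and header parameters are the places to look.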