I am trying to predict whether a song is played using the open version or not using python, ski kit learn and the LinearSVC method.
My input data:
I already encoded the product column as 1s and 0s (1 if open 0 if not).
Things like context will have an impact on the product type. I was wondering if I need to make all of the categorical variables numerical for LinearSVC to handle them.
In general, turning categorical features into continuous features is a sub-optimal solution.
When using a support vector machine as a classifier (or even logistic regression), there should be no issue with handling categorical features that are 0-1 encoded. In cases where you have categorical features that cannot be converted to binary (e.g., your "context" column), I would recommend one-hot-encoding the data (see here first.
There might be a problem if there are too many unique entries for a particular feature. In that case, the one-hot-encoding will produce as many features as there are unique entries, which could be computationally expensive.