Categorical features with many unique values in a machine learning model


I have a dataset in which most of the features are categorical. I need to predict a numeric column named ‘money’. I know that categorical features have to be converted to numbers before they can go into the model, and that one-hot encoding is a common way to do this. However, most of my categorical features have many unique values, and some have as many as 200. One-hot encoding becomes a problem in that case, because it creates a separate column for each unique value, which I expect will lead to the curse of dimensionality. The task requires that I use all of the features, so I can't simply drop the ones with many unique values. How should I tackle this problem? Which techniques or topics would let me include all of the features, and how should I handle features that have many unique values?
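To make the problem concrete, here is a minimal sketch of what one-hot encoding does to a single feature with 200 unique values. The data and the names (`value_0`, `one_hot`, and so on) are hypothetical and made up for illustration, and it uses plain Python rather than any particular library.

```python
import random

random.seed(0)

# Hypothetical feature with 200 unique values (names are made up for illustration).
categories = [f"value_{i}" for i in range(200)]
column = categories * 5          # 1000 rows, each value appears 5 times
random.shuffle(column)


def one_hot(values):
    """Return (column_names, rows) with one 0/1 column per unique value."""
    levels = sorted(set(values))
    index = {level: i for i, level in enumerate(levels)}
    rows = []
    for v in values:
        row = [0] * len(levels)
        row[index[v]] = 1        # exactly one column is set per row
        rows.append(row)
    return levels, rows


names, encoded = one_hot(column)
print(len(column), "rows,", len(names), "encoded columns from 1 original column")
```

A single categorical column turns into as many columns as it has unique values, and every row is zero everywhere except in one of them. With several such features the width of the encoded table grows quickly.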

So far I have only looked for techniques that convert categorical features into numerical ones.

