I have a highly imbalanced dataset (3% Yes, 97% No) of textual documents, each containing a title and an abstract feature. I have transformed these documents into tf.data.Dataset entities with padded batches. Now, I am trying to train a deep learning model on this dataset. With model.fit() in TensorFlow, you have the class_weight parameter to deal with class imbalance; however, I am searching for the best hyperparameters with the keras-tuner library, and its hyperparameter tuners do not expose such an option. Therefore, I am looking for other ways of dealing with class imbalance.
Is there an option to use class weights in keras-tuner? To add, I am already using the precision@recall metric. I could also try a data resampling method, such as imblearn.over_sampling.SMOTE, but as this Kaggle post mentions:
It appears that SMOTE does not help improve the results; however, it makes the network learn faster. Moreover, there is one big problem: this method is not compatible with larger datasets. You have to apply SMOTE on embedded sentences, which takes far too much memory.
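On the class-weight question itself: keras-tuner's tuner.search() passes extra keyword arguments through to model.fit(), so class_weight can usually be supplied there directly. A minimal sketch, where the weight computation mirrors scikit-learn's "balanced" heuristic (n_samples / (n_classes * count)) and the commented tuner call is illustrative, not taken from your code:

```python
import numpy as np

# Example labels mirroring the imbalance: 97 "No" (0) and 3 "Yes" (1).
y = np.array([0] * 97 + [1] * 3)

# "Balanced" weights: n_samples / (n_classes * class_count), so the rare
# class receives a proportionally larger weight during training.
counts = np.bincount(y)
class_weight = {i: len(y) / (len(counts) * c) for i, c in enumerate(counts)}
print(class_weight)  # the minority class gets a much larger weight

# Hypothetical tuner usage (build_model, train_ds, val_ds are assumptions):
# tuner = keras_tuner.RandomSearch(build_model, objective="val_loss", max_trials=10)
# tuner.search(train_ds, validation_data=val_ds, class_weight=class_weight)
```

The same class_weight dict is the one model.fit() accepts, so no tuner-specific support is needed.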
You could change the evaluation metric to fbeta_score (a weighted F-score).
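For example, with scikit-learn's fbeta_score, choosing beta > 1 weights recall more heavily than precision, which suits a rare positive class; the labels below are made up for illustration:

```python
from sklearn.metrics import fbeta_score

# Toy predictions: precision = 2/3, recall = 1.0 for the positive class.
y_true = [0, 0, 0, 1, 1]
y_pred = [0, 1, 0, 1, 1]

# beta=2 favours recall: F2 = 5 * p * r / (4 * p + r).
score = fbeta_score(y_true, y_pred, beta=2)
print(round(score, 3))
```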
Or, if the dataset is large enough, you could try undersampling the majority class.
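Random undersampling can be sketched with NumPy alone, which avoids the memory problem the Kaggle post raises for SMOTE-style oversampling; the arrays below are placeholders, not your data:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

y = np.array([0] * 97 + [1] * 3)        # imbalanced labels
X = np.arange(len(y)).reshape(-1, 1)    # placeholder features

minority_idx = np.flatnonzero(y == 1)
majority_idx = np.flatnonzero(y == 0)

# Keep all minority examples and an equal-sized random subset of the majority.
kept_majority = rng.choice(majority_idx, size=len(minority_idx), replace=False)
kept = np.concatenate([minority_idx, kept_majority])
rng.shuffle(kept)

X_bal, y_bal = X[kept], y[kept]
print(np.bincount(y_bal))  # balanced counts
```

imblearn.under_sampling.RandomUnderSampler does the same thing with a scikit-learn-style API if you are already using imblearn.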