Getting error: KeyError: 'Only the Series name can be used for the key in Series dtype mappings.' when trying to do pandas Smote algorithm

Question

11.5k views Asked by devdon At 15 December 2020 at 18:42

My data is slightly imbalanced, so I am trying to apply SMOTE before fitting a logistic regression model. When I do, I get the error: KeyError: 'Only the Series name can be used for the key in Series dtype mappings.' Could someone help me figure out why? Here is the code:

import pandas as pd
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split

# Split the features from the 'Count' target column
X = dummies.loc[:, dummies.columns != 'Count']
y = dummies.loc[:, dummies.columns == 'Count']

os = SMOTE(random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
columns = X_train.columns
os_data_X, os_data_y = os.fit_sample(X_train, y_train)  # the KeyError is raised here
os_data_X = pd.DataFrame(data=os_data_X, columns=columns)
os_data_y = pd.DataFrame(data=os_data_y, columns=['Count'])

Thank you!

There are 4 answers

**Maxime** · Answer 1 · 2020-12-15T23:34:35+00:00

I just encountered this problem myself. As it turned out, I had a duplicate column in my dataset. Double-check that your dataset does not have one too.
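The following is a minimal sketch of why a duplicated column name can surface as this particular message. It uses a toy frame with hypothetical column names (`a`, `b`) and only illustrates the pandas behaviour involved. The exact call path inside imbalanced-learn depends on the installed versions, so treat it as one plausible explanation and not a trace of the library internals.

```python
import pandas as pd

# Toy frame with a duplicated column label "a" (hypothetical data).
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "a", "b"])

# Looking up the dtype of a duplicated label returns several entries, not one.
dtypes_for_a = df.dtypes["a"]
print(len(dtypes_for_a))

# A single column (a Series) rejects a dtype mapping with more than one key,
# and this is the message it raises.
single_column = pd.Series([1, 4], name="a")
try:
    single_column.astype({"a": "float64", "other": "float64"})
except KeyError as exc:
    message = exc.args[0]
    print(message)
```

If code that restores dtypes column by column looks up a duplicated label, it can end up handing a multi-entry mapping to a single column, which would produce the error quoted in the question.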

**devdon** · Answer 2 · 2020-12-16T18:11:08+00:00

I actually just fixed this problem! I converted both inputs to matrices before resampling: os_data_X, os_data_y = os.fit_sample(X_train.as_matrix(), y_train.as_matrix())
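On current library versions the same idea needs different method names: `DataFrame.as_matrix()` was removed in pandas 1.0 in favour of `to_numpy()`, and imbalanced-learn samplers expose `fit_resample`. The sketch below wraps the conversion in a hypothetical helper, `resample_as_arrays`, which is not part of either library. It assumes the target column is called `Count`, as in the question.

```python
import numpy as np
import pandas as pd


def resample_as_arrays(sampler, X, y):
    """Hypothetical helper: feed NumPy arrays to any object with fit_resample."""
    X_arr = X.to_numpy()           # replaces the removed DataFrame.as_matrix()
    y_arr = np.asarray(y).ravel()  # flatten a Series or one-column DataFrame to 1-D
    X_res, y_res = sampler.fit_resample(X_arr, y_arr)
    # Re-attach the original column labels to the resampled features.
    return pd.DataFrame(X_res, columns=X.columns), pd.Series(y_res, name="Count")
```

With imbalanced-learn installed, the call would look like `os_data_X, os_data_y = resample_as_arrays(SMOTE(random_state=0), X_train, y_train)`. Passing arrays sidesteps the dtype restoration that happens for DataFrame inputs, but any duplicated column is still present in the data.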

**Muhammad Imran Zaman** · Answer 3 · 2021-03-23T06:19:43+00:00

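Convert your X features into a NumPy array first, then pass them to SMOTE:

import numpy as np
from imblearn.over_sampling import SMOTE

sm = SMOTE()
X = np.array(X)
# fit_resample replaces the older fit_sample alias; flatten y to 1-D first
X, y = sm.fit_resample(X, np.asarray(y).ravel())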

**Beta Ways** · Answer 4 · 2022-10-01T10:08:29+00:00

This error is mainly caused by duplicate columns in your data. To check for duplicate columns, inspect the output of

df.head()

or

df.columns

To fix it, drop the duplicated column(s) with:

df.drop('column_name', axis=1, inplace=True)
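Spotting duplicates by eye gets hard on wide frames, such as the output of `pd.get_dummies`. As a sketch, the check and the fix can be done programmatically with `Index.duplicated()`. The frame and its column names below are made up for illustration.

```python
import pandas as pd

# Hypothetical frame in which the label "age" appears twice.
df = pd.DataFrame([[1, 1, 0], [2, 2, 1]], columns=["age", "age", "Count"])

# Labels that occur more than once.
duplicated_labels = df.columns[df.columns.duplicated()].tolist()
print(duplicated_labels)  # ['age']

# Keep only the first occurrence of each label.
deduplicated = df.loc[:, ~df.columns.duplicated()]
print(list(deduplicated.columns))  # ['age', 'Count']
```

Note that dropping by label, as in `df.drop('age', axis=1)`, removes every column carrying that label, so the boolean mask is the safer choice when one copy should be kept.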
