MCA() returns "array must not contain infs or nans"


I have a dataframe of ones and zeros that acts as metadata describing the properties of some features of the main dataset. As part of a data exploration, I was running the following code on the dataframe to project those feature tags onto a 2D plot.

    import prince

    mca = prince.MCA()
    mca_mtx = mca.fit(tags_df).transform(tags_df)

But during the fit I get the following error:

array must not contain infs or nans

After inspecting the dataframe I see there are no infs or nans in the entire dataset. So the problem must be something else.
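
For reference, a check of this kind can be sketched as follows. The small frame is a made-up stand-in for tags_df, and its column names are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for tags_df: a small frame of 0/1 tags.
tags_df = pd.DataFrame({"tag_a": [1, 0, 1], "tag_b": [0, 0, 1]})

# Convert to a float array so NaN and inf can be tested uniformly.
values = tags_df.to_numpy(dtype=float)

has_nan = bool(np.isnan(values).any())
has_inf = bool(np.isinf(values).any())

print("contains NaN:", has_nan)
print("contains inf:", has_inf)
```

If both flags are False, the input itself is clean and the non-finite values must arise somewhere inside the computation.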

Any idea how to solve this?

There are 2 answers

Ignacio Alorre On

Apparently this is a known bug. The problem lies in the values of the dataframe tags_df: the numeric values 1.0 and 0.0 end up producing inf or nan inside the MCA algorithm.

I tried replacing those 1.0 and 0.0 values with True and False (bool type), without success. However, the string versions, "True" and "False", did the trick. The following line solved my problem:

    tags_df.replace({0: "False", 1: "True"}, inplace=True)

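
One possible reason the string version behaves differently, offered as an assumption rather than something confirmed for prince: pandas' own get_dummies expands only object, string, and category columns by default, so numeric 0/1 columns pass through unchanged while "True"/"False" strings get one indicator column per level. The sketch below shows only that pandas behavior, on a made-up stand-in for tags_df with hypothetical column names:

```python
import pandas as pd

# Hypothetical stand-in for tags_df; the column names are made up.
tags_df = pd.DataFrame({"tag_a": [1.0, 0.0, 1.0], "tag_b": [0.0, 0.0, 1.0]})

# Same replacement as above, but returning a new frame instead of using inplace.
tags_str = tags_df.replace({0: "False", 1: "True"})

# Numeric columns are passed through untouched by get_dummies ...
numeric_cols = list(pd.get_dummies(tags_df).columns)
# ... while string columns get one indicator column per level.
string_cols = list(pd.get_dummies(tags_str).columns)

print(numeric_cols)
print(string_cols)
```

Whether this is what triggers the error depends on how the installed prince version encodes its input, which is not verified here.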
ruben On

I spent many hours on this error and luckily found a solution: I changed the dtype from "category" to "object" using ".astype('object')". I also use the following code:

    import prince

    mca = prince.MCA(
        n_components=3,
        n_iter=3,
        copy=True,
        check_input=True,
        engine='sklearn',
        random_state=42
    )
    mca = mca.fit(df_final_2)
    mca.plot(
        df_final_2,
        x_component=0,
        y_component=1
    )
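
The dtype change itself might look like the sketch below. The frame contents and column names are invented for illustration; only the ".astype('object')" call comes from the answer:

```python
import pandas as pd

# Hypothetical stand-in for df_final_2 with categorical columns.
df_final_2 = pd.DataFrame(
    {"color": ["red", "blue", "red"], "size": ["S", "M", "S"]}
).astype("category")

# Convert every column from "category" to plain Python objects.
df_obj = df_final_2.astype("object")

print(df_final_2.dtypes)
print(df_obj.dtypes)
```

The converted frame (df_obj here) is what would then be passed to fit.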

Hope it helps!