Fill NaN in pandas with per-group averages keyed on another column

I have a dataframe

import pandas as pd

df = pd.DataFrame({
    "species":["cat","dog","dog","cat","cat"],
    "weight":[5,4,3,7,None],
    "length":[12,None,13,14,15],
})

  species  weight  length
0     cat     5.0    12.0
1     dog     4.0     NaN
2     dog     3.0    13.0
3     cat     7.0    14.0
4     cat     NaN    15.0

and I want to fill the missing data with the average for the species, i.e.,

df.loc[1,"length"] = 13   # the average dog length
df.loc[4,"weight"] = 6    # the average cat weight, (5+7)/2

How do I do that?

(Presumably I need to pass a DataFrame as the value argument of df.fillna, but I don't see an easy way to construct that frame.)

There is 1 answer

Answer by Asish M. (best answer)

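Use df.fillna(df.groupby('species').transform('mean')). The transform('mean') call builds a frame with the same index as df in which every row holds the mean of its species group, and fillna takes values from that frame only where df is NaN. It does not modify df in place; it returns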

  species  weight  length
0     cat     5.0    12.0
1     dog     4.0    13.0
2     dog     3.0    13.0
3     cat     7.0    14.0
4     cat     6.0    15.0
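
As a self-contained sketch of the same idea, the snippet below rebuilds the question's frame and fills each gap with its species mean. The names numeric_cols, group_means and filled are illustrative and not part of the original answer, and the explicit column selection is an assumption on my part: it is meant to keep the mean restricted to the numeric columns in case the frame also carries other non-numeric columns.

```python
import pandas as pd

# The frame from the question.
df = pd.DataFrame({
    "species": ["cat", "dog", "dog", "cat", "cat"],
    "weight": [5, 4, 3, 7, None],
    "length": [12, None, 13, 14, 15],
})

# Hypothetical helper name: the columns whose gaps should be filled.
numeric_cols = ["weight", "length"]

# Per-row group means, aligned with df's index (NaN is skipped by mean).
group_means = df.groupby("species")[numeric_cols].transform("mean")

# Work on a copy so the original frame keeps its NaN values.
filled = df.copy()
filled[numeric_cols] = df[numeric_cols].fillna(group_means)

print(filled)
```

If every value in a group is missing for some column, the group mean is itself NaN, so those cells would stay unfilled.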