I am working on a dataset with a few missing values marked as "?". I have to replace them with the most common value (mode) of their column, but I want to write code that does this for the whole dataset at once.
This is what I have so far:
df <- read.csv("mushroom.txt", na.strings = "?",header=FALSE)
Now I am trying to replace all the NA values in the file with the mode of their column. Please help.
Since you want to replace by the mode of a column, you want to operate column-wise via apply and use is.na to identify the values that you want to replace. Note that apply returns a matrix, so if you want a data.frame you would need to convert with as.data.frame.
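The approach described above could be sketched roughly as follows. This is only an illustration, not necessarily the exact code intended: col_mode is a hypothetical helper name, the small data frame is made-up data standing in for the result of read.csv, and the sketch assumes every column has at least one non-missing value.

```r
# Hypothetical helper: the most frequent non-NA value of a vector.
# table() ignores NA by default; on ties, which.max() keeps the first
# entry, which is the alphabetically first value.
col_mode <- function(x) {
  counts <- table(x)
  names(counts)[which.max(counts)]
}

# Made-up stand-in for the data read with na.strings = "?".
df <- data.frame(
  V1 = c("a", "b", NA, "a"),
  V2 = c("x", NA, "y", "y"),
  stringsAsFactors = FALSE
)

# Work column by column (MARGIN = 2) and overwrite only the NA entries.
filled <- apply(df, 2, function(x) {
  x[is.na(x)] <- col_mode(x)
  x
})

# apply() returned a character matrix, so convert back to a data.frame.
df_filled <- as.data.frame(filled, stringsAsFactors = FALSE)
```

Because apply first coerces the data frame to a character matrix, this suits an all-categorical dataset; numeric columns would come back as strings and need converting again.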