Fill in missing values with mode in SAS

1.5k views Asked by At

I think the logic to replace missingness is quite clear but when I dump it to SAS I find it too complicated to start with.

1

There are 1 answers

0
JJFord3 On BEST ANSWER

Given no code was provided, I'll give you some rough directions to get you started, but put it on you to determine any specifics.

First, lets create a month column for the data and then calculate the modes for each key for each month. Additionally, lets put this new data in its own dataset.

data temp;
   set original_data;
   month = month(date);
run;

proc univariate data=temp modes;
   var values;
   id key month;
   out=mode_data;
run;

However, this procedure calculates the mode in a very specific way that you may not want (defaults to the lowest in the case of a tie and produces no mode if nothing occurs at least twice) Documentation: http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_univariate_sect027.htm

If that doesn't work for you, I recommend using proc sql to get a count of each key, month, value combination and calculating your own mode from there.

proc sql;
   create table mode_data as select distinct
   key, month, value, count(*) as distinct_count
   from temp
   group by key, month, value;
quit;

From there you might want to create a table containing all months in the data.

proc sql;
   create table all_months as select distinct month
   from temp;
quit;

Don't forget to merge back in any missing months from to the mode data and use the lag or retain functions to search previous months for "old modes".

Then simply merge your fully populated mode data back to the the temp dataset we created above and impute the missing values to the mode when value is missing (i.e. value = .)

Hope that helps get you started.