applying a loop to iteratively recode variables based on a macro variable (call symput/symget)

145 views Asked by At

I have a dataset that looks like this, I am going to call it data1:

data1

I want to recode the values for the ss, out1 and out2 variables when dv=1 to the values within each "var" group when dv=0. The outcome dataset I want looks like this, let's call it data2:

data2

I was able to get the result using the following code:

data _null_; 
    set data1;
     if dv=0 then do;
  if var=0 then call symput("out1_0",out1);
  if var=0 then call symput("out2_0",out2);
  if var=0 then call symput("ss_0",ss);

  if var=1 then call symput("out1_1",out1);
  if var=1 then call symput("out2_1",out2);
  if var=1 then call symput("ss_1",ss);

  if var=2 then call symput("out1_2",out1);
  if var=2 then call symput("out2_2",out2);
  if var=2 then call symput("ss_2",ss);

  if var=3 then call symput("out1_3",out1);
  if var=3 then call symput("out2_3",out2);
  if var=3 then call symput("ss_3",ss);

  if var=4 then call symput("out1_4",out1);
  if var=4 then call symput("out2_4",out2);
  if var=4 then call symput("ss_4",ss);
     end;
run;

data data2; set data1;
    if dv=1 then do;
  if var=0 then out1=symget("out1_0");
  if var=0 then out2=symget("out1_0");
  if var=0 then ss=symget("ss_0");

  if var=1 then out1=symget("out1_1");
  if var=1 then out2=symget("out2_1");
  if var=1 then ss=symget("ss_1");

  if var=2 then out1=symget("out1_2");
  if var=2 then out2=symget("out2_2");
  if var=2 then ss=symget("ss_2");

  if var=3 then out1=symget("out1_3");
  if var=3 then out2=symget("out2_3");
  if var=3 then ss=symget("ss_3");

  if var=4 then out1=symget("out1_4");
  if var=4 then out2=symget("out2_4");
  if var=4 then ss=symget("ss_4");
    end;
 run;

I was wondering if there was a more efficient way to do it? For example, if the "var" variable has values from 0 to 20, instead of from 0 to 4 in this example, then I would want to use a loop to recode based on the number of levels for "var".

Any help is extremely appreciated, thank you!

1

There are 1 answers

1
Tom On

You shouldn't be using macro variables for this at all. You could just join the data with itself and select the appropriate value based on the value of DV.

proc sql ;
  create table data2 as 
    select a.var
         , a.dv
         , case when (a.dv=1) then b.out1 else a.out1 end as out1
         , case when (a.dv=1) then b.out2 else a.out2 end as out2
         , case when (a.dv=1) then b.ss else a.ss end as ss
    from data1 a 
    left join (select * from data1 where dv=0) b
    on a.var = b.var 
  ;
quit;

If you did want to generate a lot of macro variables you could save a lot of repetitive coding by generating the macro variable names dynamically from the value of VAR.

call symputx(cats('out1_',var),out1);
call symputx(cats('out2_',var),out2);
call symputx(cats('ss_',var),ss);