SAS create a table only if there is data available

43 views Asked by At

I have some data in SAS that I am performing QA on. I know I can output data to different tables using IF statements etc. What I want to do is output data to a table called 'error_data' if it fails a check; however I don't want the errors table to be created if there is nothing in it.

e.g.

data good_data1 error_data1;
  set sashelp.cars;
  if drivetrain in("Front", "Rear", "All") then output good_data1;
  else output error_data1;
run;

In this example, I want the table 'error_data' to either not be created, or deleted as there are no results in it.

The data I'm using will perform this sort of check over multiple different tables using a macro, so there will be multiple versions of 'good_data' and 'error_data'. I only want to see the 'error_data' table in the 'Output Data' list if there's something in it so that I don't have to click through every version of it to see if there's any items in them.

This processed will be wrapped in a macro, so if another step is required to remove 'error_data' from the 'Output Data' list if it contains no observations, then that would be fine.

2

There are 2 answers

0
Stu Sztukowski On

The dataset will be created whether there are errors or not, but you can easily delete it after the fact by checking if the number of observations is 0. You can create a macro function that does this for any given dataset.

%macro delete_data(data);
    %let dsid = %sysfunc(open(&data));
    %let nobs = %sysfunc(attrn(&dsid, nlobs));
    %let rc   = %sysfunc(close(&dsid));

    %if(&nobs = 0) %then %do;
        %put NOTE: &data has no observations and will be deleted.;
        proc delete data=&data;
        run;
    %end;
%mend;

data good_data1 error_data1;
  set sashelp.cars;
  if drivetrain in("Front", "Rear", "All") then output good_data1;
  else output error_data1;
run;

%delete_data(error_data1);
NOTE: error_data1 has no observations and will be deleted.

NOTE: Deleting WORK.ERROR_DATA1 (memtype=DATA).
NOTE: PROCEDURE DELETE used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
0
Tom On

This method does not require any macro logic.

If you don't mind adding an extra variable to track whether you wrote any observations (risk is that the extra variable's name might conflict with an existing variable) then you can conditionally run CALL EXECUTE() to push the code to remove the dataset.

data good_data1 error_data1;
  if eof and not anybad then call execute("proc delete data=error_data1;run;");
  set sashelp.cars end=eof;
  if drivetrain in("Front", "Rear", "All") then output good_data1;
  else do;
    anybad+1;
    output error_data1;
  end;
  drop anybad;
run;

Example

1    data good_data1 error_data1;
2      if eof and not anybad then call execute("proc delete data=error_data1;run;");
3      set sashelp.cars end=eof;
4      if drivetrain in("Front", "Rear", "All") then output good_data1;
5      else do;
6        anybad+1;
7        output error_data1;
8      end;
9      drop anybad;
10   run;

NOTE: The data set WORK.GOOD_DATA1 has 428 observations and 15 variables.
NOTE: The data set WORK.ERROR_DATA1 has 0 observations and 15 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


NOTE: CALL EXECUTE generated line.
1   + proc delete data=error_data1;run;

NOTE: Deleting WORK.ERROR_DATA1 (memtype=DATA).
NOTE: PROCEDURE DELETE used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


11   data good_data1 error_data1;
12     if eof and not anybad then call execute("proc delete data=error_data1;run;");
13     set sashelp.cars end=eof;
14     if drivetrain in( "Rear", "All") then output good_data1;
15     else do;
16       anybad+1;
17       output error_data1;
18     end;
19     drop anybad;
20   run;

NOTE: The data set WORK.GOOD_DATA1 has 202 observations and 15 variables.
NOTE: The data set WORK.ERROR_DATA1 has 226 observations and 15 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds