Reverse cumulative sum, by group in SAS

439 views Asked by At

I have a variable that is reported daily and cumulatively in which I want raw counts for. I found code for doing this here, however the data I have are grouped. When I use "by" the group within the datastep, it still doesn't appear to recognize to start over with the new group, and therefor gives me a negative number for the first instance in the proceeding group.

This is the dummy code for what I am doing:

data have;
    input group $ date :yymmdd10. count_cumu;
    format date ddmmyy10.;
    datalines;
A 2022-03-01 2
A 2022-03-02 8
A 2022-03-03 16
A 2022-03-04 22
B 2022-03-1 3
B 2022-03-2 8
B 2022-03-3 20
B 2022-03-4 26
run;

proc sort data=have;
    by group date;
run;

data want;
    set have;
    count_raw = coalesce(dif(count_cumu), count_cumu);
    by group;
run;

Which outputs this:

output have

But I want the output to look like this (no negative for first instance in proceeding groups):

output want

Thank you for any advice or code you may be able to share!

2

There are 2 answers

3
data _null_ On BEST ANSWER
data want;
   set have;
   by group;
   count_raw = coalesce(dif(count_cumu), count_cumu);
   if first.group then count_raw = count_cumu;
   run;
0
Negdo On

You can also do this using proc SQL. I don't know why you would want to do it this way, and someone who is not completely new in SAS(unlike me) could probably explain why data step solution is better(and how to improve this proc SQL one). But here you go:

ods listing close;
ods output sql_results=num_have;
proc sql number;
    select *
        from work.have
        group by group
        order by group, date
    ;
quit;
ods listing;
proc sql;
    create table want as
        select *, (select count_cumu from num_have where row=a.row-1) as lag, 
        case when count_cumu < calculated lag then count_cumu 
        when calculated lag= . then count_cumu
        else (count_cumu - calculated lag ) end as count_raw
            from num_have as a
    ;
quit;