Incrementing a field value, conditionally, over data cycles - is there a way w/ just SQL?

1.8k views Asked by At

Is there a way to get a conditional increment based on cycles in data, and carry it forward in a query field? This is for Oracle, 11G2.

I have monthly activity data (40+ years) for individuals, tracking changes to a membership. Individuals can join the service, remain for an indefinite period of time, quit, and rejoin later. The changes have a specific code for when they leave the service. I am trying to find a way to follow these quit/ rejoin cycles for each individual, and increment a column value, showing how many times they've been in the service, and associating the history records during the period.

Using analytic functions, I can detect their first appearance, and when they leave, and when they're BACK in after leaving. I haven't figured out a way to use this toincrement a column and carry that value forward until the end for that individual.

It's not a sequence, because each individual starts at "1". Row number hasn't worked for me for the logic/partition combinations that I've tried. I've tried subqueries with CASE statements on the change code and then using LAST_VALUE to carry those forward in the outer query - but I'm just not finding a way to get the right increment or carry it forward. I just not getting it.

I have fiddled this, with the core query that I've started with.
http://sqlfiddle.com/#!4/65d49/1/0

select recno, uniq, 
   row_number() over (PARTITION by uniq 
                      order by sym , mchty ) histrec,
   sym, 
   mchty, 
   lag( mchty, 1, '99')  over ( PARTITION by uniq 
                                  order by sym ) premchty, 
   (case
      when lag( mchty, 1, '99') over ( PARTITION by uniq 
                                       order by sym  ) = '99'
          then 1
      end ) join_svc,
   (case
      when  lag( mchty, 1, '99') over ( PARTITION by uniq 
                                        order by sym  ) = '6'
          then 1
      end ) rejoin_svc,
      svc svc_num
 from demo_history ;
  • RecNo - the record number from the source table.

    UNIQ - the unique identifier for an individual.

    SYM - the date of a change, an ALPHANUMERIC date string, 'YYYYMM' - I handle some invalid months in other code.

    MCHTY - the activity Change code. Activity codes can be '0' to '6', with '6' being the "leaves the service" indicator.

The records are ordered by the date of the change, then by the type of the change.

In the example query results,

  • HISTREC - the row number for an individual history

    PREMCHTY - the most recent Change code (Lag)

    JOIN_SVC - first record in a set

    REJOIN_SVC - first record after leaving

    SVC (or SVC_NUM in the example) is what I'm trying to generate - the "nth" time in the service as I think it should be.

How can I get query/calculate/generate the contents of the SVC field, incrementing it for each new period of time, after an individual leaves ('6')?

Ultimately, the unique ID and incremented svc number combination will be used to create a master record for each time an individual is in the service.

Background: I'm trying to replace a lot of procedural code and logic left over from PL/1 procedural days, but "updated" by putting it in PL/SQL procedures with multi-nested cursors, and sharing field values across a set of records through IN/OUT parameters.
Data size is about 5 millions of records, about 270k individual IDs. I'd like to handle the individual's history records as a set, using SQL to replace most of the field transformations. If I'm approaching this wrong, or if there's a better way, then let me know.

2

There are 2 answers

1
Noel On BEST ANSWER

You are close to the solution. You just need to use SUM as analytic function over the rejoin_svc column. But this will give you svc numbers starting from 0. So, just add 1.

select recno,
uniq,
sym,
mchty,
sum(rejoin_svc) over (PARTITION by uniq order by sym) + 1 svc_num,
svc
from
(
  select recno, uniq, 
       sym, 
       mchty, 
       (case
          when  lag( mchty, 1, '99') over ( PARTITION by uniq 
                                            order by sym  ) = '6'
              then 1
              else 0
          end ) rejoin_svc,
        svc
from demo_history
)
order by recno, uniq, svc;

sqlfiddle

2
Ponder Stibbons On

Recursive solution:

with t as (
    select recno, uniq, sym, mchty, svc, 
        row_number() over (partition by uniq order by sym, recno) rn
      from demo_history),
  u (recno, uniq, sym, mchty, rn, svc, new_svc) as (
    select recno, uniq, sym, mchty, rn, svc, 1 new_svc from t where rn = 1
    union all 
    select t.recno, t.uniq, t.sym, t.mchty, t.rn, t.svc, 
        case when u.mchty= '6' then u.new_svc+1 else u.new_svc end
      from t join u on t.uniq = u.uniq and t.rn = u.rn+1 )
select recno, uniq, sym, mchty, rn, svc, new_svc
  from u order by uniq, recno

SQLFiddle

Answer given by user @EatÃ…Peach is probably what you should use here, it is fast and suits your needs.

But there is other possibility worth mentioning and answering your question: Is there a way to get a conditional increment based on cycles in data, and carry it forward in a query field? - recursive CTE introduced in Oracle 11g.

The main part is subquery u unioning first rows for each uniq and cycling next rows with your row_number. For each row I am checking if previous value of mchty was '6' and if so - incrementing new_svc.