SQL query to loop through records

388 views Asked by At

I have a table with a million records. This is the structure of the table with some example data points -

patient   claim   thru_dt   cd   start     
322       65      20200201  42   20181008  
322       65      20200202  42             
322       95      20200203  52             
122       05      20200105  23             
122       05      20200115  42   20190102  
122       05      20200116  42            

I need to write a query that would produce this output -

patient   claim   thru_dt   cd   start     
322       65      20200201  42   20181008    
322       65      20200202  42   20181008    
322       95      20200203  52   20181008    
122       05      20200105  23               
122       05      20200115  42   20190102    
122       05      20200416  42      

The reason the second claim of patient 322 was given 20181008 is because both the first and the second one have the same cd value.

The reason the third claim of patient 322 was also given 20181008 value even though it doesn't have the same cd value is because it is the last claim for the patient.

The reason the first claim of patient 122 is still a NULL is because that claim's cd value is not equal to 42.

The reason the third claim of patient 122 was NOT given a value of 20190102 even though it has the same cd value is because the thru_dt in their prior claim is more than 30 days apart.

This is what I have tried so far -

--This orders claims using row_number
DECLARE @min_record int;
DECLARE @max_record int;

select 
    @min_record = MIN(row_num), 
    @max_record = MAX(row_num) 
from 
(
    select *, 
    row_number() over(partition by patient order by thru_dt) as row_num
    from 
    table
)

while @min_record <= @max_record
begin 
    --Logic I need help with 
    SET @min_record = @min_record + 1
end 
1

There are 1 answers

1
GMB On BEST ANSWER

I think a lateral join and conditional expressions make it simpler to implement the logic you want:

select t.*,
    case 
        when t.start is null and (
            s.cd = t.cd 
            or row_number() over(partition by t.patient order by t.thru_dt desc) = 1
        )
        then s.start
        else t.start
    end new_start
from mytable t
outer apply (
    select top (1) s.*
    from mytable s
    where 
        s.patient = t.patient 
        and s.start is not null
        and s.thru_dt >= dateadd(day, -30, t.thru_dt)
    order by s.thru_dt desc
) s
order by patient desc, thru_dt

Demo on DB Fiddle:

patient | claim | thru_dt    | cd | start      | new_start 
------: | ----: | :--------- | -: | :--------- | :---------
    322 |    65 | 2020-02-01 | 42 | 2018-10-08 | 2018-10-08
    322 |    65 | 2020-02-02 | 42 | null       | 2018-10-08
    322 |    95 | 2020-02-03 | 52 | null       | 2018-10-08
    122 |     5 | 2020-01-05 | 23 | null       | null      
    122 |     5 | 2020-01-15 | 42 | 2019-01-02 | 2019-01-02
    122 |     5 | 2020-04-16 | 42 | null       | null