SQL Databricks - LAG function BY distinct values

106 views Asked by At

I have a table with two columns (number and id) and would like to calculate value_needed column.

Column definition - replace 8 by previous number + 1.

But by using simple LAG function I receive that output:

number id value_current value_needed
1 123 1 1
1 123 1 1
2 123 2 2
8 123 3 3
8 123 9 3

I need to achive value_needed column (for all 8 have '3' as an output)

My code :

CASE WHEN number = 8 
           THEN LAG(number) OVER (PARTITION BY (id) ORDER BY number) + 1
           ELSE number

I need to define that in CASE statement without WHERE condition. I've tried to add case when to partition by but it doesn't work properly

1

There are 1 answers

0
JNevill On

Given your data and the desired output, it sounds like DENSE_RANK() is the agg window function you want:

 DENSE_RANK() OVER (PARTITION BY id ORDER BY number)

An example using your sample data:

WITH cte AS (
  SELECT 1 as num, 123 as id 
  UNION ALL SELECT 1, 123
  UNION ALL SELECT 2, 123
  UNION ALL SELECT 8, 123 
  UNION ALL SELECT 8, 123 
)
SELECT num, id, DENSE_RANK() OVER (PARTITION BY id ORDER BY num) as value_needed
FROM cte

Results:

+-----+-----+--------------+
| num | id  | value_needed |
+-----+-----+--------------+
|   1 | 123 |            1 |
|   1 | 123 |            1 |
|   2 | 123 |            2 |
|   8 | 123 |            3 |
|   8 | 123 |            3 |
+-----+-----+--------------+