From the table below, I am trying to create another column (last_k2X) that shows the last key2 whose type was set to 'X', over time (ts).
If X was set at a time (ts) shared by multiple key2 values (in the same key1 partition), last_k2X should hold that key2 for every row in that partition/time.
Input:
| key1 | key2 | ts | type |
|---|---|---|---|
| 1 | A | t0 | |
| 1 | B | t1 | a |
| 1 | C | t1 | X |
| 1 | D | t2 | b |
| 1 | E | t3 | |
| 1 | F | t4 | c |
| 1 | G | t5 | X |
| 1 | H | t5 | |
| 1 | I | t6 | d |
I tried window functions such as FIRST_VALUE() and LAG() but could not get the right results. I expected a result like this:
Expected output:
| key1 | key2 | ts | type | last_k2X |
|---|---|---|---|---|
| 1 | A | t0 | | |
| 1 | B | t1 | a | C |
| 1 | C | t1 | X | C |
| 1 | D | t2 | b | C |
| 1 | E | t3 | | C |
| 1 | F | t4 | c | C |
| 1 | G | t5 | X | G |
| 1 | H | t5 | | G |
| 1 | I | t6 | d | G |
As you didn't specify the database system you are using, this solution is in MySQL, but it can easily be converted to other database systems.
This is basically a gaps-and-islands problem; the key is getting the right partitions: each island starts at the timestamp where an 'X' appears and runs until the next one.
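The original query is not reproduced in this post, so here is a minimal sketch of one way to build those partitions, not necessarily the query from the answer. It assumes a hypothetical table name `events`, stores blank types as NULL, and is wrapped in Python's `sqlite3` only so it can be run self-contained (it needs SQLite 3.25+ for window functions). The SQL uses only standard window functions, so it should also work on MySQL 8+, though I have not verified that here. A running count of 'X' rows ordered by ts gives the island id; because the default window frame includes rows with the same ts, rows sharing a timestamp with an 'X' row land in the same island. If several 'X' rows share a ts, `MAX` picks the largest key2, which is an assumption about the tie-break you want.

```python
import sqlite3

# Inner query: running count of 'X' rows per key1, ordered by ts.
# The default frame (RANGE ... CURRENT ROW) includes peers with the same ts,
# so rows that share a timestamp with an 'X' row get the same island id (grp).
# Outer query: within each island, pick the key2 of the 'X' row.
QUERY = """
SELECT key1, key2, ts, type,
       MAX(CASE WHEN type = 'X' THEN key2 END)
           OVER (PARTITION BY key1, grp) AS last_k2X
FROM (
    SELECT key1, key2, ts, type,
           COUNT(CASE WHEN type = 'X' THEN 1 END)
               OVER (PARTITION BY key1 ORDER BY ts) AS grp
    FROM events
) AS s
ORDER BY key1, ts, key2
"""

# Sample data from the question; None stands for a blank type.
SAMPLE_ROWS = [
    (1, "A", "t0", None),
    (1, "B", "t1", "a"),
    (1, "C", "t1", "X"),
    (1, "D", "t2", "b"),
    (1, "E", "t3", None),
    (1, "F", "t4", "c"),
    (1, "G", "t5", "X"),
    (1, "H", "t5", None),
    (1, "I", "t6", "d"),
]


def last_k2x(rows):
    """Load rows into an in-memory table named events (hypothetical) and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (key1 INTEGER, key2 TEXT, ts TEXT, type TEXT)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in last_k2x(SAMPLE_ROWS):
        print(row)
```

Rows before the first 'X' timestamp fall into island 0, which contains no 'X' row, so their last_k2X comes out as NULL.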
fiddle