GCP - BigQuery with multiple conditions

I have a table called StockList, which keeps track of available stock details.

I initially inserted all the available stock (on 04-12-2022). My StockList table looks like this:

[image: StockList table after the initial load on 04-12-2022]

After the initial load, every day (at 12 AM) I insert records into this table based on stock availability, i.e. on the next day my table looks like this:

[image: StockList table after the next day's load]

This means:

  • StockId 12121: the size has changed
  • StockId 13131: deleted
  • StockId 14141: the colour has changed
  • StockId 15151: same as yesterday
  • StockId 16161: newly added stock

Similarly, I will keep updating this table every day at 12 AM, with the action date set to that particular date. I am not maintaining an action type (inserted/updated/deleted) or an is_deleted column.

So now I want a report of the inserted, updated and deleted stock on a particular day. E.g. on 04-13-2022 I need a report like this:

[image: expected report for 04-13-2022 listing inserted, updated and deleted stock]

Is it possible to get this in a single query? If yes, how can I write a single optimized query for it? I am using Google BigQuery.

I am trying something like this:

WITH
  today AS (
  SELECT
    *
  FROM
    `myProject.Mydataset.StockList`
  WHERE
    ActionDate = '04-13-2022' ),
  yesterday AS (
  SELECT
    *
  FROM
    `myProject.Mydataset.StockList`
  WHERE
    ActionDate = '04-12-2022' )
.
.
.
.


I am not sure about this. Can anyone help? Thank you :)

There is 1 answer

C.Georgiadis answered:

This query can produce the desired report:

WITH
  prep AS (
  SELECT
    StockId,
    Name,
    Size,
    Color,
    ActionDate,
    CreatedTs,
    -- most recent ActionDate in the whole table
    MAX(ActionDate) OVER () AS latest_action_date,
    -- rn = 1 marks the latest row of each StockId
    ROW_NUMBER() OVER (PARTITION BY StockId ORDER BY ActionDate DESC) AS rn,
    -- Size and Color from the previous row of the same StockId
    LAG(Size) OVER (PARTITION BY StockId ORDER BY ActionDate ASC) AS previous_size,
    LAG(Color) OVER (PARTITION BY StockId ORDER BY ActionDate ASC) AS previous_color
  FROM
    `myProject.Mydataset.StockList`)
SELECT * FROM (
  SELECT
    StockId,
    Name,
    CASE
      WHEN latest_action_date = DATE(CreatedTs) THEN 'Inserted'
      WHEN latest_action_date <> ActionDate THEN 'Deleted'
      WHEN Size <> previous_size OR Color <> previous_color THEN 'Updated'
      ELSE 'none'
    END AS Action
  FROM
    prep
  WHERE rn = 1)
WHERE Action <> 'none'

The idea is to get the latest row for each StockId and then have a CASE-WHEN clause for each Action:

  • If the latest row for a StockId was created on the same date as the latest ActionDate, then it was just inserted
  • If the latest row for a StockId has an ActionDate different from the most recent ActionDate observed in the table, then it has been deleted
  • If the latest row for a StockId has a different Size or Color value from the previous row for the same StockId, then it has been updated
  • In any other case we return a dummy 'none' value, which we exclude in the final SELECT statement
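
For a report on one specific day, the today/yesterday comparison started in the question could also be finished with a FULL OUTER JOIN. This is a different technique from the window-function query above, and only a minimal sketch: it assumes ActionDate is a DATE column, that each StockId has at most one row per ActionDate, and that a snapshot exists for the previous calendar day. The variable report_date is a hypothetical script parameter, not something from the original table.

```sql
-- Sketch only. Assumptions: ActionDate is a DATE column, each StockId has at
-- most one row per ActionDate, and the previous day's snapshot exists.
DECLARE report_date DATE DEFAULT DATE '2022-04-13';  -- hypothetical parameter

WITH
  today AS (
  SELECT StockId, Name, Size, Color
  FROM `myProject.Mydataset.StockList`
  WHERE ActionDate = report_date ),
  yesterday AS (
  SELECT StockId, Name, Size, Color
  FROM `myProject.Mydataset.StockList`
  WHERE ActionDate = DATE_SUB(report_date, INTERVAL 1 DAY) )
SELECT
  COALESCE(t.StockId, y.StockId) AS StockId,
  COALESCE(t.Name, y.Name) AS Name,
  CASE
    WHEN y.StockId IS NULL THEN 'Inserted'  -- present today only
    WHEN t.StockId IS NULL THEN 'Deleted'   -- present yesterday only
    ELSE 'Updated'                          -- on both days; the WHERE keeps only changed rows
  END AS Action
FROM today AS t
FULL OUTER JOIN yesterday AS y
  ON t.StockId = y.StockId
WHERE
  t.StockId IS NULL
  OR y.StockId IS NULL
  OR t.Size IS DISTINCT FROM y.Size    -- NULL-safe comparison
  OR t.Color IS DISTINCT FROM y.Color
```

Because both CTEs filter on ActionDate, this version reads only two days of data when the table is partitioned on that column, and it reports a stock as deleted only on the day it disappears. Like the query above, it compares only Size and Color.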