SQL query to reduce number of rows using grouping values in lesser number of rows

65 views Asked by At

I have set of data which looks like below

Given Data

I want to reduce the rows in it. If possible to reduce up to single row then it would be considered as best case.

Final data set should something like this Expected data

I tried Min and Max functions in SQL. But I don't want to loose any data. Just want to fill blank cells so that readability of report.

If anyone need sample data then you may use below query

select * from
(
SELECT 9999 as primary_key ,1 AS column_1, 2 as column_2, NULL  as column_3, NULL as column_4 FROM DUAL
UNION ALL
SELECT 9999, NULL, NULL, 3, 4 FROM DUAL
UNION ALL
SELECT 9999, 5, 6, NULL, NULL FROM DUAL
)
1

There are 1 answers

0
Adrian Maxwell On

I suggest working the current data into an "unpivoted state" that excludes NULL values and whilst doing this add a row number into the mix. Then once that is available then re-pivot that interim result into the wanted final format using both the primary_key and the row number:

WITH UNPIV AS (
    SELECT PRIMARY_KEY, 'column_1' AS COLUMN_NAME, column_1 AS COLUMN_VALUE
           , ROW_NUMBER() OVER(PARTITION BY PRIMARY_KEY ORDER BY column_1) AS RN
    FROM your_table
    WHERE column_1 IS NOT NULL
    
    UNION ALL
    
    SELECT PRIMARY_KEY, 'column_2' AS COLUMN_NAME, column_2 AS COLUMN_VALUE
           , ROW_NUMBER() OVER(PARTITION BY PRIMARY_KEY ORDER BY column_2) AS RN
    FROM your_table
    WHERE column_2 IS NOT NULL
    
    UNION ALL
    
    SELECT PRIMARY_KEY, 'column_3' AS COLUMN_NAME, column_3 AS COLUMN_VALUE
           , ROW_NUMBER() OVER(PARTITION BY PRIMARY_KEY ORDER BY column_3) AS RN
    FROM your_table
    WHERE column_3 IS NOT NULL
    
    UNION ALL
    
    SELECT PRIMARY_KEY, 'column_4' AS COLUMN_NAME, column_4 AS COLUMN_VALUE
           , ROW_NUMBER() OVER(PARTITION BY PRIMARY_KEY ORDER BY column_4) AS RN
    FROM your_table
    WHERE column_4 IS NOT NULL
    )
SELECT PRIMARY_KEY, 
       MAX(CASE WHEN COLUMN_NAME = 'column_1' THEN COLUMN_VALUE END) AS COLUMN_1,
       MAX(CASE WHEN COLUMN_NAME = 'column_2' THEN COLUMN_VALUE END) AS COLUMN_2,
       MAX(CASE WHEN COLUMN_NAME = 'column_3' THEN COLUMN_VALUE END) AS COLUMN_3,
       MAX(CASE WHEN COLUMN_NAME = 'column_4' THEN COLUMN_VALUE END) AS COLUMN_4
FROM UNPIV
GROUP BY PRIMARY_KEY, RN
ORDER BY PRIMARY_KEY, RN
PRIMARY_KEY COLUMN_1 COLUMN_2 COLUMN_3 COLUMN_4
9999 1 2 3 4
9999 5 6 null null

fiddle

Note the arrangement of values into these rows is determined by the row_number() calculation. So there is no guarantee that values will re-align into the same rows that they originated from.