Distinct on column using custom order

595 views Asked by At

I have a DB view.

At this point it returns rows which can have duplicate entries in id column. I would like to do the following:

SELECT DISTINCT ON (id) USING DESCRIBED BELOW RULE -- of course it is not valid sql
  id
  type

type column can have 3 values:

  • x
  • y
  • z

If there are rows with identical id but different type, the rule for DISTINCT is as follows:

  • select row with type x first
  • select row with type y if no x
  • select row with type z if no x and no y

So if I have 3 rows:

id    type
1     'y'
1     'x'
2     'z'

The expected result will be

id    type
1     'x'
2     'z'

Is there a way to achieve this?

2

There are 2 answers

3
Kaushik Nayak On BEST ANSWER

You may put the hierarchy in a CASE expression in ORDER BY

SELECT DISTINCT ON (id) * 
FROM   ( VALUES (1, 'y'), 
                (1, 'x'), 
                (2, 'z') ) s(id, type) 
ORDER  BY id, 
          CASE type 
            WHEN 'x' THEN 1 
            WHEN 'y' THEN 2 
            WHEN 'z' THEN 3 
          END; 

Demo

2
S-Man On

If your values are alphabetically ordered in real (as in your example) it would be quiet simple - you only have to order the values:

SELECT DISTINCT ON (id)
    id,
    type
FROM mytable m
ORDER BY id, type

demo:db<>fiddle

If not you'll need a table were the ranked values and their ranked order are stored. Or you simulate it with a subquery as I did:

SELECT DISTINCT ON (id)
    id,
    type
FROM mytable m
JOIN (VALUES (1, 'x'), (2, 'y'), (3, 'z')) AS r (rank_id,value)
ON r.value = m.type
ORDER BY id, r.rank_id

If your type data is really fixed you could think about an enum type which is an ordered type by default. The query would look like the first above:

demo: db<>fiddle

But notice that changing enums (adding, deleting, rearranging values) could be very hard. So it is only recommended if the type is really fixed. Otherwise take a separate table with foreign keys which takes you to back to the second part.