Aggregate columns with additional (distinct) filters


This code works as expected, but it's long and creepy.

select p.name, p.played, w.won, l.lost from

(select users.name, count(games.name) as played
from users
inner join games on games.player_1_id = users.id
where games.winner_id > 0
group by users.name
union
select users.name, count(games.name) as played
from users
inner join games on games.player_2_id = users.id
where games.winner_id > 0
group by users.name) as p

inner join

(select users.name, count(games.name) as won
from users
inner join games on games.player_1_id = users.id
where games.winner_id = users.id
group by users.name
union
select users.name, count(games.name) as won
from users
inner join games on games.player_2_id = users.id
where games.winner_id = users.id
group by users.name) as w on p.name = w.name

inner join

(select users.name, count(games.name) as lost
from users
inner join games on games.player_1_id = users.id
where games.winner_id != users.id
group by users.name
union
select users.name, count(games.name) as lost
from users
inner join games on games.player_2_id = users.id
where games.winner_id != users.id
group by users.name) as l on l.name = p.name

As you can see, it consists of 3 repetitive parts for retrieving:

  • player name and the number of games they played
  • player name and the number of games they won
  • player name and the number of games they lost

And each of those also consists of 2 parts:

  • player name and the number of games in which they participated as player_1
  • player name and the number of games in which they participated as player_2

How could this be simplified?

The result looks like so:

           name            | played | won | lost 
---------------------------+--------+-----+------
 player_a                  |      5 |   2 |    3
 player_b                  |      3 |   2 |    1
 player_c                  |      2 |   1 |    1

There are 3 answers

Erwin Brandstetter (best answer)

Postgres 9.4 or newer

Use the standard-SQL aggregate FILTER clause:

SELECT u.name
     , count(*) FILTER (WHERE g.winner_id  > 0)    AS played
     , count(*) FILTER (WHERE g.winner_id  = u.id) AS won
     , count(*) FILTER (WHERE g.winner_id <> u.id) AS lost
FROM   games g
JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
GROUP  BY u.name;

Only rows that pass the boolean expression in the FILTER clause contribute to the aggregate.
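
To see the FILTER clause at work, here is a minimal sketch with made-up sample data. The table and column names follow the question; the rows themselves, and the use of NULL in winner_id for a game without a winner, are assumptions for illustration only.

```sql
-- Hypothetical sample data; only the table and column names come from the question.
CREATE TEMP TABLE users (id int PRIMARY KEY, name text);
CREATE TEMP TABLE games (name text, player_1_id int, player_2_id int, winner_id int);

INSERT INTO users VALUES (1, 'player_a'), (2, 'player_b'), (3, 'player_c');
INSERT INTO games VALUES
  ('g1', 1, 2, 1)      -- player_a beats player_b
, ('g2', 1, 3, 3)      -- player_c beats player_a
, ('g3', 2, 3, 2)      -- player_b beats player_c
, ('g4', 1, 2, NULL)   -- assumed: NULL means no winner yet
, ('g5', 1, 3, 1);     -- player_a beats player_c

SELECT u.name
     , count(*) FILTER (WHERE g.winner_id  > 0)    AS played
     , count(*) FILTER (WHERE g.winner_id  = u.id) AS won
     , count(*) FILTER (WHERE g.winner_id <> u.id) AS lost
FROM   games g
JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
GROUP  BY u.name
ORDER  BY u.name;

--    name   | played | won | lost
-- ----------+--------+-----+------
--  player_a |      3 |   2 |    1
--  player_b |      2 |   1 |    1
--  player_c |      3 |   1 |    2
```

The undecided game g4 drops out of all three counts because every comparison with NULL yields NULL, which does not pass a FILTER condition.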

Any Postgres version

SELECT u.name
     , count(g.winner_id  > 0 OR NULL)    AS played
     , count(g.winner_id  = u.id OR NULL) AS won
     , count(g.winner_id <> u.id OR NULL) AS lost
FROM   games g
JOIN   users u ON u.id IN (g.player_1_id, g.player_2_id)
GROUP  BY u.name;

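Older versions need a workaround: count() only counts non-null values, and an expression of the form (condition OR NULL) evaluates to TRUE when the condition is true and to NULL otherwise. This is shorter and faster than nested sub-selects or CASE expressions.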
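
The following sketch shows why appending OR NULL makes count() behave like a filtered count. The inline VALUES list is invented for illustration and is not part of the question's schema.

```sql
-- Three-valued logic: only a true condition survives "OR NULL".
SELECT (true  OR NULL)          AS when_true    -- true
     , (false OR NULL)          AS when_false   -- NULL
     , (NULL::boolean OR NULL)  AS when_null;   -- NULL

-- count() skips NULL but not false, so the two columns differ.
SELECT count(x > 0 OR NULL) AS positives   -- 2: rows where x > 0 is true
     , count(x > 0)         AS non_null    -- 3: false still counts as a value
FROM  (VALUES (1), (2), (-1), (NULL)) AS t(x);
```

Without OR NULL, a false comparison is still a non-null value and would be counted.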

Gordon Linoff

This is a case where correlated subqueries may simplify the logic:

select u.*, (played - won) as lost
from (select u.*,
             (select count(*)
              from games g
              where g.player_1_id = u.id or g.player_2_id = u.id
             ) as played,
             (select count(*)
              from games g
              where g.winner_id = u.id
             ) as won
      from users u
     ) u;

Because lost is computed as played - won, this assumes that there are no ties: every counted game must have a winner.
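
If some games have no winner, one possible variant is to count only decided games as played, mirroring the winner_id > 0 condition from the question. This is a sketch under an assumption, not part of the original answer: it presumes that undecided or tied games store NULL or 0 in winner_id, and that the winner is always one of the two players.

```sql
SELECT u.*, (played - won) AS lost
FROM  (SELECT u.*
            , (SELECT count(*)
               FROM   games g
               WHERE  u.id IN (g.player_1_id, g.player_2_id)
               AND    g.winner_id > 0     -- assumed: skips games without a winner
              ) AS played
            , (SELECT count(*)
               FROM   games g
               WHERE  g.winner_id = u.id
              ) AS won
       FROM   users u
      ) u;
```

With that extra condition, played - won equals the number of decided games the user did not win.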

Multisync

select users.name, 
       count(case when games.winner_id > 0 
                  then games.name 
                  else null end) as played,
       count(case when games.winner_id = users.id 
                  then games.name 
                  else null end) as won,
       count(case when games.winner_id != users.id 
                  then games.name 
                  else null end) as lost
from users inner join games 
     on games.player_1_id = users.id or games.player_2_id = users.id
group by users.name;