After writing SQL for a few years, I find it often annoying having to put the columns I am interested in SELECT
and then specify them again in GROUP BY
. I can't help thinking, why do we have to do that?
What is the reason that user has to be specific about which column to group by? Can't we just let the SQL engine assume if there is an aggregate function in SELECT
, group by the rest non-aggregate columns?
This will be especially helpful and more concise when you have a large CASE WHEN
in SELECT
.
Because they may not always match exactly.
For example, If I want find out the maximum number of books per category, I could do:
In some DBs such as Oracle, you can even do this:
I don't really need to specify the category column as I don't need it.
A few databases such as Postgres support the use of aliases in the group by clause.