Selecting an aggregate and a column with the same SQL query?

41 views Asked by At

I am competing in a SQL database competition, and occasionally I come across problems like

SELECT name, MIN(a) 
FROM table

According to the answer sheet, the outputted name is indeterminate. However, in a later practice test I am asked to find the exact output of such a query, implying there is a way to determine the returned name value.

The answer keys have a tendency of sometimes being mistaken (too often for comfort). So I ran a query like that one in the W3Schools try-it-yourself box, and got an error. However, I tried it in Codecademy, and it returned the name value associated with the minimal "a" value.

There's so much contradicting information that I don't know what to think!

1

There are 1 answers

0
Atmo On

As a general rule: if you do an aggregation in your SELECT clause, and some columns are not aggregated, they should be listed in a GROUP BY clause.

As an example, this is what is described for Postgres, with my highlight in bold:

When GROUP BY is present, or any aggregate functions are present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or when the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.

MySQL also has a neat help page on the topic for a reason that I will detail later.

If ONLY_FULL_GROUP_BY is disabled, a MySQL extension to the standard SQL use of GROUP BY permits the select list, HAVING condition, or ORDER BY list to refer to nonaggregated columns even if the columns are not functionally dependent on GROUP BY columns.

As written, MySQL accepts queries where columns are omitted from the GROUP BY clause, including the case where all the columns are omitted (= the entire clause is omitted) but this is an exception to the standard.
If the ONLY_FULL_GROUP_BY option is off, then it will allow such a query to be executed but the value you will get for the non aggregated column name will be the first that the database encounters when reading the table.
To my knowledge, MySQL is the only DBMS allowing such queries. I think most people will agree you should never omit non-aggregated columns even when the DBMS allows it, because it is non-standard.

To be complete and if you read both quotes carefully, you must have noticed the standard actually allows non-aggregated columns to be omitted in 1 case: if they are functionally dependent with the aggregated columns.

My conclusion is that you managed to make your query work on an instance of MySQL (Codecademy) and it failed elsewhere.