SQL in Oracle HR Schema

277 views Asked by At

I have made a query in Oracle HR schema to see the following information:

  1. The city where the department is located
  2. The total number of employees in the department

However, the query cannot be executed correctly and said this is "not a GROUP BY expression".

Does anyone knows what's the problem is? Thanks in advance.

SELECT department_name, city, COUNT(employees.department_id)
   FROM departments
       JOIN employees on (departments.department_id=employees.department_id)
         JOIN locations USING (location_id)
GROUP BY department_name;
3

There are 3 answers

1
Sujitmohanty30 On BEST ANSWER

The problem is you have both aggregated and non-aggregated column (in your case city in the select list.

As I don't know the structure of location table and considering a department have only one location defined you can use max(city),

SELECT department_name, max(city) city, COUNT(employees.department_id) no_of_employees
   FROM departments
       JOIN employees on (departments.department_id=employees.department_id)
         JOIN locations USING (location_id)
GROUP BY department_name;
0
Thorsten Kettner On

You are grouping by department and want to show the department's city. You expect this to work, because each department is in exactly one city. (SQL people call this functional dependency.)

For this to work, ...

  • there would have to be a unique contraint on the department name or you'd have to group by department_id instead
  • the DBMS must detect and support functional dependency in aggregation queries

Unfortunately, Oracle doesn't support functional dependency in aggregation queries. It forces us to put every such column in the GROUP BY clause or into an aggregation function.

So either extend the GROUP BY clause:

SELECT d.department_name, l.city, COUNT(e.department_id)
   FROM departments d
       JOIN employees e ON e.department_id = d.department_id
         JOIN locations l USING (location_id)
GROUP BY d.department_name, l.city
ORDER BY d.department_name;

or use some aggregation function as MIN or MAX on that single value.

SELECT d.department_name, MAX(l.city) AS city, COUNT(e.department_id)
   FROM departments d
       JOIN employees e ON e.department_id = d.department_id
         JOIN locations l USING (location_id)
GROUP BY d.department_name
ORDER BY d.department_name;

What I prefer though, is to aggregate first and only then join. You want to join the departments with their employee count, so do just that:

SELECT d.department_name, l.city, COALESCE(e.cnt, 0) AS employee_count
FROM departments d
JOIN locations l USING (location_id)
LEFT JOIN
(
   SELECT department_id, COUNT(*) as cnt
   FROM employees
   GROUP BY department_id
) e ON e.department_id = d.department_id
ORDER BY d.department_name;
0
T3SLA88 On

As excellently explained by Thorsten, you could also group the data using OVER and PARTITION BY function which would eliminate the use of GROUP BY function.

SELECT d.department_name, l.city, COUNT(e.department_id) OVER (PARTITION BY e.department_id) as emp_count
   FROM departments d
       JOIN employees e ON e.department_id = d.department_id
         JOIN locations l USING (location_id)
ORDER BY d.department_name;