MySQL Multiple Conditions on Group By / Having Clause

1.2k views Asked by At

I have three tables that are all inter-related with the following structure.

ModuleCategory Table:

+------------------+----------------+------------+
| ModuleCategoryID | ModuleCategory | RequireAll |
+------------------+----------------+------------+
|               90 | Cat A          | YES        |
|               91 | Cat B          | NO         |
+------------------+----------------+------------+

ModuleCategorySkill Table:

+------------------+---------+
| ModuleCategoryID | SkillID |
+------------------+---------+
|               90 |    1439 |
|               90 |    3016 |
|               91 |    1440 |
|               91 |    3016 |
+------------------+---------+

EmployeeSkill Table:
+---------+---------+
| EmpName | SkillID |
+---------+---------+
| Emp1    |    1439 |
| Emp1    |    3016 |
| Emp2    |    1440 |
| Emp2    |    3016 |
| Emp3    |    1439 |
| Emp4    |    3016 |
+---------+---------+

Desired Output:

+------------------+-------+
| ModuleCategory   | Count |
+------------------+-------+
|            Cat A |     1 |
|            Cat B |     3 |
+------------------+-------+

I am trying to group by ModuleCategoryID's and get the count of employees which have the skills being tracked.

Normally, I can do the following query to obtain the numbers:

select mc.ModuleCategory, Count(*) as Count from ModuleCategory as mc 
join ModuleCategorySkill as mcs on mc.ModuleCategoryID = mcs.ModuleCategoryID join EmployeeSkill as es on es.SkillID= mcs.SkillID 
group by mc.ModuleCategoryID

However, I have a column RequireAll in the ModuleCategory table which if it is set to 'YES' should only count employees as 1 only if they have all the skills in the category. If it is set to NO then it can count each row normally and increase the count by the number of rows it groups by.

I can achieve this by writing separate queries for each modulecategoryID and using a having Count() > 1 (which will find me anyone that has all the skills for ModuleCategoryID 90). If there were 3 skills than I would have to change it to Having Count() > 2. If there isn't anyone that has all the skills specified, the count should be 0.

I need a dynamic way of being able to do this since there is a lot of data and writing one query for each ModuleCategoryID isn't the proper approach.

Also, I am using PHP so I can loop through and create a sql string that can help me achieve this. But I know I will run into performance issues on big tables with a lot of skills and modulecategoryID's.

Any guidance on how to achieve this is much appreciated.

1

There are 1 answers

3
pala_ On BEST ANSWER

You can do it by joining on the total category counts, and then using conditional aggregation:

select modulecategory, 
       count(case when requireall = 'yes'
               then if(s = t, 1, null)
               else s
             end) 
from (             
select modulecategory,empname, requireall, count(*) s, min(q.total) t
  from employeeskill e
    inner join modulecategoryskill mcs
      on e.skillid = mcs.skillid
    inner join modulecategory mc
      on mcs.modulecategoryid = mc.modulecategoryid
    inner join (
      select modulecategoryid, count(*) total
        from modulecategoryskill
        group by modulecategoryid
    ) q
    on mc.modulecategoryid = q.modulecategoryid
  group by modulecategory, empname
  ) qq
group by modulecategory;

demo here

This operates under the assumption an employee isn't going to be allocated the same skill twice, if that is something that may happen, this query is alterable to support it, but it seems like a broken scenario to me.

What we have here is an inner query that collates all the information we need (category name, employee name, whether or not all skills are required, how many skills are in the group per employee, and how many there in the group total), with an outer query that uses a conditional count to change how the rows are tallied, based on the value of requireall.