I built a table in Amazon QuickSight, but the way the viewer wants the data organized in a pivot table, I'm only able to achieve with the below SQL snippet. I don't have a problem leaving it as is, but I'd like to see if there are better ways of doing what I did, especially without using the repetitive parts in each individual table that is being combined together. There are actually a lot more "problem" columns I'm querying, so the actual SQL is very long and very repetitive.
The only part that is changed is problem and problem_count between each table.
'Service' AS problem,
COUNT(CASE WHEN service_yn = 'Yes' THEN 1 END) AS problem_count
I'm sorry, I don't know how to properly format a pivot table but I'll do my best here. Imagine the following 3 tables are one large pivot table.
| Base | Count |
|---|---|
| Accuracy of Bill | 110 |
| Service | 200 |
| Cleanliness | 95 |
| Value | 75 |
| Premium | Count |
|---|---|
| Accuracy of Bill | 110 |
| Service | 200 |
| Cleanliness | 95 |
| Value | 75 |
| Deluxe | Count |
|---|---|
| Accuracy of Bill | 110 |
| Service | 200 |
| Cleanliness | 95 |
| Value | 75 |
SELECT CASE
WHEN descriptor = 'Fast Food' THEN 'Base'
WHEN descriptor IN ('Sit Down','Bar','Eatery') THEN 'Premium'
WHEN descriptor IN ('Full Service','Boutique') THEN 'Deluxe'
ELSE NULL
END AS "brand_group",
'Accuracy of bill' AS problem,
COUNT(CASE WHEN accuracy_of_bill_yn = 'Yes' THEN 1 END) AS problem_count
FROM surveys
WHERE responsedate >= CURRENT_DATE-INTERVAL '12 months'
AND surveyid NOT IN (SELECT surveyid FROM excluded)
AND region IN (1,2,3,4,5)
AND brand_group IS NOT NULL
GROUP BY brand_group
UNION ALL
SELECT CASE
WHEN descriptor = 'Fast Food' THEN 'Base'
WHEN descriptor IN ('Sit Down','Bar','Eatery') THEN 'Premium'
WHEN descriptor IN ('Full Service','Boutique') THEN 'Deluxe'
ELSE NULL
END AS "brand_group",
'Cleanliness' AS problem,
COUNT(CASE WHEN cleanliness_yn = 'Yes' THEN 1 END) AS problem_count
FROM surveys
WHERE responsedate >= CURRENT_DATE-INTERVAL '12 months'
AND surveyid NOT IN (SELECT surveyid FROM excluded)
AND region IN (1,2,3,4,5)
AND brand_group IS NOT NULL
GROUP BY brand_group
UNION ALL
SELECT CASE
WHEN descriptor = 'Fast Food' THEN 'Base'
WHEN descriptor IN ('Sit Down','Bar','Eatery') THEN 'Premium'
WHEN descriptor IN ('Full Service','Boutique') THEN 'Deluxe'
ELSE NULL
END AS "brand_group",
'Service' AS problem,
COUNT(CASE WHEN service_yn = 'Yes' THEN 1 END) AS problem_count
FROM surveys
WHERE responsedate >= CURRENT_DATE-INTERVAL '12 months'
AND surveyid NOT IN (SELECT surveyid FROM excluded)
AND region IN (1,2,3,4,5)
AND brand_group IS NOT NULL
GROUP BY brand_group
UNION ALL
SELECT CASE
WHEN descriptor = 'Fast Food' THEN 'Base'
WHEN descriptor IN ('Sit Down','Bar','Eatery') THEN 'Premium'
WHEN descriptor IN ('Full Service','Boutique') THEN 'Deluxe'
ELSE NULL
END AS "brand_group",
'Value' AS problem,
COUNT(CASE WHEN value_yn = 'Yes' THEN 1 END) AS problem_count
FROM surveys
WHERE responsedate >= CURRENT_DATE-INTERVAL '12 months'
AND surveyid NOT IN (SELECT surveyid FROM excluded)
AND region IN (1,2,3,4,5)
AND brand_group IS NOT NULL
GROUP BY brand_group
Note: this answer is for Postgres, as declared originally. Does not work for Redshift.
You need the reverse operation of a "pivot", a.k.a. "unpivot". Can be done elegantly with
CROSS JOIN LATERALto aVALUESexpression after getting all counts in a singleSELECT.A single scan over the table should be substantially faster.
While being at it, I optimized a couple other things.
fiddle
See:
Avoid
NOT INwith subqueries. I replaced it with:See:
All of your
*_yncolumns should really beboolean. Nottextor anything else.About the aggregate
FILTERclause:I use a "switched"
CASE. Should be slightly cheaper. See:Just the base?
Typically, people just want the more compact result from the base query (maybe with more descriptive column names):
fiddle