Find most common elements in array with a group by

Question

Find most common elements in array with a group by

3.1k views Asked by mhkeller At 09 January 2014 at 23:24

I have a table of rows with the following structure name TEXT, favorite_colors TEXT[], group_name INTEGER where each row has a list of everyone's favorite colors and the group that person belongs to. How can I GROUP BY group_name and return a list of the most common colors in each group?

Could you do a combination of int[] && int[] to set for overlap, int[] & int[] to get the intersection and then something else to count and rank?

Original Q&A

There are 1 answers

**Erwin Brandstetter** · Accepted Answer · 2014-01-09T23:56:13+00:00

Quick and dirty:

SELECT group_name, color, count(*) AS ct
FROM (
   SELECT group_name, unnest(favorite_colors) AS color
   FROM   tbl
   ) sub
GROUP  BY 1,2
ORDER  BY 1,3 DESC;

Better with a `LATERAL JOIN`

In Postgres 9.3 or later this is the cleaner form:

SELECT group_name, color, count(*) AS ct
FROM   tbl t, unnest(t.favorite_colors) AS color
GROUP  BY 1,2
ORDER  BY 1,3 DESC;

The above is shorthand for

...
FROM tbl t
JOIN LATERAL unnest(t.favorite_colors) AS color ON TRUE
...

And like with any other INNER JOIN, it would exclude rows without color (favorite_colors IS NULL) - as did the first query.

To include such rows in the result, use instead:

SELECT group_name, color, count(*) AS ct
FROM   tbl t
LEFT   JOIN LATERAL unnest(t.favorite_colors) AS color ON TRUE
GROUP  BY 1,2
ORDER  BY 1,3 DESC;

You can easily aggregate the "most common" colors per group in the next step, but you'd need to define "most common colors" first ...

Most common colors

As per comment, pick colors with > 3 occurrences.

SELECT t.group_name, color, count(*) AS ct
FROM   tbl t, unnest(t.favorite_colors) AS color
GROUP  BY 1,2
HAVING count(*) > 3
ORDER  BY 1,3 DESC;

To aggregate the top colors in an array (in descending order):

SELECT group_name, array_agg(color) AS top_colors
FROM  (
   SELECT group_name, color
   FROM   tbl t, unnest(t.favorite_colors) AS color
   GROUP  BY 1,2
   HAVING count(*) > 3
   ORDER  BY 1, count(*) DESC
   ) sub
GROUP BY 1;

-> SQLfiddle demonstrating all.

TechQA.

Find most common elements in array with a group by

There are 1 answers

Better with a `LATERAL JOIN`

Most common colors

Related Questions in SQL

Related Questions in POSTGRESQL

Related Questions in AGGREGATE-FUNCTIONS

Related Questions in POSTGRESQL-9.3

Related Questions in LATERAL

Popular Questions

Popular Tags

Trending Questions

Find most common elements in array with a group by

There are 1 answers

Better with a LATERAL JOIN

Most common colors

Related Questions in SQL

Related Questions in POSTGRESQL

Related Questions in AGGREGATE-FUNCTIONS

Related Questions in POSTGRESQL-9.3

Related Questions in LATERAL

Popular Questions

Popular Tags

Trending Questions

Better with a `LATERAL JOIN`