Unnest (pivot) arrays in Postgres pg_stats view

Question

Unnest (pivot) arrays in Postgres pg_stats view

386 views Asked by TPPZ At 08 September 2022 at 10:47

I would like to pivot the arrays in the cells inside the Postgres view pg_stats. The reason for this is to better understand what is going on with the table EXPLAIN ANALYZE commands.

I have this SQL query that I am modifying without success in order to achieve the pivot:

SELECT 
    -- ps.schemaname,
    -- ps.tablename,
    ps.attname as "column_name",
    ps.null_frac,
    ps.avg_width,
    ps.n_distinct,
    ps.most_common_vals,
    ps.most_common_freqs,
    ps.histogram_bounds,
    ps.correlation -- , -- I think this means how close the stats are to the "real rows in the entire table"
    -- unnest(ps.most_common_vals, ps.histogram_bounds)
FROM pg_stats AS ps
CROSS JOIN LATERAL unnest(ps.most_common_vals, ps.histogram_bounds)
WHERE 
        ps.tablename = 'my_table_name'
    AND ps.attname IN ('my_important_column_1', 'my_important_column_2');

However I hit the wall on all sorts of error messages due to the function unnest(...) (see https://www.postgresql.org/docs/9.4/functions-array.html):

If I use the function unnest(...) in a column inside the SELECT then I see this error:

SQL Error [42883]: ERROR: function unnest(anyarray, anyarray) does not exist
  Hint: No function matches the given name and argument types. You might need to add explicit type casts.
  Position: 325

If I use unnest(...) in the FROM with that CROSS JOIN LATERAL, then I see this error:

SQL Error [42804]: ERROR: cannot determine element type of "anyarray" argument

On the types issues, I am interested in these columns of various types "array" (see https://www.postgresql.org/docs/current/view-pg-stats.html):

pg_stats.most_common_vals type anyarray
pg_stats.most_common_freqs type float4[]
pg_stats.histogram_bounds type anyarray

If I use all of these columns inside unnest(...) then because types are different (basically because of float4[] I think), I see this error:

SQL Error [42883]: ERROR: function unnest(anyarray, real[], anyarray) does not exist
  Hint: No function matches the given name and argument types. You might need to add explicit type casts.
  Position: 387

Or

SQL Error [42804]: ERROR: cannot determine element type of "anyarray" argument

depending on where I put unnest (as column, or as filter).

How do I pivot all these arrays inside column values?
How do I use casting to apply this pivot to arrays of different types?

I am a bit lost with the Postgres syntax here.

EDIT:

I could do stuff like this in the SELECT: array_to_string(ps.most_common_vals, E'\n') as most_common_vals_lf,

But this leads to "fat cells" that don't render very well in a visual IDE e.g. DBeaver because the line feeds don't expand very well and all the output rows hide most of the cell value just because they don't expand. I really would like to pivot the view and multiply the common columns by the number N of elements in the arrays and get back all these as N rows.

Original Q&A

There are 1 answers

**TPPZ** · Accepted Answer · 2022-09-09T06:45:01+00:00

Turns out the issue was around casting the types of those arrays. This works when making sure the arrays types are ::text::text[]:

SELECT 
    -- ps.schemaname,
    -- ps.tablename,
    ps.attname as "column_name",
    mt.mcv AS "most_common_vals",
    mt.mcf AS "most_common_freqs",
    ps.avg_width,
    ps.n_distinct,
    ps.correlation,
    ps.null_frac -- fraction of column entries that are null
FROM pg_stats AS ps
-- for scalar type columns:
CROSS JOIN LATERAL unnest(
    ps.most_common_vals::text::text[], 
    ps.most_common_freqs::text::text[]
) AS mt(mcv, mcf)
WHERE
        ps.tablename = 'my_table_name'
    AND ps.attname IN ('my_important_column_1', 'my_important_column_2')
    AND mt.mcv IS NOT NULL
    AND mt.mcf IS NOT NULL
ORDER BY ps.attname, mt.mcf DESC;

The output is the expected pivot of the original view pg_stats with one row for each element of those arrays.

One gotcha on ps_stats: the pairs of most_common_vals and most_common_freqs can be treated as a map. So in this case it makes sense to pivot the table with both pivoted columns.

On the other hand the histogram column histogram_bounds is independent from the above. It just shows the 10 bands to go from 0% to 100% in steps of 10%. It doesn't make sense to mix this column with the 2 columns above. This histogram column very much depend on the data we are dealing with, sometimes it doesn't even make sense to take this information into consideration at all.

TechQA.

Unnest (pivot) arrays in Postgres pg_stats view

There are 1 answers

Related Questions in ARRAYS

Related Questions in POSTGRESQL

Related Questions in SQL-EXECUTION-PLAN

Related Questions in EXPLAIN

Related Questions in PG-CATALOG

Popular Questions

Popular Tags

Trending Questions