Query jsonb array for integer member

3.8k views Asked by At

Background: We use PaperTrail to keep the history of our changing models. Now I want to query for a Item, which belonged to a certain customer. PaperTrail optionally stores the object_changes and I need to query this field to understand, when something was created with this ID or changed to this ID.

My table looks simplified like this:

item_type | object_changes
----------|----------------------------------------------------------
"Item"    | {"customer_id": [null, 5], "other": [null, "change"]}
"Item"    | {"customer_id": [4, 5], "other": ["unrelated", "change"]}
"Item"    | {"customer_id": [5, 6], "other": ["asht", "asht"]}

How do I query for elements changed from or to ID 5 (so all rows above)? I tried:

SELECT * FROM versions WHERE object_changes->'customer_id' ? 5;

Which got me:

ERROR:  operator does not exist: jsonb ? integer
LINE 1: ...T * FROM versions WHERE object_changes->'customer_id' ? 5;
                                                                 ^
HINT:  No operator matches the given name and argument type(s).
You might need to add explicit type casts.
2

There are 2 answers

0
Erwin Brandstetter On

For jsonb the contains operator @> does what you ask for:

Get all rows where the number 5 is an element of the "customer_id" array:

SELECT *
FROM   versions 
WHERE  object_changes->'customer_id' @> '5';

The @> operator expects jsonb as right operand - or a string literal that is valid for jsonb (while ? expects text). The numeric literal without single quotes you provided in your example (5) cannot be coerced to jsonb (nor text), it defaults to integer. Hence the error message. Related:

This can be supported with different index styles. For my query suggested above, use an expression index (specialized, small and fast):

CREATE INDEX versions_object_changes_customer_id_gin_idx ON versions
USING gin ((object_changes->'customer_id'));

This alternative query works, too:

SELECT * FROM versions WHERE object_changes @> '{"customer_id": [5]}';

And can be supported with a general index (more versatile, bigger, slower):

CREATE INDEX versions_object_changes_gin_idx ON versions
USING gin (object_changes jsonb_path_ops);

Related:

According to the manual, the operator ? searches for any top-level key within the JSON value. Testing indicates that strings in arrays are considered "top-level keys", but numbers are not (keys have to be strings after all). So while this query would work:

SELECT * FROM versions WHERE object_changes->'other' ? 'asht';

Your query looking for a number in an array will not (even when you quote the input string literal properly). It would only find the (quoted!) string "5", classified as key, but not the (unquoted) number 5, classified as value.

Aside: Standard JSON only knows 4 primitives: string, number, boolean and null. There is no integer primitive (even if I have heard of software adding that), integer is a just a subset of number, which is implemented as numeric in Postgres:

So your question title is slightly misleading as there are no "integer" members, strictly speaking.

0
dmfay On

Use a lateral join and the jsonb_array_elements_text function to process each row's object_changes:

SELECT DISTINCT v.* FROM versions v
JOIN LATERAL jsonb_array_elements_text(v.object_changes->'customer_id') ids ON TRUE
WHERE ids.value::int = 5;

The DISTINCT is only necessary if the customer_id you're looking for could appear multiple times in the array (if a different field changed but customer_id is tracked anyway).