How to work out the path for all values in a JSON

523 views Asked by At

I have a JSON column that has about 300k worth of properties, and I want to attempt to export it in such a way that I can get the full path and value of the property. Note, my JSON has no arrays - it's basically a JSON for a web application translations.

To provide a simple example with the following JSON I want to get back the following:

JSON: {"Name":"Jeff", "Address": {"Street": "123 Harrow Street", "City": "Dublin"}}

Expected Output:

|----------------|-------------------|
| path           |  value            |
|----------------|-------------------|
| Name           | Jeff              |
| Address.Street | 123 Harrow Street |
| Address.City   | Dublin            |
|----------------|-------------------|

Or if it's easier I can do with the $. because I'd like to be able to easily update the values for each property - with JSON_MODIFY.

I've tried using OPENJSON but that only appears to return 3 fields key, value and type, but at least the key here doesn't go past the field level of values, so I get: Query:

DECLARE @json_doc nvarchar(4000) = '{"Name":"Jeff", "Address": {"Street": "123 Harrow Street", "City": "Dublin"}}';

SELECT [key], [value]
FROM OPENJSON(@json_doc);
GO

Output:

|---------|---------------------------------------------------|
| key     |  value                                            |
|---------|---------------------------------------------------|
| Name    | Jeff                                              |
| Address | {"Street": "123 Harrow Street", "City": "Dublin"} |
|---------|---------------------------------------------------|

Is there a way to get the OPENJSON query to be fully recursive? Or is there another way? I've tried googling but it doesn't appear to be a common request?

2

There are 2 answers

1
Zhorov On BEST ANSWER

A recursive CTE is an option. You need to use ISJSON() as a termination condition in the recursive member of the query:

DECLARE @json_doc nvarchar(4000) = '{"Name":"Jeff", "Address": {"Street": "123 Harrow Street", "City": "Dublin"}}';

;WITH rCTE AS (
   SELECT 
       CONVERT(nvarchar(max), N'$') COLLATE DATABASE_DEFAULT AS [path], 
       CONVERT(nvarchar(max), JSON_QUERY(@json_doc, '$')) COLLATE DATABASE_DEFAULT AS [value]
   UNION ALL
   SELECT 
      CONVERT(nvarchar(max), CONCAT(r.path, CONCAT(N'.', c.[key]))) COLLATE DATABASE_DEFAULT ,
      CONVERT(nvarchar(max), c.[value]) COLLATE DATABASE_DEFAULT                                        
   FROM rCTE r
   CROSS APPLY OPENJSON(r.[value]) c
   WHERE ISJSON(r.[value]) = 1
)
SELECT *
FROM rCTE
WHERE ISJSON([value]) = 0

Result:

path             value
----------------------------------
$.Name           Jeff
$.Address.Street 123 Harrow Street
$.Address.City   Dublin
0
Bas On

Based on @Zhorov's answer, I've tweaked the constructed path to use [ and ] around the index, when the node is an array. And turned it into a table-valued function.

With the node [type] already returned by OPENJSON(), there is no need to test every node with isJSON().

create function extract_path_and_values_from_json
     ( @json_doc     nvarchar(max)   
     )
       returns @json_values table
             ( json_path    nvarchar(max)   null
             , json_value   nvarchar(max)   null
             , json_type    int             null
             )
    as
begin
    with rCTE as 
       ( select convert(nvarchar(max), N'$') collate DATABASE_DEFAULT as [path] 
              , convert(nvarchar(max), json_query(@json_doc, '$')) collate DATABASE_DEFAULT as [value]
              , convert(int, case left(json_query(@json_doc, '$'), 1) when N'[' then 4 when N'{' then 5 else null end) as [type]
         union all
         select convert(nvarchar(max), concat(r.path, case when r.[type] = 4 then concat(N'[', c.[key], N']') else concat(N'.', c.[key]) end /*case*/)) collate DATABASE_DEFAULT  as [path]
              , convert(nvarchar(max), c.[value]) collate DATABASE_DEFAULT  as [value]                                      
              , convert(int, c.[type]) as [type]
           from rCTE r
          cross apply openjson(r.[value]) c
          where r.[type] in (4, 5) /*0:null; 1:string; 2:number; 3:boolean; 4:array; 5:object*/ 
       )
    insert @json_values
    select [path]
         , [value]
         , [type]
      from rCTE
     where [type] in (0, 1, 2, 3) /*0:null; 1:string; 2:number; 3:boolean; 4:array; 5:object*/ 
    
    return
    
end