Remove duplicate values in JSON Array

111 views Asked by At

I have this value in a field of a SQL Server Table:

["I3","I23","B1","B3","B2","B4","B6","I1","I11","I4","I14","I24","I34","I5","I15","I25","I35","I6","I16","I26","I36","I21","B5","I31","I11","I3","I1","I31","I21","I21","I5","I4","I3","I21","I4","I23","B1","I23","I3","B1","B2","B3","I15","I15","B2","I13","I2"]

actually it is a JSON Array

ome values are present more than 1 time: I need to remove the duplicate (keeping first occurrence) therefore the result should be something like this:

["I3","I23","B1","B3","B2","B4","B6","I1","I11","I4","I14","I24","I34","I5","I15","I25","I35","I6","I16","I26","I36","I21","B5","I31","I13","I2"]

I've tried several solutions and I seem to be close to solution but in the best solution I got the result as an array of objects instead of array of values:

[
     {
        "c": "B1"
     },
     {
        "c": "B2"
     },
     {
        "c": "B3"
     },
     {
        "c": "B4"
     },
     {
        "c": "B5"
     },
     {
        "c": "B6"
     },
     {
        "c": "I1"
     },
     {
        "c": "I11"
     },
     {
        "c": "I13"
     },
     {
        "c": "I14"
     },
     {
        "c": "I15"
     },
     {
        "c": "I16"
     },
     {
        "c": "I2"
     },
     {
        "c": "I21"
     },
     {
        "c": "I23"
     },
     {
        "c": "I24"
     },
     {
        "c": "I25"
     },
     {
        "c": "I26"
     },
     {
        "c": "I3"
     },
     {
        "c": "I31"
     },
     {
        "c": "I34"
     },
     {
        "c": "I35"
     },
     {
        "c": "I36"
     },
     {
        "c": "I4"
     },
     {
        "c": "I5"
     },
     {
        "c": "I6"
     }
  ]

How can I fix it?

1

There are 1 answers

5
GarethD On BEST ANSWER

You need to disaggregate your values using OPENJSON, remove duplicates with GROUP BY (taking the min key to preserve the order) then aggregate them back up using STRING_AGG (there is no in built JSON function to create an array of values AFAIK), so something like this:

DECLARE @T TABLE (JsonArray NVARCHAR(MAX));
INSERT @T (JsonArray) 
VALUES ('["I3","I23","B1","B3","B2","B4","B6","I1","I11","I4","I14","I24","I34","I5","I15","I25","I35","I6","I16","I26","I36","I21","B5","I31","I13","I2"]');

SELECT  t.JsonArray,
        DistinctArray = JSON_QUERY(CONCAT('[', STRING_AGG(QUOTENAME(oj.Value,'"'), ',') WITHIN GROUP (ORDER BY oj.[Key]), ']'))
FROM    @T AS t
        CROSS APPLY 
        (   SELECT  oj.Value, [Key] = MIN(CONVERT(INT, oj.[Key]))
            FROM    OPENJSON(t.JsonArray) AS oj
            GROUP BY oj.Value
        ) AS oj
GROUP BY t.JsonArray;

Example on db<>fiddle