Semver comparison using JQ


I have an array that looks like this:

[
  {
    "id": 1,
    "version": "2.3.4"
  },
  {
    "id": 2,
    "version": "1.4.4"
  },
  {
    "id": 3,
    "version": "0.0.4"
  },
  {
    "id": 4,
    "version": "1.3.4"
  }
]

And I need to get all the objects where the version is "1.2.0". I am interested in a built-in way of doing this with jq, but I cannot find anything related. Maybe it does not exist?

I know I could do some ugly regex hack here, but what would be the right way to solve this so that I can easily swap the condition? For instance, instead of 1.2.0, I might soon want the objects with a version greater than 1.2.7.

There are 3 answers

Jeff Mercado On BEST ANSWER

You always have the option of parsing the versions and implementing the comparison yourself.

def _parse_semver($with_op):
    if type == "string" then
        capture(if $with_op then "(?<op>[~])?" else "" end
        + "(?<major>\\d+)\\.(?<minor>\\d+)(?:\\.(?<patch>\\d+))?"
        + "(?:-(?<prerelease>[A-Z0-9]+(?:\\.[A-Z0-9]+)*))?"
        + "(?:\\+(?<build>[A-Z0-9]+(?:\\.[A-Z0-9]+)*))?"; "i")
        | (.major, .minor, .patch) |= (tonumber? // 0)
    elif type == "object" and ([has(("major,minor,patch,prerelease,build"/",")[])]|all) then .
    else empty end;
def parse_semver: _parse_semver(false);
def cmp_semver($other): parse_semver as $a | ($other|_parse_semver(true)) as $b |
    def _cmp($other): if . == $other then 0 elif . > $other then 1 else -1 end;
    def _cmp_dotted($other):
        if . == null then 1
        elif $other == null then -1
        else
            reduce ([split("."), ($other|split("."))] | transpose[]) as [$a, $b] (0;
                if . != 0 then .
                elif $a == null then -1
                elif $b == null then 1
                else
                    ($a|test("^\\d+$")) as $anum | ($b|test("^\\d+$")) as $bnum |
                    if [$anum,$bnum] == [true,true] then $a | tonumber | _cmp($b | tonumber)
                    elif $anum then -1
                    elif $bnum then 1
                    else $a | _cmp($b) end
                end
            )
        end;
    # slightly modified version of https://semver.org/#spec-item-11
    if $a.major != $b.major then
        if $a.major > $b.major then 1 else -1 end
    elif $a.minor != $b.minor then
        if $a.minor > $b.minor then 1 else -1 end
    elif $a.patch != $b.patch then
        if $a.patch > $b.patch then 1 else -1 end
    elif $b.op == "~" then
        0
    elif $a.prerelease != $b.prerelease then
        ($a.prerelease | _cmp_dotted($b.prerelease))
    elif $a.build != $b.build then
        ($a.build | _cmp_dotted($b.build))
    else
        0
    end;
def cmp_semver($first; $second): $first | cmp_semver($second);

Then use the new comparison functions. The commands below assume the definitions above are available to jq, for example saved in ~/.jq:

$ jq 'map(select(.version | cmp_semver("1.2.0") > 0))' input.json
[
  {
    "id": 1,
    "version": "2.3.4"
  },
  {
    "id": 2,
    "version": "1.4.4"
  },
  {
    "id": 4,
    "version": "1.3.4"
  }
]

$ jq -rn '("1.0.0-alpha<1.0.0-alpha.1<1.0.0-alpha.beta<1.0.0-beta<1.0.0-beta.2<1.0.0-beta.11<1.0.0-rc.1<1.0.0"/"<") as $input |
    range($input | length-1) |
    "cmp_semver(\($input[.]|tojson);\t\($input[.+1]|tojson))\t-> "
    + "\(cmp_semver($input[.]; $input[.+1]))"'
cmp_semver("1.0.0-alpha";       "1.0.0-alpha.1")     -> -1
cmp_semver("1.0.0-alpha.1";     "1.0.0-alpha.beta")  -> -1
cmp_semver("1.0.0-alpha.beta";  "1.0.0-beta")        -> -1
cmp_semver("1.0.0-beta";        "1.0.0-beta.2")      -> -1
cmp_semver("1.0.0-beta.2";      "1.0.0-beta.11")     -> -1
cmp_semver("1.0.0-beta.11";     "1.0.0-rc.1")        -> -1
cmp_semver("1.0.0-rc.1";        "1.0.0")             -> -1

At first I wasn't sure whether arrays could be used as the semver key. It appears they can, but the key needs some additional data points to sort on (here, whether the pre-release and build parts are absent).

def semver_key: parse_semver | [
    .major, .minor, .patch,
    .prerelease==null, ((.prerelease//"")/"."|map(tonumber? //.)),
    .build==null, ((.build//"")/"."|map(tonumber? //.))
];

This allows you to sort by the versions.

$ jq -rn '
("1.0.0-alpha<1.0.0-alpha.1<1.0.0-alpha.beta<1.0.0-beta<1.0.0-beta.2<1.0.0-beta.11<1.0.0-rc.1<1.0.0"/"<") as $input |
[$input, ($input | sort_by(semver_key))] | transpose[] | "\(.[0])\t\(.[1])"
'
1.0.0-alpha       1.0.0-alpha
1.0.0-alpha.1     1.0.0-alpha.1
1.0.0-alpha.beta  1.0.0-alpha.beta
1.0.0-beta        1.0.0-beta
1.0.0-beta.2      1.0.0-beta.2
1.0.0-beta.11     1.0.0-beta.11
1.0.0-rc.1        1.0.0-rc.1
1.0.0             1.0.0

Assuming this works, it could greatly simplify the cmp_semver/1 function:

def cmp_semver2($other): semver_key as $a | ($other|semver_key) as $b |
    if $a == $b then 0
    elif $a > $b then 1
    else -1 end;
Philippe On

If you need to get all the objects where the version is exactly "1.2.0", you can use plain string comparison:

jq --arg target 1.2.0 '
    map(select(.version == $target))' input.json

If you want the objects with version greater than 1.2.7, then:

jq --arg target 1.2.7 '
    def triple($i): $i | [splits("[.-]") | tonumber? // .];
    map(select(triple(.version) > triple($target)))' input.json
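
To illustrate why the split into numbers matters, here is a small sketch with made-up version strings. It compares the raw strings, then the arrays produced by triple, and finally a pre-release against its release, which is a case this simple approach is not designed for.

```shell
result=$(jq -nc '
  def triple($i): $i | [splits("[.-]") | tonumber? // .];
  [
    # Raw strings are compared character by character.
    ("1.2.10" > "1.2.7"),
    # Arrays of numbers are compared numerically, element by element.
    (triple("1.2.10") > triple("1.2.7")),
    # A pre-release yields a longer array, so it sorts after its release.
    (triple("1.2.7-rc.1") > triple("1.2.7"))
  ]')
echo "$result"
```

The first two results show that string comparison orders 1.2.10 before 1.2.7 while the array comparison does not. The third shows that 1.2.7-rc.1 is treated as greater than 1.2.7, the opposite of what the semver spec requires, so this is only suitable for plain major.minor.patch versions like those in the question.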
peak On

"I know I could do some ugly regex hack here"

You could also use a simple regex hack if the semver strings do not need to be checked for correctness. For example, consider the following:

# Ignore build metadata as it is irrelevant for comparisons:
# this means we can use == on the semver results to 
# determine equality of precedence.
# Note also that the following accords with the requirements
# about pre-release versions, notably:
# "Identifiers consisting of only digits are compared numerically."
# "Numeric identifiers always have lower precedence than non-numeric identifiers."
def semver:
  sub("\\+.*";"") | [scan("[^-.]+") | tonumber? // .];

Since a valid semver string always begins with three components none of which can have superfluous leading zeros, and since jq's built-in ordering is so friendly (as per the comment by @A.H.), semver as defined above should make it quite easy to compare valid semver strings. However, since the spec requires that "pre-release" < "release", some care is required for non-trivial semantic version strings:

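# Compare two semver arrays ensuring in particular that:
# pre-release < release (and never release < pre-release);
# identifiers with letters or hyphens are compared lexically in ASCII sort order;
# numeric identifiers always have lower precedence than non-numeric identifiers.
def semver_less_than($array):
  # Same major.minor.patch and at least one side is a plain release:
  # the only "less than" case is pre-release < release.
  if .[:3] == $array[:3] and (length == 3 or ($array|length) == 3)
  then length > 3 and ($array|length) == 3
  else . < $array
  end;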

The above is sufficient to pass the following sequence of tests included in the Semver 2.0 specification document:

1.0.0-alpha < 1.0.0-alpha.1 < 1.0.0-alpha.beta < 1.0.0-beta < 1.0.0-beta.2 < 1.0.0-beta.11 < 1.0.0-rc.1 < 1.0.0.
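
One way to run that sequence is sketched below. The two definitions are written out again so the command is self-contained, with semver_less_than in a form that returns false when a release is compared against its own pre-release. The variable name $v and the extra reverse check are additions for illustration only.

```shell
result=$(jq -nc '
  def semver:
    sub("\\+.*";"") | [scan("[^-.]+") | tonumber? // .];
  def semver_less_than($array):
    if .[:3] == $array[:3] and (length == 3 or ($array|length) == 3)
    then length > 3 and ($array|length) == 3
    else . < $array
    end;

  ("1.0.0-alpha<1.0.0-alpha.1<1.0.0-alpha.beta<1.0.0-beta<1.0.0-beta.2<1.0.0-beta.11<1.0.0-rc.1<1.0.0" / "<") as $v
  | [
      # Every adjacent pair of the chain must compare as "less than".
      ([ range(($v|length) - 1)
         | . as $i
         | ($v[$i] | semver) | semver_less_than($v[$i+1] | semver) ]
       | all),
      # Reverse direction: a release is not less than its pre-release.
      (("1.0.0" | semver) | semver_less_than("1.0.0-rc.1" | semver))
    ]')
echo "$result"
```

The first element reports whether the whole chain holds, the second whether 1.0.0 is wrongly considered less than 1.0.0-rc.1.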

However, since your example only has trivial semver specs, you could get away with:

# Filter for finding .version greater than 1.2.7 
("1.2.7" | semver) as $v
| map(select( (.version|semver) > $v))

Caveat: It's quite likely the above needs some improvement. Tweaks or counterexamples would be welcome.