I have a table of Maven projects. Each project has a number of parameters and a version number.
When I select from the table I should only get the highest version, but due to the way maven versions look, this is tricky. This is the query I have so far:
select id, group_id as group, artifact_id as artifact
from (
select
p.group_id,
p.artifact_id,
p.id,
rank() over (partition by p.group_id, p.artifact_id order by p.version desc)
from projects p
) as ranked
where ranked.rank = 1
This does not give the highest version as the version does not follow a alphanumeric ordering.
The version format is described here.
The gist is that a version could be 1.2.3-SNAPSHOT
, where 1
(major), 2
(minor), 3
(incremental) are numbers and should be ordered as such and SNAPSHOT
(qualifier) is a string. If the version does not follow this format, it should be treated as a string.
Is this doable in PostgreSQL?
Parse the string. Like:
db<>fiddle here
substring(version, '^(\d+)')
... usesubstring()
with regular expression patterns to parse.^
... start of string()
... capturing parentheses\d
... class shorthand for digitssubstring(version, '^(\d+)')::int
... cast to integer to sort as numbermajor NULLS LAST
... versions without number go last (my assumption).minor NULLS FIRST
...2
comes before2.1
NULLS LAST
is the default insASC
(ascending) sort order inORDER BY
and can be omitted.You can use these expressions in
ORDER BY
directly (without adding them to theSELECT
list). Just to demonstrate.Advanced solution
For even more complex rules, you might want to use
regexp_matches()
- or, preferably,regexp_match()
in Postgres 10 or later:db<>fiddle here
This deals with all additional rules from your comment.
regexp_matches()
is a powerful tool, but you need basic understanding of regular expressions. Test if in doubt.Note in particular:
g
switch. More here.NULLIF(part[1],'')::int
... non-matches are listed as empty strings in the array. Need to be converted toNULL
before casting tointeger
Be aware of the difference between capturing
()
and non-capturing parentheses(:?)
. The former also report a match, while the latter does not.