Reorder a string using REGEXP_REPLACE in a Redshift table

I am trying to replace a pattern in a Redshift table using a regular expression. I have been trying REGEXP_REPLACE, but with no success so far.

My data, in a column named sequence of type varchar, looks like:

1420311 > 1380566 > 1380566 > 9991380564  
1489773 > 9991489773  
1367309 > 1367309 > 9991367309

I would like to use REGEXP_REPLACE (or any other function) in Redshift SQL to get the following result:

1420311 > 1380566 > 1380566 > 1380564 > 999
1489773 > 1489773 > 999
1367309 > 1367309 > 1367309 > 999

That is, find the 999 prefix where it appears at the start of a number, move it to the end of the string preceded by a >, and keep the rest of the string unchanged.

Greatly appreciate any help!

There are 2 answers

Tim Biegeleisen

If you just want a query which can generate this output then the following should work:

SELECT
    REGEXP_REPLACE(sequence, '999([0-9]{7})$', '$1 > 999')
FROM yourTable
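
To sanity-check this pattern on the sample rows without touching the table, here is a rough Python sketch. It is only an approximation: Python's re module is not Redshift's regex engine, and it writes the backreference as \1 rather than $1. The function name move_999_to_end is made up for illustration.

```python
import re

# Same pattern as in the query above. Assumption: Python's `re` behaves like
# Redshift for this simple pattern; the backreference is \1 here, not $1.
PATTERN = re.compile(r"999([0-9]{7})$")


def move_999_to_end(sequence: str) -> str:
    """Move a 999 prefix on the last 7-digit member to the end of the string."""
    return PATTERN.sub(r"\1 > 999", sequence)


rows = [
    "1420311 > 1380566 > 1380566 > 9991380564",
    "1489773 > 9991489773",
    "1367309 > 1367309 > 9991367309",
]

for row in rows:
    print(move_999_to_end(row))
```

Because of the $ anchor, a value with trailing whitespace after the last number is left unchanged in this sketch, so it may be worth trimming the column first if the stored strings are not clean.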
Yunnosch

Here is a solution (based on Tim's) which additionally

  • updates the database content,
    using UPDATE yourTable SET sequence = ... instead of SELECT ... FROM yourTable
  • finds the "999" at the start of any member and tolerates whitespace before the newline,
    by dropping the $ anchor
  • moves it to the very end of the sequence,
    using ( > [0-9]{7}){0,} inside the 2nd capture group
  • finds any leading group of digits that breaks the 7-digit rule, not only "999",
    using ([0-9]{1,}) instead of "999" and capturing it

Code:

UPDATE yourTable SET sequence =
    REGEXP_REPLACE(sequence, '([0-9]{1,})([0-9]{7}( > [0-9]{7}){0,})', '$2 > $1')
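
Before running the UPDATE, the pattern can be tried locally on a few strings. The following Python sketch does that, with the same caveats as any offline check: Python's re module is not Redshift's regex engine, the backreferences are written \2 and \1 instead of $2 and $1, and the function name move_prefix_to_end is hypothetical. It also assumes at most one member per row carries an extra prefix.

```python
import re

# Same pattern as in the UPDATE above. Assumption: Python's `re` matches this
# pattern the same way Redshift does; backreferences are \1 and \2 here.
PATTERN = re.compile(r"([0-9]{1,})([0-9]{7}( > [0-9]{7}){0,})")


def move_prefix_to_end(sequence: str) -> str:
    """Move the extra leading digits of an over-long member to the end."""
    return PATTERN.sub(r"\2 > \1", sequence)


samples = [
    "1489773 > 9991489773  ",          # trailing whitespace is tolerated
    "9991420311 > 1380566",            # prefix on a member in the middle
    "1367309 > 881367309",             # a prefix other than 999
]

for sample in samples:
    print(repr(move_prefix_to_end(sample)))
```

Members that are exactly 7 digits long never match, because the first group needs at least one digit in front of the 7 digits captured by the second group.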