Regexp_substr find string not matching a group of characters

I have a string like mystr = 'value1~|~value2~|~ ... valuen'. I need to split it into rows in a single column, like this:

value1
value2
...
valuen

I'm trying this:

select regexp_substr(mystr, '[^(~\|~)]', 1 , lvl) from dual, (select level as lvl from dual connect by level <= 5);

The problem is that ~|~ is not treated as a group: if I add a ~ anywhere in the string, the string gets split there. The parentheses ( and ) are also treated as separators.

Any help is highly appreciated! Thanks! ~|~

There are 3 answers

Gary_W On

This parses the delimited list. The form of the regex also handles NULL list elements, should they occur, as shown in the example.

SQL> with tbl(str) as (
      select 'value1~|~value2~|~~|~value4' from dual
    )
    select regexp_substr(str, '(.*?)(~\|~|$)', 1, level, NULL, 1) parsed
    from tbl
    connect by level <= regexp_count(str, '~\|~')+1;

PARSED
--------------------------------
value1
value2

value4

SQL>

wolfrevokcats On

Quick and dirty solution:

with t as (
  select rtrim(regexp_substr('value1~|~value2~|~value3~|~value4', '(.+?)($|~\|~)', 1, level, ''), '~|~') value
  from dual
  connect by level < 10
) select * from t where value is not null;
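
One caveat with this approach, sketched below with made-up values: the second argument of RTRIM is a set of characters, not a substring, so every trailing ~ or | is stripped, including any that belong to the value itself.

```sql
-- RTRIM removes any trailing characters found in the set {~, |},
-- so a value that itself ends in ~ or | is truncated as well.
-- Both sample strings are hypothetical.
SELECT RTRIM('value1~|~',  '~|~') AS plain_value,     -- value1
       RTRIM('value1~~|~', '~|~') AS ends_in_tilde    -- also value1, not value1~
FROM   DUAL;
```

Returning only the first capture group, as the other answers do with the sixth argument of REGEXP_SUBSTR, avoids the trimming step altogether.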

MT0 On

A bracket expression [] matches a single character from the listed set, and [^] matches a single character that is not any of the listed characters.

So [^(~\|~)] will match any one character that is not ( or ~ or \ or | or ~ (again) or ).
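
As a minimal sketch of that behaviour (the sample string 'ab(c~d' is made up for illustration), the query below runs the pattern from the question against a literal. Each occurrence is a single character, and both ( and ~ are skipped as if they were separators.

```sql
-- Hypothetical sample string; the pattern is the one from the question.
-- Each occurrence is ONE character that is not ( ~ \ | or )
SELECT LEVEL AS occurrence,
       REGEXP_SUBSTR('ab(c~d', '[^(~\|~)]', 1, LEVEL) AS matched
FROM   DUAL
CONNECT BY LEVEL <= 4;
```

This should return a, b, c and d on four separate rows, never a multi-character value.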

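What you want is a match that is terminated by your separator or by the end of the string:

SELECT REGEXP_SUBSTR(
         mystr,
         '(.*?)(~\|~|$)',
         1,
         LEVEL,
         NULL,
         1
       )
FROM   DUAL
CONNECT BY LEVEL < REGEXP_COUNT( mystr, '(.*?)(~\|~|$)' );

(This pattern also matches the empty string at the very end of the input, so REGEXP_COUNT returns one more than the number of elements, hence < rather than <=. If the list cannot contain empty elements, you can use the regular expression '(.+?)(~\|~|$)' and <= in the CONNECT BY clause instead.)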