Split string with two delimiters and convert type

7.6k views Asked by At

I have a PL/pgSQL function like this (thanks to the guy who made this possible):

 CREATE OR REPLACE FUNCTION public.split_string(text, text)
    RETURNS SETOF text
    LANGUAGE plpgsql
    AS $function$
    DECLARE 
      pos int; 
      delim_length int := length($2);
    BEGIN
      WHILE $1 <> ''
      LOOP
        pos := strpos($1,$2);
        IF pos > 0 THEN
          RETURN NEXT substring($1 FROM 1 FOR pos - 1);
          $1 := substring($1 FROM pos + delim_length);
        ELSE
          RETURN NEXT $1;
          EXIT;
        END IF; 
      END LOOP;
      RETURN;
    END;
    $function$

It splits a string with a delimiter. Like this:

select * from split_string('3.584731 60.739211,3.590472 60.738030,3.592740 60.736220', ' ');

"3.584731"
"60.739211,3.590472"
"60.738030,3.592740"
"60.736220"

How can I save the results in a temp_array or temp_table. So I can get the the results in temp_x and split up these points again. Like:

"3.584731"
"60.739211"
"3.590472"
"60.738030"
"3.592740"
"60.736220"

and return the values as double precision. And all of this should be done in the function.

3

There are 3 answers

2
Erwin Brandstetter On BEST ANSWER

If you need the intermediary step:

SELECT string_to_table(a, ' ')::float8
       -- or do something else with the derived table
FROM   string_to_table('3.584731 60.739211,3.590472 60.738030', ',') a;

This is faster than regexp_split_to_table(). Regular expressions are powerful but expensive. Test with EXPLAIN ANALYZE.

I first split at ',', and next at ' ' - the reversed sequence of what you describe seems adequate.

If need be, you can wrap this into a PL/pgSQL function:

CREATE OR REPLACE FUNCTION public.split_string(_str text
                                             , _delim1 text = ','
                                             , _delim2 text = ' ')
  RETURNS SETOF float8
  LANGUAGE plpgsql IMMUTABLE PARALLEL SAFE AS
$func$
BEGIN
   RETURN QUERY
   SELECT string_to_table(a, _delim2)::float8
   FROM   string_to_table(_str, _delim1) a;
END
$func$;

Or as plain SQL function:

CREATE OR REPLACE FUNCTION public.split_string(_str text
                                             , _delim1 text = ','
                                             , _delim2 text = ' ')
  RETURNS SETOF float8
  LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
$func$
SELECT string_to_table(a, _delim2)::float8
FROM   string_to_table(_str, _delim1) a;
$func$;

Declare it IMMUTABLE (which it is) to allow performance optimization and other uses.
Declare it PARALLEL SAFE (which it is) to not disable parallel query plans.

Call (using the provided defaults for _delim1 and _delim2):

SELECT * FROM split_string('3.584731 60.739211,3.590472 60.738030');

Or with custom delimiters:

SELECT * FROM split_string('3.584731 60.739211,3.590472 60.738030', ',', ' ');

Fastest

For top performance, combine translate() with string_to_table(...):

SELECT string_to_table(
          translate('3.584731 60.739211,3.590472 60.738030', ' ', ',')
        , ','
          ))::float8;

In Postgres 13 or older use unnest(string_to_array()) instead of string_to_table().

2
vyegorov On

You do not need special functions, use built-in regexp_split_to_table:

SELECT *
  FROM regexp_split_to_table(
        '3.584731 60.739211,3.590472 60.738030,3.592740 60.736220',
        '[, ]') s;

EDIT: I don't see why you wish to stick with PL/pgSQL function if there's a built-in one.

Anyway, consider this example:

WITH s AS
(
    SELECT ' ,'::text sep,
           '3.584731 60.739211,3.590472 60.738030,3.592740 60.736220'::text str
 )
SELECT sep, left(sep,1), right(sep,-1),
       str,
       translate(str, right(sep,-1), left(sep,1))
  FROM s;

This means, that you can:

  1. do similar transformations before you call your function or
  2. integrate this code inside, but this will mean you'll need to introduce at least one extra variable, unless you feel comfortable replacing all $1 into translate($1, right($2,-1), left($2,1)) throughout the code. Obviously, plain $2 should be changed to left($2,1).
0
Pavel Stehule On

If I understand to your questions well, you can do:

-- store context to temp table
CREATE TEMP TABLE foo AS SELECT v::double precision FROM split_string('...') g(v);

-- store context to ARRAY
SELECT ARRAY(SELECT v::double precision FROM split_string('....') g(v))